public inbox for linux-coco@lists.linux.dev
From: Lukas Wunner <lukas@wunner.de>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
	linux-coco@lists.linux.dev, linux-pci@vger.kernel.org,
	gregkh@linuxfoundation.org, aik@amd.com, aneesh.kumar@kernel.org,
	yilun.xu@linux.intel.com, bhelgaas@google.com,
	alistair23@gmail.com, jgg@nvidia.com,
	Donald Hunter <donald.hunter@gmail.com>
Subject: Re: [PATCH v2 08/19] PCI/TSM: Add "evidence" support
Date: Tue, 17 Mar 2026 19:14:51 +0100
Message-ID: <abmaG0jC7b05Lytz@wunner.de>
In-Reply-To: <20260314111245.76d18d73@kernel.org>

On Sat, Mar 14, 2026 at 11:12:45AM -0700, Jakub Kicinski wrote:
> On Mon,  2 Mar 2026 16:01:56 -0800 Dan Williams wrote:
> > The implementation adheres to the guideline from:
> > Documentation/userspace-api/netlink/genetlink-legacy.rst
> > 
> >     New Netlink families should never respond to a DO operation with
> >     multiple replies, with ``NLM_F_MULTI`` set. Use a filtered dump
> >     instead.
> 
> My understanding of F_MULTI is that deserializer is supposed to
> continue deserializing into current object.

So is the "should" above meant to be understood in the RFC 2119 way,
i.e. as a mere recommendation?

The problem we're facing is that nlattr::nla_len is u16, so the maximum
payload size is 65531 bytes (65535 minus the 4-byte header).  That's
insufficient for transmitting blobs that are several megabytes in size.

The obvious solution is to split the blobs into smaller chunks and
transmit each chunk in an attribute of the same type.  The application
then concatenates them together to reconstruct the blob.  For particularly
large blobs, it may even be necessary to split across multiple messages
by way of NLM_F_MULTI.
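
To illustrate the chunking idea, here is a minimal user-space sketch.
It is not code from the patches linked in this mail.  struct attr_hdr
only mirrors the layout of struct nlattr, and blob_chunk_count(),
blob_split() and blob_join() are hypothetical names chosen for this
example.  It assumes all chunks fit into a single buffer and leaves out
the split across multiple messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the layout of struct nlattr: 16-bit length, 16-bit type. */
struct attr_hdr {
	uint16_t len;	/* header plus payload, in bytes */
	uint16_t type;
};

#define ATTR_HDRLEN		((size_t)sizeof(struct attr_hdr))	/* 4 */
#define ATTR_MAX_PAYLOAD	((size_t)UINT16_MAX - ATTR_HDRLEN)	/* 65531 */
#define ATTR_ALIGN(n)		(((size_t)(n) + 3u) & ~(size_t)3u)

/* Number of attributes needed to carry a blob of blob_len bytes. */
static size_t blob_chunk_count(size_t blob_len)
{
	return (blob_len + ATTR_MAX_PAYLOAD - 1) / ATTR_MAX_PAYLOAD;
}

/*
 * Sender: emit the blob as consecutive attributes of the same type.
 * Returns the number of bytes written, or 0 if out is too small.
 */
static size_t blob_split(uint8_t *out, size_t out_len, uint16_t type,
			 const uint8_t *blob, size_t blob_len)
{
	size_t off = 0;

	assert(out && blob);
	while (blob_len) {
		size_t n = blob_len < ATTR_MAX_PAYLOAD ? blob_len : ATTR_MAX_PAYLOAD;
		size_t padded = ATTR_ALIGN(ATTR_HDRLEN + n);
		struct attr_hdr hdr = { .len = (uint16_t)(ATTR_HDRLEN + n), .type = type };

		if (out_len - off < padded)
			return 0;
		memcpy(out + off, &hdr, ATTR_HDRLEN);
		memcpy(out + off + ATTR_HDRLEN, blob, n);
		/* Zero the alignment padding after the payload. */
		memset(out + off + ATTR_HDRLEN + n, 0, padded - ATTR_HDRLEN - n);
		off += padded;
		blob += n;
		blob_len -= n;
	}
	return off;
}

/*
 * Receiver: concatenate the payloads of all attributes of the given type.
 * Returns the reassembled length, or 0 on malformed input or overflow.
 */
static size_t blob_join(uint8_t *blob, size_t blob_cap, uint16_t type,
			const uint8_t *in, size_t in_len)
{
	size_t off = 0, total = 0;

	assert(blob && in);
	while (in_len - off >= ATTR_HDRLEN) {
		struct attr_hdr hdr;
		size_t n, step;

		memcpy(&hdr, in + off, ATTR_HDRLEN);
		if (hdr.len < ATTR_HDRLEN || hdr.len > in_len - off)
			return 0;
		n = hdr.len - ATTR_HDRLEN;
		if (hdr.type == type) {
			if (n > blob_cap - total)
				return 0;
			memcpy(blob + total, in + off + ATTR_HDRLEN, n);
			total += n;
		}
		step = ATTR_ALIGN(hdr.len);
		if (step > in_len - off)
			break;	/* last attribute without trailing padding */
		off += step;
	}
	return total;
}
```

Note that both directions copy the payload, which is exactly the
inefficiency discussed next.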

Apart from the attribute size limitation, there's the problem that copying
large blobs in memory is inefficient.  Ideally we'd want zero-copy.
The solution I came up with is to attach the blob's pages as fragments
to the skb.  Conceptually the fragments follow the linear buffer of the
skb, so by putting the nlattr header into the linear buffer and attaching
the blob as fragments, the receiver consumes the netlink message in a
natural way.  The following patch introduces an nla_put_blob() helper,
which turned out to be pretty straightforward:

https://github.com/l1k/linux/commit/af9b939fc30b

This patch takes advantage of the helper:

https://github.com/l1k/linux/commit/009663bd172e

The only change I had to make is amending nlmsg_end() to take the
fragments into account when calculating the nlmsg_len.

The patch does achieve zero-copy on the sender's end.  It may also
achieve zero-copy on the receiver's end if the receiver is in the
kernel.  However it does *not* achieve zero-copy if the receiver is
in user space.  That's because:

simple_copy_to_iter()
  copy_to_iter()
    _copy_to_iter()
      copy_to_user_iter()
        raw_copy_to_user()

... will just stupidly copy the data into the user space buffer.
It might be possible to achieve zero-copy in user space via io_uring.

At this point perhaps your conclusion is that netlink isn't the right
protocol for this job.  It's great for transmitting sets of small items,
some of which may be optional, but it's obviously not well-suited for
large items.

Jason Gunthorpe was quite insistent that we use netlink and you know
how consensus-oriented kernel development is.  Indeed sysfs has turned
out not to be ideal because the protocol that we're dealing with
(SPDM - DMTF DSP0274) allows many degrees of freedom and making
them available through sysfs quickly becomes unwieldy.

E.g. when installing a certificate onto a device, the protocol allows
specifying additional parameters (a keypair ID and a certificate model)
together with the certificate chain that shall be installed.  That doesn't
square well with the "one value per file" sysfs model.  User space would
have to write the keypair ID and certificate model to separate attributes,
then write the certificate chain to a third attribute.  So the kernel would
need some kind of state machine to keep track of which sysfs attributes
have been written.  It gets quite ugly.
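
A rough sketch of the bookkeeping this would require, written as plain
user-space C rather than real sysfs code.  The struct and the three
*_store() functions are hypothetical names for this example, and the
parameter types are placeholders rather than the widths SPDM defines.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-device state the kernel would have to carry between writes. */
struct cert_install_state {
	bool have_keypair_id;
	bool have_cert_model;
	unsigned int keypair_id;
	unsigned int cert_model;
};

/* Store handler for the hypothetical keypair ID attribute. */
static void keypair_id_store(struct cert_install_state *st, unsigned int id)
{
	assert(st);
	st->keypair_id = id;
	st->have_keypair_id = true;
}

/* Store handler for the hypothetical certificate model attribute. */
static void cert_model_store(struct cert_install_state *st, unsigned int model)
{
	assert(st);
	st->cert_model = model;
	st->have_cert_model = true;
}

/*
 * Store handler for the hypothetical certificate chain attribute.
 * It is only valid once both parameters have been written, and it
 * consumes them so that stale values are not reused by a later write.
 */
static int cert_chain_store(struct cert_install_state *st,
			    const void *chain, size_t len)
{
	assert(st);
	if (!chain || !len)
		return -EINVAL;
	if (!st->have_keypair_id || !st->have_cert_model)
		return -EINVAL;

	/* A real implementation would send the install request here. */

	st->have_keypair_id = false;
	st->have_cert_model = false;
	return 0;
}
```

Even this toy version leaves open what happens when two processes
interleave their writes, which a single request carrying all three
values avoids.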

As another example, the SPDM protocol allows retrieving measurements
from the device.  The measurements are indexed by an 8-bit number.
To expose them via sysfs, the kernel would have to retrieve all of them
on device enumeration so that it knows which indices are populated
and need to be exposed in sysfs.  That would incur a delay on device
enumeration and thus lead to slower boot times.

If netlink is at all the right protocol for the job, I'm wondering if an
extension for larger attributes would be entertained.  Basically a
variation of struct nlattr, but with a 24-bit or 32-bit size and
maybe a list of fragment numbers.  The latter would be useful to have
*multiple* zero-copy attributes because the patches linked above only
allow for a single zero-copy attribute per nlmsg.
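
To make the idea more concrete, here is one possible shape for such a
header.  This is purely a sketch: struct big_attr_hdr, its field names
and the helpers are hypothetical and do not exist in the kernel.  For
simplicity it describes the fragments as a contiguous range (first
index plus count) instead of a list, and it assumes 4 KiB fragments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of today's attribute header (mirrors struct nlattr). */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/*
 * HYPOTHETICAL extended header, not an existing kernel structure.
 * A 32-bit length lifts the 64 KiB cap, and the fragment fields say
 * which attached fragments hold this attribute's payload, so that
 * several zero-copy attributes could share one message.
 */
struct big_attr_hdr {
	uint16_t type;
	uint16_t first_frag;	/* index of the first fragment */
	uint16_t nr_frags;	/* number of consecutive fragments */
	uint16_t reserved;
	uint32_t len;		/* payload length in bytes */
};

#define FRAG_SIZE 4096u		/* assumption: one 4 KiB page per fragment */

/*
 * Fill in a header for a payload of len bytes starting at fragment
 * first_frag.  Returns 0 on success, -1 if the fragment index space
 * would overflow.
 */
static int big_attr_init(struct big_attr_hdr *hdr, uint16_t type,
			 uint16_t first_frag, uint32_t len)
{
	uint32_t nr = len / FRAG_SIZE + (len % FRAG_SIZE != 0);

	assert(hdr);
	if (nr > (uint32_t)UINT16_MAX - first_frag)
		return -1;
	hdr->type = type;
	hdr->first_frag = first_frag;
	hdr->nr_frags = (uint16_t)nr;
	hdr->reserved = 0;
	hdr->len = len;
	return 0;
}

/* Index of the first fragment available to the next attribute. */
static uint16_t big_attr_next_frag(const struct big_attr_hdr *hdr)
{
	assert(hdr);
	return (uint16_t)(hdr->first_frag + hdr->nr_frags);
}
```

With this layout, a second large attribute would simply start at
big_attr_next_frag() of the first one.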

Thanks,

Lukas
