linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@amd.com>
To: dan.j.williams@intel.com, linux-coco@lists.linux.dev,
	linux-pci@vger.kernel.org
Cc: gregkh@linuxfoundation.org, bhelgaas@google.com,
	yilun.xu@linux.intel.com, aneesh.kumar@kernel.org
Subject: Re: [PATCH 2/7] PCI/TSM: Add pci_tsm_guest_req() for managing TDIs
Date: Fri, 29 Aug 2025 12:21:09 +1000	[thread overview]
Message-ID: <101dc0bb-d6d1-4f29-81fb-fb1ff78891ba@amd.com> (raw)
In-Reply-To: <68b0d30e2a18c_75db10050@dwillia2-mobl4.notmuch>



On 29/8/25 08:07, dan.j.williams@intel.com wrote:
> Alexey Kardashevskiy wrote:
>>
>>
>> On 27/8/25 13:52, Dan Williams wrote:
>>> A PCIe device function interface assigned to a TVM is a TEE Device
>>> Interface (TDI). A TDI instantiated by pci_tsm_bind() needs additional
>>> steps to be accepted by a TVM and transitioned to the RUN state.
>>>
>>> pci_tsm_guest_req() is a channel for the guest to request TDISP collateral,
>>> like Device Interface Reports, and effect TDISP state changes, like
>>> LOCKED->RUN transititions. Similar to IDE establishment and pci_tsm_bind(),
>>
>> s/transititions/transitions/
> 
> Thanks, not sure why checkpatch spell check sometimes misses words.
> 
>>
>>> these are long running operations involving SPDM message passing via the
>>> DOE mailbox, i.e. another 'struct pci_tsm_link_ops' operation.
>>>
>>> The path for a guest to invoke pci_tsm_guest_request() is either via a kvm
>>> handle_exit() or an ioctl() when an exit reason is serviced by a userspace
>>> VMM.
>>>
>>> Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
>>> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>> ---
>>>    drivers/pci/tsm.c       | 60 +++++++++++++++++++++++++++++++++++++++++
>>>    include/linux/pci-tsm.h | 55 +++++++++++++++++++++++++++++++++++--
>>>    2 files changed, 113 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c
>>> index 302a974f3632..3143558373e3 100644
>>> --- a/drivers/pci/tsm.c
>>> +++ b/drivers/pci/tsm.c
>>> @@ -338,6 +338,66 @@ int pci_tsm_bind(struct pci_dev *pdev, struct kvm *kvm, u32 tdi_id)
>>>    }
>>>    EXPORT_SYMBOL_GPL(pci_tsm_bind);
>>>    
>>> +/**
>>> + * pci_tsm_guest_req() - helper to marshal guest requests to the TSM driver
>>> + * @pdev: @pdev representing a bound tdi
>>> + * @scope: security model scope for the TVM request
>>> + * @req_in: Input payload forwarded from the guest
>>> + * @in_len: Length of @req_in
>>> + * @out_len: Output length of the returned response payload
>>> + *
>>> + * This is a common entry point for KVM service handlers in userspace responding
>>
>> KVM will have to know about the host's pci_dev which it does not. I'd
>> postpone mentioning KVM here till it learns about pci_dev, if.
> 
> Oh, this was a copy paste mistake that I meant to cleanup after Yilun's
> clarification [1]
> 
> [1]: http://lore.kernel.org/aCbglieuHI1BJDkz@yilunxu-OptiPlex-7050
> 
> So, the plan is *not* for KVM to have PCI device awareness.
> 
>>> + * to TDI information or state change requests. The scope parameter limits
>>> + * requests to TDISP state management, or limited debug.
>>> + *
>>> + * Returns a pointer to the response payload on success, @req_in if there is no
>>> + * response to a successful request, or an ERR_PTR() on failure.
>>> + *
>>> + * Caller is responsible for kvfree() on the result when @ret != @req_in and
>>> + * !IS_ERR_OR_NULL(@ret).
>>
>> Uff... So the caller (which is IOMMUFD vdevice) has to check and
>> decide on whether to kvfree. ioctl() is likely to have 2 buffers
>> anyway and preallocate the response buffer, why make IOMMUFD care
>> about this?
> 
> No, iommufd does not need to preallocate the response buffer, the
> response buffer is allocated by the responder.
> 
> This follows the example fwctl because this guest_req() transport is in
> the same class of kernel bypass tunnels. No need to reinvent a common
> device RPC transport mechanism.

ok, time to read about fwctl.

>>> + *
>>> + * Context: Caller is responsible for calling this within the pci_tsm_bind()
>>> + * state of the TDI.
>>> + */
>>> +void *pci_tsm_guest_req(struct pci_dev *pdev, enum pci_tsm_req_scope scope,
>>> +			void *req_in, size_t in_len, size_t *out_len)
>>> +{
>>> +	const struct pci_tsm_ops *ops;
>>> +	struct pci_tsm_pf0 *tsm_pf0;
>>> +	struct pci_tdi *tdi;
>>> +	int rc;
>>> +
>>> +	/*
>>> +	 * Forbid requests that are not directly related to TDISP
>>> +	 * operations
>>> +	 */
>>> +	if (scope > PCI_TSM_REQ_STATE_CHANGE)
>>> +		return ERR_PTR(-EINVAL);
>>> +
>>> +	ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem);
>>> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock)))
>>
>> Why double braces?
> 
> Because of the assignment used as truth value. The evaluation of the

Ah -Werror=parentheses.

> ACQUIRE_ERR() result in the assignment was a compactness choice that
> PeterZ made in his original proposal [2] that made sense to me, so I
> carried it forward.

oookay :(

> 
> [2]: http://lore.kernel.org/20250509104028.GL4439@noisy.programming.kicks-ass.net
> 
> [..]
>>> @@ -50,6 +51,9 @@ struct pci_tsm_ops {
>>>    		struct pci_tdi *(*bind)(struct pci_dev *pdev,
>>>    					struct kvm *kvm, u32 tdi_id);
>>>    		void (*unbind)(struct pci_tdi *tdi);
>>> +		void *(*guest_req)(struct pci_dev *pdev,
>>
>> We have pdev in pci_tdi, pci_tsm and pci_tsm_pf0 (via .base), using
>> these in pci_tsm_ops will document better which call is allowed on
>> what entity - DSM or TDI. Or may be ditch those back "pdev"
>> references?
> 
> Not immediately understanding what change you want here. Do you want
> iommufd to track the pci_tdi?

I'd like to either:

- get rid of pdev back refs in pci_tsm/pci_tdi since we pass pci_dev everywhere as if a pdev from pci_tsm/pci_tdi is used in, say, 1-2 places, then it is just cleaner to pass pdev to those places explicitly

oooor

- pass pci_tsm/pci_tdi to pci_tsm_ops hooks and use pdev in those when needed, this way it is clearer from the hook prototype what it operates on.



>>> +				   enum pci_tsm_req_scope scope, void *req_in,
>>> +				   size_t in_len, size_t *out_len);
>>
>>
>> Out of curiosity (probably could go to the commit log) - for what kind
>> of request and on which platform we do not know the response size in
>> advance? On AMD, the request and response sizes are fixed.
> 
> I don't know. Given this is to support any possible combination of TSM
> and ABI I took inspiration from fwctl which is trying to solve a similar
> common transport problem.

If guest_req() returns NULL - what is it - error (no response) or success ("request successfully accepted, no response needed")? The PSP returns fw_err (which I pass in my guest_request hook), does this interface suggest that my TSM dev should allocate a sizeof(fw_err) buffer at least, and if there is more - then sizeof(fw_err)+sizeof(response)? I thought TDX does return an error code too, surprised to see it missing here.


>> And the userspace (which makes such request) will allocate some memory
>> before calling such ioctl(), can "void *req_in" be "void __user
>> *reg_in"? The CCP driver is going to copy the request and response
>> anyway as there are RMP rules about them.
>>
>> And what is wrong with returning "int" as an error vs ERR_PTR(), is
>> there a recommendation for this, or something?
> 
> Keep interface innovation to minimum and follow an existing pattern.
>
> [..]
>>> +/**
>>> + * enum pci_tsm_req_scope - Scope of guest requests to be validated by TSM
>>> + *
>>> + * Guest requests are a transport for a TVM to communicate with a TSM + DSM for
>>> + * a given TDI. A TSM driver is responsible for maintaining the kernel security
>>> + * model and limit commands that may affect the host, or are otherwise outside
>>> + * the typical TDISP operational model.
>>> + */
>>> +enum pci_tsm_req_scope {
>>> +	/**
>>> +	 * @PCI_TSM_REQ_INFO: Read-only, without side effects, request for
>>> +	 * typical TDISP collateral information like Device Interface Reports.
>>> +	 * No device secrets are permitted, and no device state is changed.
>>> +	 */
>>> +	PCI_TSM_REQ_INFO = 0,
>>> +	/**
>>> +	 * @PCI_TSM_REQ_STATE_CHANGE: Request to change the TDISP state from
>>> +	 * UNLOCKED->LOCKED, LOCKED->RUN. No any other device state,
>>> +	 * configuration, or data change is permitted.
>>> +	 */
>>> +	PCI_TSM_REQ_STATE_CHANGE = 1,
>>> +	/**
>>> +	 * @PCI_TSM_REQ_DEBUG_READ: Read-only request for debug information
>>> +	 *
>>> +	 * A method to facilitate TVM information retrieval outside of typical
>>> +	 * TDISP operational requirements. No device secrets are permitted.
>>> +	 */
>>> +	PCI_TSM_REQ_DEBUG_READ = 2,
>>> +	/**
>>> +	 * @PCI_TSM_REQ_DEBUG_WRITE: Device state changes for debug purposes
>>> +	 *
>>> +	 * The request may affect the operational state of the device outside of
>>> +	 * the TDISP operational model. If allowed, requires CAP_SYS_RAW_IO, and
>>> +	 * will taint the kernel.
>>> +	 */
>>> +	PCI_TSM_REQ_DEBUG_WRITE = 3,
>>
>>
>> What is going to enforce this and how? It is a guest request, ideally
>> encrypted, and the host does not really have to know the nature of the
>> request (if the guest wants something from the host to do in addition
>> to what is it asking the TSM to do - then GHCB is for that). And 3 of
>> 4 AMD TIO requests (STATE_CHANGE is a host request and no plan for
>> DEBUG) do not fit in any category from the above anyway. imho we do
>> not need it at least now.
> 
> While the TSM is in the trust boundary of the TVM, the TSM and the TVM
> are not necessarily trusted by the VMM. It has a responsibility to
> maintain its own security model especially when marshaling opaque blobs
> on behalf of a guest. This scope parameter serves the same purpose as it
> does in fwctl to maintain a security model and explicitly control for
> requests that are out of scope.
> 
> The enforcement is market and regulatory forces to make solutions are
> not bypass security model expectations of the operating system.

I get the idea, it just sounds like it should be a mask - READ|WRITE|TDISP_STATE|DEBUG. Which category would MMIO_VALIDATE fall (set "validated" in RMP)? Thanks,


-- 
Alexey


  reply	other threads:[~2025-08-29  2:21 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-27  3:52 [PATCH 0/7] PCI/TSM: TEE I/O infrastructure Dan Williams
2025-08-27  3:52 ` [PATCH 1/7] PCI/TSM: Add pci_tsm_{bind,unbind}() methods for instantiating TDIs Dan Williams
2025-09-02  0:12   ` Alexey Kardashevskiy
2025-09-02 15:04     ` Aneesh Kumar K.V
2025-09-02 15:05   ` Aneesh Kumar K.V
2025-09-03 15:17   ` Aneesh Kumar K.V
2025-09-04 10:38     ` Alexey Kardashevskiy
2025-09-04 12:56       ` Aneesh Kumar K.V
2025-08-27  3:52 ` [PATCH 2/7] PCI/TSM: Add pci_tsm_guest_req() for managing TDIs Dan Williams
2025-08-28  9:53   ` Alexey Kardashevskiy
2025-08-28 22:07     ` dan.j.williams
2025-08-29  2:21       ` Alexey Kardashevskiy [this message]
2025-08-30  2:37         ` dan.j.williams
2025-09-01 23:49           ` Alexey Kardashevskiy
2025-08-28 13:02   ` Aneesh Kumar K.V
2025-08-28 22:14     ` dan.j.williams
2025-08-27  3:52 ` [PATCH 3/7] device core: Introduce confidential device acceptance Dan Williams
2025-08-27  6:14   ` Greg KH
2025-08-28 20:07     ` dan.j.williams
2025-08-27  3:52 ` [PATCH 4/7] x86/ioremap, resource: Introduce IORES_DESC_ENCRYPTED for encrypted PCI MMIO Dan Williams
2025-08-27  3:52 ` [PATCH 5/7] PCI/TSM: Add Device Security (TVM Guest) operations support Dan Williams
2025-09-03 15:22   ` Aneesh Kumar K.V
2025-09-04 15:02   ` Aneesh Kumar K.V
2025-08-27  3:52 ` [PATCH 6/7] samples/devsec: Introduce a "Device Security TSM" sample driver Dan Williams
2025-08-27 12:39   ` Jason Gunthorpe
2025-08-27 23:47     ` Alexey Kardashevskiy
2025-08-28 21:38     ` dan.j.williams
2025-08-29 16:02       ` Jason Gunthorpe
2025-08-29 20:00         ` dan.j.williams
2025-08-29 23:34           ` Jason Gunthorpe
2025-08-27  3:52 ` [PATCH 7/7] tools/testing/devsec: Add a script to exercise samples/devsec/ Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=101dc0bb-d6d1-4f29-81fb-fb1ff78891ba@amd.com \
    --to=aik@amd.com \
    --cc=aneesh.kumar@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-pci@vger.kernel.org \
    --cc=yilun.xu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).