Linux Confidential Computing Development
From: Dan Williams <dan.j.williams@intel.com>
To: Alexander Graf <graf@amazon.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-coco@lists.linux.dev>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	"Borislav Petkov (AMD)" <bp@alien8.de>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Roth <michael.roth@amd.com>, <x86@kernel.org>,
	<aik@amd.com>, <elena.reshetova@intel.com>, <pgonda@google.com>
Subject: Re: [PATCH 4/4] configfs-tsm-report: Introduce TCB stability enumeration and watchdog
Date: Fri, 4 Oct 2024 14:36:15 -0700
Message-ID: <67005fcf46742_10a0a2945@dwillia2-mobl3.amr.corp.intel.com.notmuch>
In-Reply-To: <665c5ae0-4b7c-4852-8995-255adf7b3a2f@amazon.com>

Alexander Graf wrote:
> On 13.09.24 02:26, Dan Williams wrote:
> > One of the points of contention for enabling runtime updates of the TDX
> > Module has been what to do about the fact that it results in
> > confidential VMs seeing surprise updates to their TCB. The general
> > concern is that there is a non-zero confidentiality regression risk for
> > updating measured TCB components. Not only the TDX Module, but
> > microcode, SEV-SNP PSP firmware, RISCV and ARM equivalents etc. The
> > degree to which the TCB is or is not compromised by an unexpected update
> > is unknowable by the kernel, but it should at least try to be
> > transparent about what it knows about TCB stability.
> 
> IMHO this looks at the problem the wrong way around. The typical flow 
> for firmware updates is:
> 
> 1) Researchers find an issue
> 2) Intel fixes it, ideally in TDX Module. Releases TDX Module update.
> 3) Infrastructure providers update TDX Module to resolve issue
> 4) Embargo lifts
> 
> If an issue impacts your confidentiality promises according to your 
> threat model, you are already affected by it after 1). You are able to 
> assess whether that is the case after 4). This patch tells you about it 
> during 3).
> 
> If your threat model really really considers hosting infrastructure as 
> malicious, 

Let's set aside "malicious host", because as you allude to, why cloud host
at all in that case? Instead let's focus on a "theoretical paranoid tenant"
that wants to trust but verify the TCB in the presence of updates.

More below, but I readily admit that "theoretical tenant" already raises
the "no practical benefit" objection to the proposal.

> you did not gain any benefits from learning about 3). You 
> know that something was patched. You do not know what. You do not know 
> whether anyone malicious was already aware of the issue. If you were 
> strict, you would need to consider all data past 1) in such a VM as 
> compromised. Given SEV-SNP's patch track record, I would expect security 
> relevant patches multiple times per year. If you are really paranoid 
> enough to care, notifications at point 3 will not tell you anything. 
> Instead, your conclusion would probably be "I could get compromised at 
> any point in time, so let me not expose data in the first place" which 
> means you can not run in the Cloud at all. In most risk assessments, 
> customers will typically be ok with a temporary, limited risk between 1) 
> and 3). And that means you want to optimize to shift 3) left as much as 
> possible to fix security issues as quickly as possible.
> 
> Instead, what we do by creating FUD around the patching process is to 
> create a false assumption with customers that "unpatched" == "secure" 
> while it's the exact opposite. We also encourage a shift of 3) from left 
> to right: If I only patch you after embargo lift, you can assess, so 
> it's safer, right? Not really, because you prolong the time an 
> environment is unpatched.
> 
> The crux of the problem is that - by definition - a VM can not 
> autonomously determine step 1) because what really happens here is that 
> the world around it changed. We need to ask the world. So if Intel is 
> really concerned about the update flow and notifying customers that the 
> environment they are running in is potentially insecure, Intel should 
> provide an attestation mechanism to notify them as early as reasonable; 
> probably around 3). That way, customers get the chance to learn first 
> hand that they should be running on a newer revision of TDX Module code 
> and should no longer trust the one with known vulnerabilities.
> 
> The in-VM logic suggested in this patch such as "I'll die if you patch 
> my host" or "I tell my owner that you patched me, but my owner won't be 
> able to tell anything from that info" is not going to help anyone for 
> TDX Module security patching situations.
> 
> Instead, let's start the conversation from how Intel can provide a 
> mechanism to customers to evaluate whether their system is fully 
> patched, work towards closing the gap between 1) and 3) and then build 
> whatever interfaces in Linux we need to enable customers to make use of 
> the evaluation mechanism.

I quote the above in its entirety because, "no lies detected". However,
reframe it from the perspective of a paranoid / vigilant tenant. It is
always going to be the case that a vigilant tenant can see updates
whether the kernel is polling for them or not. Also, it will almost
always be the case that the platform vendor's release schedule for updates
will come at some inopportune time for the tenant, or that the CSP
update somehow reaches the tenant before the platform vendor's
notification that a new update is available.

At the discovery of an issue impacting TCB1, that is potentially fixed
by TCB2, I think it is reasonable to support a tenant that wants to
pause TCB1 operations until a TCB2 audit is complete and then resume
operations. If that is reasonable then the question becomes how to
ensure periodic renewal of confidence in TCB1 and a non-surprise update
to TCB2. A kernel watchdog protects against userspace hangs or exceptions
that block periodic renewal of TCB1, and otherwise confirms that an
update to TCB2 is expected and welcome.
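
To make the renewal / watchdog idea concrete, here is a minimal sketch
of the state logic in Python. It is a userspace toy model under stated
assumptions, not the interface proposed in the patch: the class
`TcbWatchdog`, its method names, and the state strings are all
hypothetical, and TCB versions are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TcbWatchdog:
    """Toy model of a TCB stability watchdog (all names hypothetical)."""

    timeout: float                      # seconds allowed between renewals
    trusted_tcb: str                    # TCB version the tenant has audited
    last_renewal: float = 0.0           # timestamp of the last renewal
    expected_tcb: Optional[str] = None  # update the tenant has pre-approved

    def renew(self, now: float) -> None:
        """Userspace confirms it still has confidence in the trusted TCB."""
        self.last_renewal = now

    def expect_update(self, new_tcb: str) -> None:
        """Userspace records that an update to new_tcb is welcome."""
        self.expected_tcb = new_tcb

    def check(self, now: float, current_tcb: str) -> str:
        """Classify the current state; a real watchdog would act on it."""
        if current_tcb != self.trusted_tcb:
            if current_tcb == self.expected_tcb:
                # The audited update arrived: adopt it as the new baseline.
                self.trusted_tcb = current_tcb
                self.expected_tcb = None
                self.last_renewal = now
                return "expected-update"
            # The TCB changed without the tenant approving it first.
            return "surprise-update"
        if now - self.last_renewal > self.timeout:
            # Userspace hung or crashed and stopped renewing.
            return "renewal-missed"
        return "stable"
```

In this model the tenant calls `renew()` periodically while it still
trusts TCB1, and calls `expect_update("TCB2")` once its audit of TCB2 is
complete. What happens on "surprise-update" or "renewal-missed" (pause,
alert, shut down) is a policy choice that the sketch leaves open.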

Thread overview: 15+ messages
2024-09-13  0:25 [PATCH 0/4] configfs-tsm-report: TCB Stability Dan Williams
2024-09-13  0:26 ` [PATCH 1/4] configfs-tsm: Namespace TSM report symbols Dan Williams
2024-09-13  0:26 ` [PATCH 2/4] coco/guest: Move shared guest CC infrastructure to drivers/virt/coco/guest/ Dan Williams
2024-09-13  0:26 ` [PATCH 3/4] x86/tdx: Introduce guest global metadata retrieval infrastructure Dan Williams
2024-09-16  8:56   ` Kirill A. Shutemov
2024-10-01  7:56     ` Dan Williams
2024-09-13  0:26 ` [PATCH 4/4] configfs-tsm-report: Introduce TCB stability enumeration and watchdog Dan Williams
2024-09-16  9:06   ` Kirill A. Shutemov
2024-10-01  0:33     ` Dan Williams
2024-10-01  8:50   ` Alexander Graf
2024-10-04 21:36     ` Dan Williams [this message]
2024-10-07  8:33       ` Alexander Graf
2024-10-07 18:22         ` Dave Hansen
2024-10-07 18:59           ` Dan Williams
2024-10-07 19:43             ` Dionna Amalie Glaze
