* SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-02-08 21:55 UTC (permalink / raw)
To: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com
[-- Attachment #1: Type: text/plain, Size: 191 bytes --]
Attached is an updated draft version of the SVSM specification with
feedback incorporated from the previous review. Please take a look and
reply with any feedback you may have.
Thanks,
Tom
[-- Attachment #2: 58019-Secure_VM_Service_Module_Specification.pdf --]
[-- Type: application/pdf, Size: 568492 bytes --]

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Dionna Amalie Glaze @ 2023-02-08 23:19 UTC (permalink / raw)
To: Tom Lendacky; +Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

8.1 You have RCX RCX in the first table where it is clear by context
that it should be RCX RDX, but just a small edit to keep in mind for
next time.

On Wed, Feb 8, 2023 at 1:55 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> Attached is an updated draft version of the SVSM specification with
> feedback incorporated from the previous review. Please take a look and
> reply with any feedback you may have.
>
> Thanks,
> Tom

--
-Dionna Glaze, PhD (she/her)

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-02-08 23:44 UTC (permalink / raw)
To: Dionna Amalie Glaze
Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On 2/8/23 17:19, Dionna Amalie Glaze wrote:
> 8.1 You have RCX RCX in the first table where it is clear by context
> that it should be RCX RDX, but just a small edit to keep in mind for
> next time.

Thank you. I could have sworn I fixed that since it was pointed out in
the previous version... oh well.

Thanks,
Tom

>
> On Wed, Feb 8, 2023 at 1:55 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> Attached is an updated draft version of the SVSM specification with
>> feedback incorporated from the previous review. Please take a look and
>> reply with any feedback you may have.
>>
>> Thanks,
>> Tom

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jörg Rödel @ 2023-02-15 9:49 UTC (permalink / raw)
To: Tom Lendacky; +Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

Hi Tom,

On Wed, Feb 08, 2023 at 03:55:34PM -0600, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with feedback
> incorporated from the previous review. Please take a look and reply with any
> feedback you may have.

Thanks for your work on the new specification. I was reading through and
have some thoughts on the handling of SVSM_CORE_CREATE_VCPU and
SVSM_CORE_DELETE_VCPU calls. I think a few corner cases with these
commands should be better defined.

The underlying problem is that the SVSM can only handle one VMSA per
(apic_id, vmpl) combination. So what happens if:

1) A new VCPU is created for an (apic_id, vmpl) tuple where
   there is already a VMSA, should the call fail or should the
   existing VMSA be implicitly shut down and replaced by the new
   one? The latter case can fail if the existing VMSA is
   currently in use (depending on how it is implemented).

2) Who can delete VCPUs? Can a (apic_id, vmpl) VMSA only be
   deleted from a lower VMPL running on the same apic_id? Or can
   other apic_ids delete it too?

3) What happens if a given VMSA for a VCPU is deleted? Should
   that be not allowed or should the SVSM gracefully handle
   that?

I have some thoughts on how to handle these cases, but want to put that
up for discussion here. For 1) I think it is best to fail VCPU creation
if there is already one for the (apic_id, vmpl) combination, requiring
the guest to delete the VCPU first. This is related to 3) where I think
the SVSM should just go into HLT when there is no higher-level VMSA
anymore. When it wakes up from HLT it can check whether a VMSA was
created and continue executing it.

For 2) I am a bit undecided, letting VCPUs only delete themselves would
simplify things on the SVSM side. Allowing other VCPUs to delete VMSAs
needs quite some synchronization or could fail entirely if the VMSA is
currently executing.

Thoughts?

Regards,

--
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146, 90461 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
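
To make the proposal for 1) and 3) concrete, here is a minimal SVSM-side
sketch of those semantics. It is illustrative C only, not code from the
SVSM PoC and not text from the specification; the table layout, helper
names, MAX_APIC_ID and the SVSM_SUCCESS value are assumptions, and the
SVSM_ERR_INVALID_PARAMETER value below is a placeholder (only the name
appears elsewhere in this thread).

    #include <stdbool.h>
    #include <stdint.h>

    #define SVSM_SUCCESS                0          /* assumed */
    #define SVSM_ERR_INVALID_PARAMETER  0x80000001 /* placeholder value */
    #define MAX_APIC_ID                 256        /* placeholder */

    /* Hypothetical bookkeeping: at most one VMSA per (apic_id, vmpl). */
    struct vmsa_slot {
        uint64_t gpa;        /* guest physical address of the VMSA page */
        bool     present;
    };

    static struct vmsa_slot vmsa_table[MAX_APIC_ID][4];

    /* Proposal for 1): fail creation when a VMSA already exists for this
     * (apic_id, vmpl), forcing an explicit delete first. */
    static uint64_t core_create_vcpu(uint32_t apic_id, uint8_t vmpl,
                                     uint64_t vmsa_gpa)
    {
        struct vmsa_slot *slot = &vmsa_table[apic_id][vmpl];

        if (slot->present)
            return SVSM_ERR_INVALID_PARAMETER;  /* or a dedicated "exists" code */

        /* validate the page, check EFER.SVME, set the RMP VMSA attribute, ... */
        slot->gpa     = vmsa_gpa;
        slot->present = true;
        return SVSM_SUCCESS;
    }

    /* Proposal for 3): with no higher-level VMSA left for this vCPU, the SVSM
     * parks in HLT and re-checks after every wakeup whether a new VMSA was
     * created (by a CREATE_VCPU issued from another vCPU). */
    static void svsm_park_until_vmsa(uint32_t apic_id, uint8_t vmpl)
    {
        while (!vmsa_table[apic_id][vmpl].present)
            __asm__ volatile("hlt");
    }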

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-02-21 22:07 UTC (permalink / raw)
To: Jörg Rödel
Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On 2/15/23 03:49, Jörg Rödel wrote:
> Hi Tom,
>
> On Wed, Feb 08, 2023 at 03:55:34PM -0600, Tom Lendacky wrote:
>> Attached is an updated draft version of the SVSM specification with feedback
>> incorporated from the previous review. Please take a look and reply with any
>> feedback you may have.
>
> Thanks for your work on the new specification. I was reading through and
> have some thoughts on the handling of SVSM_CORE_CREATE_VCPU and
> SVSM_CORE_DELETE_VCPU calls. I think a few corner cases with these
> commands should be better defined.
>
> The underlying problem is that the SVSM can only handle one VMSA per
> (apic_id, vmpl) combination. So what happens if:
>
> 1) A new VCPU is created for an (apic_id, vmpl) tuple where
>    there is already a VMSA, should the call fail or should the
>    existing VMSA be implicitly shut down and replaced by the new
>    one? The latter case can fail if the existing VMSA is
>    currently in use (depending on how it is implemented).

VCPU create doesn't actually replace the VMSA from the hypervisor point of
view. The guest is still required to invoke the GHCB interface to notify
the hypervisor.

The SVSM PoC leaves any current VMSA in the list of all VMSAs, but does
replace the current / in-use version of the VMSA for the specified VCPU.
This is used to examine the register associated with a request, so the
guest can shoot itself in the foot if it doesn't switch that VCPU to the
new VMSA.

I'm ok with either doing that or, as you suggest below, checking if there's
an existing VMSA for the VMPL level and failing the request until the
existing VMSA is deleted. That does require the deletion/creation to be
offloaded to another VCPU, though (unless we allow the self-deletion change
talked about below).

>
> 2) Who can delete VCPUs? Can a (apic_id, vmpl) VMSA only be
>    deleted from a lower VMPL running on the same apic_id? Or can
>    other apic_ids delete it too?

Nothing prevents one VCPU from deleting another VCPU - although I think we
should probably prevent a lower VMPL from deleting a higher VMPL (e.g. a
VMPL2 VMSA can't delete (or add, for that matter) a VMPL1 VMSA). That
should be added to the specification.

It might be desirable to allow another APIC to delete a VMSA. Maybe kexec
processing would be able to benefit from that.

>
> 3) What happens if a given VMSA for a VCPU is deleted? Should
>    that be not allowed or should the SVSM gracefully handle
>    that?

As the specification is written now, switching from the SVSM would result
in the VMRUN instruction failing for VMPL1, which will shut down the guest.
Again, I'm ok with either allowing that or, as you suggest below, if there
is no VMPL1 VMSA, just enter HLT in the SVSM code.

>
> I have some thoughts on how to handle these cases, but want to put that
> up for discussion here. For 1) I think it is best to fail VCPU creation
> if there is already one for the (apic_id, vmpl) combination, requiring
> the guest to delete the VCPU first. This is related to 3) where I think
> the SVSM should just go into HLT when there is no higher-level VMSA
> anymore. When it wakes up from HLT it can check whether a VMSA was
> created and continue executing it.

For 1), that is an option.

For 3), if the SVSM goes into HLT, it will require another VCPU to perform
a VCPU create, which should result in the VMPL1 VMSA being run for the
VCPU. That would then imply that coming out of HLT in the SVSM would be for
a protocol request. So this is definitely possible.

>
> For 2) I am a bit undecided, letting VCPUs only delete themselves would
> simplify things on the SVSM side. Allowing other VCPUs to delete VMSAs
> needs quite some synchronization or could fail entirely if the VMSA is
> currently executing.

Should the clearing of the SVME bit fail, then a FAIL_INUSE return code
would be returned. I wouldn't think that the synchronization would be that
difficult, but I'm not sure. If we do allow self-deletion, then, I agree,
that would work nicely.

Thanks,
Tom

>
> Thoughts?
>
> Regards,
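
As an illustration of the retire step referred to above (clear EFER.SVME in
the target VMSA, then drop the RMP VMSA attribute), a rough C sketch
follows. The RMPADJUST encoding and the VMSA attribute bit follow the
public Linux SEV-SNP code; the struct layout, the map_vmsa() helper and the
error constants are assumptions, and the in-use case is exactly the open
question discussed in the rest of this thread.

    #include <stdint.h>

    #define SVSM_SUCCESS            0             /* assumed */
    #define SVSM_ERR_FAIL_INUSE     0x80000003    /* placeholder value */
    #define EFER_SVME               (1ULL << 12)
    #define RMPADJUST_VMSA_PAGE_BIT (1ULL << 16)  /* VMSA attribute in RDX */

    /* RMPADJUST: RAX = target VA, RCX = page size (0 = 4K), RDX = attributes.
     * The .byte sequence is the rmpadjust encoding, used in case the
     * assembler lacks the mnemonic. */
    static int rmpadjust(uint64_t vaddr, uint64_t page_size, uint64_t attrs)
    {
        int rc;

        __asm__ volatile(".byte 0xf3, 0x0f, 0x01, 0xfe"
                         : "=a"(rc)
                         : "a"(vaddr), "c"(page_size), "d"(attrs)
                         : "memory", "cc");
        return rc;
    }

    struct vmsa {           /* only the field used here; layout illustrative */
        uint64_t efer;
        /* ... */
    };

    struct vmsa *map_vmsa(uint64_t gpa);   /* hypothetical helper */

    /* Sketch of a DELETE_VCPU handler. */
    static uint64_t core_delete_vcpu(uint64_t vmsa_gpa, uint8_t target_vmpl)
    {
        struct vmsa *vmsa = map_vmsa(vmsa_gpa);

        /* Retire the VMSA: once EFER.SVME is clear the hypervisor can no
         * longer VMRUN it.  As discussed later in the thread, if the VMSA is
         * executing right now its RMP entry is locked and this write takes a
         * #NPF instead of retiring. */
        vmsa->efer &= ~EFER_SVME;

        /* Drop the VMSA attribute so the page becomes an ordinary private
         * page.  Bits 7:0 carry the target VMPL; bit 16 (VMSA) is left clear. */
        if (rmpadjust((uint64_t)(uintptr_t)vmsa, 0, target_vmpl))
            return SVSM_ERR_FAIL_INUSE;

        return SVSM_SUCCESS;
    }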

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jörg Rödel @ 2023-02-24 14:15 UTC (permalink / raw)
To: Tom Lendacky; +Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On Tue, Feb 21, 2023 at 04:07:46PM -0600, Tom Lendacky wrote:
> VCPU create doesn't actually replace the VMSA from the hypervisor point of
> view. The guest is still required to invoke the GHCB interface to notify the
> hypervisor.
>
> The SVSM PoC leaves any current VMSA in the list of all VMSAs, but does
> replace the current / in-use version of the VMSA for the specified VCPU.
> This is used to examine the register associated with a request, so the guest
> can shoot itself in the foot if it doesn't switch that VCPU to the new VMSA.
>
> I'm ok with either doing that or, as you suggest below, checking if there's
> an existing VMSA for the VMPL level and failing the request until the
> existing VMSA is deleted. That does require the deletion/creation to be
> offloaded to another VCPU, though (unless we allow the self-deletion change
> talked about below).

Yeah, so the current protocol expects a bit too much from the code
running at VMPL1 for my personal taste. I think it would be better if we
put some constraints into the handling of CREATE_VCPU and DELETE_VCPU
calls to keep the number of pages marked as VMSA as small as possible.

With the current semantics the guest is required to call DELETE_VCPU on
any VMSA it no longer uses. But it is not required to do so. Also there
is a race window between switching to the new VMSA and deleting the old
one where the HV could run the old and the new VMSA in parallel.

The SVSM will not get to see any old VMSA before it is retired (e.g. to
clear EFER.SVME and make it invalid) until the DELETE_VCPU call, but at
that time the new VMSA could already be running.

I don't think we can fully prevent such attacks, but we can make them
more complicated by having some constraints on who and when VCPUs can be
created and deleted.

In particular:

1) VCPUs can only delete themselves. Exception is VCPU 0
   (or whichever VCPU the BSP is).

2) VCPUs can only be created for same or higher VMPLs on any
   VCPU in the guest, given that there is no VMSA configured for
   this (VCPU, VMPL) combination.

These are constraints the SVSM can enforce and it should keep the number
of VMSAs in the system minimal.

It requires some changes to the current OVMF patches for SNP and SVSM,
though.

> It might be desirable to allow another APIC to delete a VMSA. Maybe kexec
> processing would be able to benefit from that.

Yeah, maybe. On the other hand, kexec would benefit if VCPUs delete
themselves in the old kernel. Otherwise the new kernel needs to find the
VMSA GPAs and delete them manually.

There is already a code-path for kexec/kdump where the non-boot CPUs are
parked. This could be instrumented. If a VCPU deletes itself the SVSM
will go into HLT until a new VCPU is created.

> Should the clearing of the SVME bit fail, then a FAIL_INUSE return code
> would be returned. I wouldn't think that the synchronization would be that
> difficult, but I'm not sure. If we do allow self-deletion, then, I agree,
> that would work nicely.

The question here is, can we reliably detect within the SVSM that
clearing EFER.SVME failed?

Regards,

--
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146, 90461 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
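
A compact sketch of what enforcing constraints 1) and 2) inside the SVSM
could look like, again as illustrative C only; the BSP exception is read
here as "the BSP may also delete other vCPUs", which the wording above
leaves open.

    #include <stdbool.h>
    #include <stdint.h>

    /* 2) A VMSA may only be created for the caller's own VMPL or a less
     *    privileged (numerically higher) one, and only if no VMSA exists
     *    yet for that (vCPU, VMPL) combination. */
    static bool create_allowed(uint8_t caller_vmpl, uint8_t target_vmpl,
                               bool slot_occupied)
    {
        return target_vmpl >= caller_vmpl && !slot_occupied;
    }

    /* 1) A vCPU may only delete itself; the BSP is the exception and may
     *    delete other vCPUs as well. */
    static bool delete_allowed(uint32_t caller_apic_id, uint32_t target_apic_id,
                               uint32_t bsp_apic_id)
    {
        return caller_apic_id == target_apic_id || caller_apic_id == bsp_apic_id;
    }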

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jon Lange @ 2023-02-24 19:02 UTC (permalink / raw)
To: Jörg Rödel, Tom Lendacky
Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

-----Original Message-----
From: AMD-SEV-SNP <amd-sev-snp-bounces+jlange=microsoft.com@lists.suse.com> On Behalf Of Jörg Rödel
Sent: Friday, February 24, 2023 6:15 AM

> The question here is, can we reliably detect within the SVSM that
> clearing EFER.SVME failed?

If the target VMSA is running, then the attempt by the SVSM to access it
will generate #NPF since the RMP entry for the running VMSA is modified to
a locked state while the VMSA is running. This means that if the SVSM
observes that the instruction that clears EFER.SVME retires, then it must
have succeeded, and the next time the VMSA is selected to run, it will
fail with a non-canonical VM state.

-Jon

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jörg Rödel @ 2023-02-25 6:33 UTC (permalink / raw)
To: Jon Lange
Cc: Tom Lendacky, linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On Fri, Feb 24, 2023 at 07:02:21PM +0000, Jon Lange wrote:
> If the target VMSA is running, then the attempt by the SVSM to access
> it will generate #NPF since the RMP entry for the running VMSA is
> modified to a locked state while the VMSA is running. This means that
> if the SVSM observes that the instruction that clears EFER.SVME
> retires, then it must have succeeded, and the next time the VMSA is
> selected to run, it will fail with a non-canonical VM state.

Yes, that #NPF happening is what Tom told me too. That is a bit
unfortunate, though, since that #NPF gives control to the HV and we
don't have a way to handle the situation within the SVSM. Best case is
that the HV kills the VM when this happens, worst case is that it loops
over that #NPF as long as the VMSA is running.

In any case the SVSM will not _know_ that the VMSA was busy and can not
return the FAIL_INUSE error to the higher VMPL. We can partially work
around that by manually tracking the busy state within the SVSM. But
that doesn't protect from a malicious HV which randomly executes VMSAs.

Anyway, that's the way it is, I guess.

Regards,

--
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146, 90461 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jon Lange @ 2023-02-27 17:03 UTC (permalink / raw)
To: Jörg Rödel
Cc: Tom Lendacky, linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

-----Original Message-----
From: Jörg Rödel <jroedel@suse.de>
Sent: Friday, February 24, 2023 10:33 PM

> On Fri, Feb 24, 2023 at 07:02:21PM +0000, Jon Lange wrote:
>> If the target VMSA is running, then the attempt by the SVSM to access
>> it will generate #NPF since the RMP entry for the running VMSA is
>> modified to a locked state while the VMSA is running. This means that
>> if the SVSM observes that the instruction that clears EFER.SVME
>> retires, then it must have succeeded, and the next time the VMSA is
>> selected to run, it will fail with a non-canonical VM state.

> Yes, that #NPF happening is what Tom told me too. That is a bit
> unfortunate, though, since that #NPF gives control to the HV and we
> don't have a way to handle the situation within the SVSM. Best case is
> that the HV kills the VM when this happens, worst case is that it loops
> over that #NPF as long as the VMSA is running.

It's true that the SVSM will not know if it is under attack in these
circumstances. However, delivery of the #NPF prevents the SVSM from making
any forward progress, which is equivalent to a denial of service. My
opinion has always been that defining graceful behavior in the face of a
malicious VMM is relatively unimportant as long as it is impossible for the
VM to proceed in violation of any architectural assumptions that it might
be making. That should certainly be guaranteed here, as the SVSM is
guaranteed to receive positive proof that a higher VMPL is not executing
and cannot execute further until it approves, so it should be insulated
from surprises.

> In any case the SVSM will not _know_ that the VMSA was busy and can not
> return the FAIL_INUSE error to the higher VMPL. We can partially work
> around that by manually tracking the busy state within the SVSM. But
> that doesn't protect from a malicious HV which randomly executes VMSAs.

Is there a reason a failure code is useful instead of just blocking (or
spinning) in the SVSM until execution can complete successfully? This
situation can only arise in the face of a malicious VMM, and such blocking
is no different than the (malicious) VMM choosing not to schedule the VM
at all. Is there something uniquely useful the higher VMPL would achieve
from observing a failure code instead of just waiting for completion?

-Jon

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.61
From: Jörg Rödel @ 2023-03-01 8:56 UTC (permalink / raw)
To: Jon Lange
Cc: Tom Lendacky, linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

Hi Jon,

On Mon, Feb 27, 2023 at 05:03:07PM +0000, Jon Lange wrote:
> It's true that the SVSM will not know if it is under attack in these
> circumstances. However, delivery of the #NPF prevents the SVSM from
> making any forward progress, which is equivalent to a denial of
> service.

Right, I sometimes forget that there is no guarantee for forward
progress, just for confidentiality in SEV-SNP VMs (and CoCo VMs in
general).

The interesting question here is how RMPADJUST behaves when it tries to
revoke the VMSA flag of a VMSA page that is currently executing.
Documentation does not state an IN_USE error code return, so I guess it
will also cause an #NPF.

> Is there a reason a failure code is useful instead of just blocking
> (or spinning) in the SVSM until execution can complete successfully?
> This situation can only arise in the face of a malicious VMM, and such
> blocking is no different than the (malicious) VMM choosing not to
> schedule the VM at all. Is there something uniquely useful the higher
> VMPL would achieve from observing a failure code instead of just
> waiting for completion?

It would allow the VM to make forward progress, but in a situation where
it is under attack this is probably a bad idea anyway.

After some more thinking I came to the conclusion that the VM must make
sure that to-be-retired VMSAs point to a code-path where it can not
escape anymore, like an AP-HLT loop with a zeroed IDT. This is important
to consider also for higher-level VMPLs.

Regards,

--
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146, 90461 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
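
On the guest side, such a "no escape" code path could look like the sketch
below: interrupts disabled, an empty IDT loaded, and a HLT loop, so that
any attempt to drive the retired context somewhere else ends in a fault
escalation and shutdown rather than useful execution. Illustrative C with
inline assembly, not taken from any of the code bases mentioned in this
thread.

    #include <stdint.h>

    struct idt_desc {
        uint16_t limit;
        uint64_t base;
    } __attribute__((packed));

    /* Park this vCPU before its VMSA is retired. */
    static void __attribute__((noreturn)) park_self(void)
    {
        static const struct idt_desc empty_idt = { 0, 0 };

        __asm__ volatile("cli");
        __asm__ volatile("lidt %0" : : "m"(empty_idt));

        for (;;)
            __asm__ volatile("hlt");
    }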

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-03-01 14:00 UTC (permalink / raw)
To: Jörg Rödel, Jon Lange
Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On 3/1/23 02:56, Jörg Rödel wrote:
> Hi Jon,
>
> On Mon, Feb 27, 2023 at 05:03:07PM +0000, Jon Lange wrote:
>> It's true that the SVSM will not know if it is under attack in these
>> circumstances. However, delivery of the #NPF prevents the SVSM from
>> making any forward progress, which is equivalent to a denial of
>> service.
>
> Right, I sometimes forget that there is no guarantee for forward
> progress, just for confidentiality in SEV-SNP VMs (and CoCo VMs in
> general).
>
> The interesting question here is how RMPADJUST behaves when it tries to
> revoke the VMSA flag of a VMSA page that is currently executing.
> Documentation does not state an IN_USE error code return, so I guess it
> will also cause an #NPF.

We should not be in the situation where that occurs, since the RMPADJUST
won't be performed until the SVME flag is cleared. Once the SVME flag is
cleared, the VMSA can't execute so the RMPADJUST should just work.

But, yes, it would be interesting to see what happens. I might try to test
that just to see the result.

Thanks,
Tom

>
>> Is there a reason a failure code is useful instead of just blocking
>> (or spinning) in the SVSM until execution can complete successfully?
>> This situation can only arise in the face of a malicious VMM, and such
>> blocking is no different than the (malicious) VMM choosing not to
>> schedule the VM at all. Is there something uniquely useful the higher
>> VMPL would achieve from observing a failure code instead of just
>> waiting for completion?
>
> It would allow the VM to make forward progress, but in a situation where
> it is under attack this is probably a bad idea anyway.
>
> After some more thinking I came to the conclusion that the VM must make
> sure that to-be-retired VMSAs point to a code-path where it can not
> escape anymore, like an AP-HLT loop with a zeroed IDT. This is important
> to consider also for higher-level VMPLs.
>
> Regards,

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-03-01 15:00 UTC (permalink / raw)
To: Jörg Rödel
Cc: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On 2/24/23 08:15, Jörg Rödel wrote:
> On Tue, Feb 21, 2023 at 04:07:46PM -0600, Tom Lendacky wrote:
>> VCPU create doesn't actually replace the VMSA from the hypervisor point of
>> view. The guest is still required to invoke the GHCB interface to notify the
>> hypervisor.
>>
>> The SVSM PoC leaves any current VMSA in the list of all VMSAs, but does
>> replace the current / in-use version of the VMSA for the specified VCPU.
>> This is used to examine the register associated with a request, so the guest
>> can shoot itself in the foot if it doesn't switch that VCPU to the new VMSA.
>>
>> I'm ok with either doing that or, as you suggest below, checking if there's
>> an existing VMSA for the VMPL level and failing the request until the
>> existing VMSA is deleted. That does require the deletion/creation to be
>> offloaded to another VCPU, though (unless we allow the self-deletion change
>> talked about below).
>
> Yeah, so the current protocol expects a bit too much from the code
> running at VMPL1 for my personal taste. I think it would be better if we
> put some constraints into the handling of CREATE_VCPU and DELETE_VCPU
> calls to keep the number of pages marked as VMSA as small as possible.
>
> With the current semantics the guest is required to call DELETE_VCPU on
> any VMSA it no longer uses. But it is not required to do so. Also there
> is a race window between switching to the new VMSA and deleting the old
> one where the HV could run the old and the new VMSA in parallel.
>
> The SVSM will not get to see any old VMSA before it is retired (e.g. to
> clear EFER.SVME and make it invalid) until the DELETE_VCPU call, but at
> that time the new VMSA could already be running.

We could specify that any vCPU create where there's an existing VMSA at the
VMPL level specified results in the existing VMSA being "deleted" (an
implicit call to delete vCPU) before performing the create vCPU. If the old
VMSA is executing, this would prevent the creation until the old VMSA can
be deleted.

The guest could get itself in trouble no matter what if it does not quiesce
the target vCPU before either calling delete or create vCPU. So I think the
above would work ok.

>
> I don't think we can fully prevent such attacks, but we can make them
> more complicated by having some constraints on who and when VCPUs can be
> created and deleted.
>
> In particular:
>
> 1) VCPUs can only delete themselves. Exception is VCPU 0
>    (or whichever VCPU the BSP is).

I don't think we should have this limitation. We don't have the limitation
today when the kernel is running at VMPL0, so not sure we should have it
here. If the guest wants to do it, then it is really up to them as I don't
think it's an attack against the SVSM, only against itself.

>
> 2) VCPUs can only be created for same or higher VMPLs on any
>    VCPU in the guest, given that there is no VMSA configured for
>    this (VCPU, VMPL) combination.

Agreed, we need to verify that the caller is not trying to create a VMSA at
a higher VMPL privilege level than it is currently running at.

>
> These are constraints the SVSM can enforce and it should keep the number
> of VMSAs in the system minimal.
>
> It requires some changes to the current OVMF patches for SNP and SVSM,
> though.
>
>> It might be desirable to allow another APIC to delete a VMSA. Maybe kexec
>> processing would be able to benefit from that.
>
> Yeah, maybe. On the other hand, kexec would benefit if VCPUs delete
> themselves in the old kernel. Otherwise the new kernel needs to find the
> VMSA GPAs and delete them manually.
>
> There is already a code-path for kexec/kdump where the non-boot CPUs are
> parked. This could be instrumented. If a VCPU deletes itself the SVSM
> will go into HLT until a new VCPU is created.

Yes, this would need to be documented in the spec to be sure we get the
proper behavior on a self deletion.

Thanks,
Tom

>
>> Should the clearing of the SVME bit fail, then a FAIL_INUSE return code
>> would be returned. I wouldn't think that the synchronization would be that
>> difficult, but I'm not sure. If we do allow self-deletion, then, I agree,
>> that would work nicely.
>
> The question here is, can we reliably detect within the SVSM that
> clearing EFER.SVME failed?
>
> Regards,
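
The implicit-delete semantics described above would look roughly like this
on the SVSM side; same illustrative conventions as the earlier sketches,
with vmsa_present(), delete_existing_vmsa() and install_vmsa() as assumed
helpers.

    #include <stdbool.h>
    #include <stdint.h>

    #define SVSM_SUCCESS 0   /* assumed */

    bool     vmsa_present(uint32_t apic_id, uint8_t vmpl);
    uint64_t delete_existing_vmsa(uint32_t apic_id, uint8_t vmpl);
    uint64_t install_vmsa(uint32_t apic_id, uint8_t vmpl, uint64_t vmsa_gpa);

    /* CREATE_VCPU with an implicit delete: retire any existing VMSA for the
     * (apic_id, vmpl) slot first, then install the new one.  If the old VMSA
     * is currently executing, the retire step fails and so does the create. */
    static uint64_t core_create_vcpu_implicit(uint32_t apic_id, uint8_t vmpl,
                                              uint64_t new_vmsa_gpa)
    {
        uint64_t rc;

        if (vmsa_present(apic_id, vmpl)) {
            rc = delete_existing_vmsa(apic_id, vmpl);   /* implicit delete */
            if (rc != SVSM_SUCCESS)
                return rc;                              /* e.g. FAIL_INUSE */
        }
        return install_vmsa(apic_id, vmpl, new_vmsa_gpa);
    }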

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Tom Lendacky @ 2023-02-15 14:57 UTC (permalink / raw)
To: linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com

On 2/8/23 15:55, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with
> feedback incorporated from the previous review. Please take a look and
> reply with any feedback you may have.

There's a typo in the new version of the spec under the
SVSM_CORE_CREATE_VCPU call. The bullet:

  If VMSA.EFER.SVME == 1, the call will return SVSM_ERR_INVALID_PARAMETER

should read:

  If VMSA.EFER.SVME != 1, the call will return SVSM_ERR_INVALID_PARAMETER

Thanks for spotting that Joerg!

Tom

>
> Thanks,
> Tom
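
With the corrected wording, the guest has to populate EFER.SVME in the new
VMSA before issuing SVSM_CORE_CREATE_VCPU. A small guest-side sketch
follows; the structure lists only the fields touched here and is not the
real VMSA layout.

    #include <stdint.h>

    #define EFER_SVME (1ULL << 12)

    struct vmsa_fields {    /* illustrative subset of the VMSA */
        uint64_t efer;
        uint64_t rip;
        /* segment registers, CR0/CR3/CR4, SEV_FEATURES, ... */
    };

    /* Per the corrected bullet, the SVSM rejects a VMSA whose EFER.SVME is
     * not already set, so set it while preparing the new context. */
    static void prepare_ap_vmsa(struct vmsa_fields *vmsa, uint64_t entry_rip)
    {
        vmsa->efer = EFER_SVME;
        vmsa->rip  = entry_rip;
    }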

* Re: SVSM Attestation and vTPM specification additions - v0.61
From: Dov Murik @ 2023-03-06 10:33 UTC (permalink / raw)
To: Tom Lendacky, linux-coco@lists.linux.dev, amd-sev-snp@lists.suse.com
Cc: Dov Murik

Hi Tom,

On 08/02/2023 23:55, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with
> feedback incorporated from the previous review. Please take a look and
> reply with any feedback you may have.

In section 7.1 (SVSM_ATTEST_SERVICES Call), Table 11 lists the field
"Attestation report buffer size (in bytes)", but this field is not used
in the description of the operation.

There's also no register defined to return the expected length of the
report itself in case the supplied report buffer is too small.

I suggest adding something like:

  If the size of the SNP attestation report exceeds the size of the
  supplied attestation report buffer, R8 will be set to the size of the
  attestation report and the call will return SVSM_ERR_INVALID_PARAMETER.

(and fill RCX and RDX too in the response?)

-Dov
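
If something along those lines is adopted, the guest-side consumer would
follow the usual probe-and-retry pattern sketched below. Everything in it
is an assumption layered on the suggestion above: the svsm_attest_services()
wrapper, the error values, and returning the required size through an out
parameter are not defined by the current spec.

    #include <stdint.h>
    #include <stdlib.h>

    #define SVSM_SUCCESS                0          /* assumed */
    #define SVSM_ERR_INVALID_PARAMETER  0x80000001 /* placeholder value */

    /* Hypothetical wrapper around SVSM_ATTEST_SERVICES: on a too-small
     * buffer it fails with SVSM_ERR_INVALID_PARAMETER and reports the
     * required size, per the suggestion above. */
    uint64_t svsm_attest_services(void *report_buf, uint64_t buf_size,
                                  uint64_t *required_size);

    static void *get_attestation_report(uint64_t *buf_size_out)
    {
        uint64_t size = 4096, needed = 0;
        void *buf = aligned_alloc(4096, size);

        while (buf) {
            uint64_t rc = svsm_attest_services(buf, size, &needed);

            if (rc == SVSM_SUCCESS) {
                *buf_size_out = size;       /* report starts at offset 0 */
                return buf;
            }
            if (rc != SVSM_ERR_INVALID_PARAMETER || needed <= size)
                break;                      /* real error, give up */

            free(buf);
            size = (needed + 4095) & ~4095ULL;  /* retry with the requested size */
            buf  = aligned_alloc(4096, size);
        }

        free(buf);
        return NULL;
    }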