From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B110146588 for ; Wed, 13 Nov 2024 15:58:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731513497; cv=none; b=ufs16hQ/ODP76LPIC3KQfynoTGsICxNioaFj8nIzQCUEs5lOqSPyEJYpteewBfogr0sSZB9GfNLBBcmuL6AcQqr8pzDd+B0xZO1c6SuoRMv82z+N3yLmRwBMW3IxKlcWv99MeMsNhCn9UUCnLbTwBObEAz0Avf7U2Fs5A3WnKtk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731513497; c=relaxed/simple; bh=IZNKL5dKTunqOnlhlqbELW9B4+RdNR6QwIYEmChpV5c=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=gVemwTFPEhV2BVrtYPcYwTzPjH4nqfeiGRlV+IpKp6rs1LiLubHI2Y1G0pongqAHJ7m4fdpaxaLLlt8aO16pHxXdt8T9MOQUPxSMdbnNLrdKCj5ok0m1aNkELvPQBUoZJ1/rsJvK3bArta8dndAEyDfed+FlhKyr5Vt8msCDOPc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=VjaXK6nH; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="VjaXK6nH" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-6ea863ecfe9so136955987b3.3 for ; Wed, 13 Nov 2024 07:58:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731513494; x=1732118294; darn=lists.linux.dev; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=kF/MfTQ5/fcEXQi4LYfLZfwZ71SBi7osE385WhqsuTc=; b=VjaXK6nH+vzh6oMbuD/9QNSz0KgrMDRVWY/aqURYnhtz/6MCwgNJNPWpyechZKiqGs tinuWPBBxgLFV9aQx/UZ2Qn9ZJMWPc9NtNT1ioaxYTyCmV6qzoAaf5TWx3rKtdoSuPu6 26eVJVfe9igvgWwfHaGsXo6DWIWQbKZkpvFWNp70V5E7X4UK7QlqBPAwRxu5uKaM+gxo AE1OrGHyFzILqnlu0N2xPh/bEBexc8UVFGH8GJIQad3ncqZRrAa2jrcO2o0A3vW07y7v P1zMqOlzUeZSx1OdHoSJlA98+GWkhbBHqu5TwEGPBf/pesXPoDjI6+rL7TN6nF+TGr62 2wzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731513494; x=1732118294; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kF/MfTQ5/fcEXQi4LYfLZfwZ71SBi7osE385WhqsuTc=; b=PnJSA+umVuG4FCxcDM5qlLpwvZMt8L76JqJUWWa3sCN6O8XP7r6vB7ZICwJPlN6UxN rySjkOYmYl5rZcZK90+bs6/qVWjDgyYprkvxnWlQJiNMpxJQQADgfj0P1FtQXSQtk/s2 B1w326Su0CrTmv3Rw2842QqMHRqi81LHTmVbYsLNUVSq+goKAF3AOM/K7ENCD+iA2c35 N/frjtGMsa1LmCo2uVT8JBv0Bf6Uabe5WacZewJgC0GAuiHSi8ulXEUVSDfbJIsY+KfZ lhHbMq58f6bwVc+ycbEtwFNpAxkoW041gSpijINprSCo+kSjYJYxgYTY8Ao/m/jf8SNr unqw== X-Forwarded-Encrypted: i=1; AJvYcCV7tC8FmQTEIAvwJTTImClsQ9dXxmWFiokfmSIeRSO5nYr9FHAIwYuVnDqK4pG1GWs5tqESRiTX6AKi@lists.linux.dev X-Gm-Message-State: AOJu0YzAi+D4tEYHqIi4I2OMfVEHlfUBTfTgZbN/7pi7TkeoxcTrj7IN bRYpJNAewofIGa1l59TBYJdMazQu51yCasiqSERFctL3a65qeOuGXLg9yAUD6ic5yjoBn/2qgs1 FGw== X-Google-Smtp-Source: AGHT+IHuQ95n+Y8jejyNGBOGXye70VLPfyNGzX3R/E/mHwLTbyQVpbG8tfrNstJjL7AVlymwbNM8htd7/LM= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:9d:3983:ac13:c240]) (user=seanjc job=sendgmr) by 2002:a81:b510:0:b0:6e2:ad08:4924 with SMTP id 00721157ae682-6eaddf71dcfmr1920297b3.4.1731513494525; Wed, 13 Nov 2024 07:58:14 -0800 (PST) Date: Wed, 13 Nov 2024 07:58:13 -0800 In-Reply-To: <20241112232253.3379178-6-dionnaglaze@google.com> Precedence: bulk X-Mailing-List: linux-coco@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241112232253.3379178-1-dionnaglaze@google.com> <20241112232253.3379178-6-dionnaglaze@google.com> Message-ID: Subject: Re: [PATCH v6 5/8] crypto: ccp: Add GCTX API to track ASID assignment From: Sean Christopherson To: Dionna Glaze Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Ashish Kalra , Tom Lendacky , John Allen , Herbert Xu , "David S. Miller" , linux-coco@lists.linux.dev, Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Michael Roth , Luis Chamberlain , Russ Weight , Danilo Krummrich , Greg Kroah-Hartman , "Rafael J. Wysocki" , Tianfei zhang , Alexey Kardashevskiy , linux-crypto@vger.kernel.org Content-Type: text/plain; charset="us-ascii" On Tue, Nov 12, 2024, Dionna Glaze wrote: > @@ -109,6 +110,10 @@ static void *sev_init_ex_buffer; > */ > static struct sev_data_range_list *snp_range_list; > > +/* SEV ASID data tracks resources associated with an ASID to safely manage operations. */ > +struct sev_asid_data *sev_asid_data; > +u32 nr_asids, sev_min_asid, sev_es_max_asid; > + > static inline bool sev_version_greater_or_equal(u8 maj, u8 min) > { > struct sev_device *sev = psp_master->sev_data; > @@ -1093,6 +1098,109 @@ static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg) > return 0; > } > > +static bool sev_check_external_user(int fd); > +void *sev_snp_create_context(int fd, int asid, int *psp_ret) > +{ > + struct sev_data_snp_addr data = {}; > + void *context; > + int rc, error; > + > + if (!sev_check_external_user(fd)) > + return ERR_PTR(-EBADF); > + > + if (!sev_asid_data) > + return ERR_PTR(-ENODEV); > + > + if (asid < 0 || asid >= nr_asids) > + return ERR_PTR(-EINVAL); > + > + /* Can't create a context for a used ASID. */ > + if (WARN_ON_ONCE(sev_asid_data[asid].snp_context)) > + return ERR_PTR(-EBUSY); Tracking contexts in an array that's indexed per ASID is unsafe and unnecessarily splits ASID management across KVM and the PSP driver. There is zero reason the PSP driver needs to care about ASIDs. Attempting to police KVM is futile, and leads to bloated, convoluted code. AFAICT, there is nothing to guard against a use-after-free in snp_update_guest_contexts(). The need to handle SEV_RET_INVALID_GUEST is a pretty big clue that there are races between KVM and firmware updates. if (!sev_asid_data[i].snp_context) continue; status_data.gctx_paddr = __psp_pa(sev_asid_data[i].snp_context); status_data.address = __psp_pa(snp_guest_status); rc = sev_do_cmd(SEV_CMD_SNP_GUEST_STATUS, &status_data, &error); if (!rc) continue; /* * Handle race with SNP VM being destroyed/decommissoned, * if guest context page invalid error is returned, * assume guest has been destroyed. */ if (error == SEV_RET_INVALID_GUEST) continue; Using an array is also inefficient, as it requires iterating over all possible ASIDs, many of which may be unused. Furthermore, handling this in the PSP driver (correctly) leads to unnecessary locking. KVM already protects SNP ASID allocations with sev_deactivate_lock, I see zero reason to complicate things with another lock. The "rollback" mechanism is also broken. If SEV_CMD_SNP_GUEST_STATUS fails, synthetic_restore_required is set and never cleared, and impacts *all* SEV PSP commands. I.e. failure to update one guest context comletely cripples the entire system. Not to mention synthetic_restore_required also lacks any form of SMP synchronication. I also don't see how a rollback is possible if an error occurs after one or more guest contexts have been updated. Presumably trying to rollback in that state will leave the updated guests in a bad state. Of course, I don't see any rollback code as nothing ever clears synthetic_restore_required, so what's intented to happen is entirely unclear. I also don't see anything in this series that explains why a SEV_CMD_SNP_GUEST_STATUS failure shouldn't be treated as a fatal error. Of the error codes listed in the SNP ABI, everything except UPDATE_FAILED is clearly a software bug. And I can't find anything that explains when UPDATE_FAILED will be returned. Table 80. Status Codes for SNP_GUEST_STATUS Status Condition SUCCESS Successful completion. INVALID_PLATFORM_STATE The platform is not in the INIT state. INVALID_ADDRESS The address is invalid for use by the firmware. INVALID_PARAM MBZ fields are not zero. INVALID_GUEST The guest context page was invalid. INVALID_PAGE_STATE The guest status page was not in the correct state. INVALID_PAGE_SIZE The guest status page was not the correct size. UPDATE_FAILED Update of the firmware internal state or a guest context page has failed. Somewhat off the cuff, I think the only sane way to approach this is to call into KVM when doing a firmware update, and let KVM react accordingly. E.g. let KVM walk its list of VMs in order to update SNP VMs, taking kvm_lock and the somewhat misnamed sev_deactivate_lock() as needed. Then if updating a guest context fails, terminate _that_ VM, and move on to the next VM. Side topic, I don't see any code that ensures no SEV or SEV-ES VMs are running. Is the idea to let userspace throw noodles at the PSP and see what sticks? + Provide support for AMD Platform Security Processor firmware. + The PSP firmware can be updated while no SEV or SEV-ES VMs are active. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + Users of this feature should be aware of the error modes that indicate + required manual rollback or reset due to instablity.