From: Tycho Andersen <tycho@kernel.org>
To: Ashish Kalra <ashish.kalra@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	John Allen <john.allen@amd.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org,
	Sean Christopherson <seanjc@google.com>,
	Kim Phillips <kim.phillips@amd.com>,
	Alexey Kardashevskiy <aik@amd.com>,
	"Tycho Andersen (AMD)" <tycho@kernel.org>,
	Nikunj A Dadhania <nikunj@amd.com>,
	"Pratik R. Sampat" <prsampat@amd.com>,
	Michael Roth <michael.roth@amd.com>
Subject: [RFC v1 6/6] crypto/ccp: Implement SNP firmware live update
Date: Thu, 30 Apr 2026 10:07:16 -0600
Message-ID: <20260430160716.1120553-7-tycho@kernel.org>
In-Reply-To: <20260430160716.1120553-1-tycho@kernel.org>

From: "Tycho Andersen (AMD)" <tycho@kernel.org>

Put all the previous primitives together to implement SNP firmware
live update via DOWNLOAD_FIRMWARE_EX.

DOWNLOAD_FIRMWARE_EX can only be run while the legacy SEV firmware is in
the UNINIT state. If the legacy firmware is in the WORKING state (running
legacy guests), refuse to update. If the legacy firmware is in the INIT
state, de-initialize it so that the update can run.
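
The state check described above can be sketched in isolation as follows.
This is only an illustrative model, not the driver code: the enum and the
function name are hypothetical stand-ins for the legacy SEV platform
states and the pre-update check.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the legacy SEV platform states. */
enum legacy_state { LEGACY_UNINIT, LEGACY_INIT, LEGACY_WORKING };

/*
 * Decide whether an update may proceed from the given legacy state.
 * Returns 0 if it may, or a negative errno if it may not.
 * *needs_shutdown is set when the legacy firmware has to be
 * de-initialized before the update is attempted.
 */
int legacy_state_precheck(enum legacy_state state, bool *needs_shutdown)
{
	*needs_shutdown = false;

	switch (state) {
	case LEGACY_UNINIT:
		return 0;		/* nothing to do */
	case LEGACY_INIT:
		*needs_shutdown = true;	/* shut down now, re-init later */
		return 0;
	case LEGACY_WORKING:
		return -EBUSY;		/* legacy guests are running */
	}
	return -EINVAL;			/* unknown state */
}
```

A caller that sees *needs_shutdown set would also need to remember to
re-initialize the legacy firmware once the update attempt has finished.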

When the firmware is installed, it is only provisionally loaded. The
kernel relies on userspace to issue ioctl(/dev/sev, SNP_COMMIT, ...) once
it is satisfied with the provisional firmware.

To roll back, userspace should not issue SNP_COMMIT, and should instead
invoke the firmware loader in the same way but with the old firmware
image. The firmware spec notes:

    If a guest context page is updated to a provisional firmware version,
    then updating the context page back to the committed version after a
    rollback will always succeed.

There are essentially five classes of errors during an update:

1. kernel bugs, which WARN_ON_ONCE()
2. invalid firmware (bad image, bad signature, downgrade too far, etc.)
3. UPDATE_FAILED: the firmware rolled back on its own and operation can
   continue normally
4. HARDWARE_UNSAFE: declare the PSP dead, since the behavior of the SEV
   firmware is undefined
5. RESTORE_REQUIRED: the firmware can only successfully execute
   DOWNLOAD_FIRMWARE_EX commands. The admin needs to load the old firmware
   image.
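
As a rough illustration of how the classes in the list above differ in
their lasting effect, here is a small standalone model. All names are
hypothetical; the real return codes come from the firmware ABI and the
real flags live in the driver.

```c
#include <stdbool.h>

/* Hypothetical result codes, not the real firmware ABI values. */
enum fw_result {
	FW_OK,
	FW_BAD_IMAGE,
	FW_BAD_SIGNATURE,
	FW_BAD_VERSION,
	FW_UPDATE_FAILED,
	FW_HARDWARE_UNSAFE,
	FW_RESTORE_REQUIRED,
	FW_UNEXPECTED_STATE,
};

struct fw_reaction {
	bool psp_dead;		/* stop issuing normal commands */
	bool allow_rollback;	/* still accept one more firmware image */
	bool kernel_bug;	/* the driver should have prevented this */
};

/* Map a result code to the sticky state the driver would record. */
struct fw_reaction fw_react(enum fw_result res)
{
	struct fw_reaction r = { false, false, false };

	switch (res) {
	case FW_UNEXPECTED_STATE:
		r.kernel_bug = true;		/* class 1 */
		break;
	case FW_HARDWARE_UNSAFE:
		r.psp_dead = true;		/* class 4 */
		break;
	case FW_RESTORE_REQUIRED:
		r.psp_dead = true;		/* class 5 */
		r.allow_rollback = true;
		break;
	default:
		/* success, or classes 2 and 3: nothing sticky to record */
		break;
	}
	return r;
}
```

Only the last two classes leave state behind; invalid images and
UPDATE_FAILED are reported to the caller and otherwise forgotten.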

There is a firmware bug where upgrades across 1.58.03 time out, even though
the upgrade actually succeeds. There is no documented way to determine the
version of the input firmware image, so this case cannot be detected before
attempting an update. Instead, look for the timeout and issue an
SNP_PLATFORM_STATUS to check whether the PSP is still alive.
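
The timeout workaround amounts to a "probe after timeout" pattern, which
can be sketched on its own like this. The callback type and the stub
functions are hypothetical and only stand in for the update command and
the status command.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback standing in for a firmware command. */
typedef int (*fw_cmd_fn)(void *ctx);

/*
 * Run the update; if it times out, probe with a status command.
 * Returns 0 when the update is considered successful, otherwise
 * the error from whichever command failed.
 */
int update_with_timeout_probe(fw_cmd_fn update, fw_cmd_fn status, void *ctx)
{
	int ret = update(ctx);

	if (ret != -ETIMEDOUT)
		return ret;

	/* A live PSP answers the probe, so the timeout was spurious. */
	return status(ctx);
}

/* Illustrative stubs only. */
int stub_ok(void *ctx)      { (void)ctx; return 0; }
int stub_timeout(void *ctx) { (void)ctx; return -ETIMEDOUT; }
int stub_fail(void *ctx)    { (void)ctx; return -EIO; }
```

With these stubs, update_with_timeout_probe(stub_timeout, stub_ok, NULL)
is treated as a success, while a failing probe propagates its own error.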

Finally, this differs from the previous implementation [1] in two ways:

1. guest context pages no longer need to be updated, as of version 1.58
   of the SEV-SNP firmware spec (doc 56860).
2. no WBINVD+DF_FLUSH is required after a firmware update, so that code is
   dropped.

[1]: https://lore.kernel.org/lkml/20241112232253.3379178-7-dionnaglaze@google.com/
Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
---
 drivers/crypto/ccp/sev-dev.c | 244 ++++++++++++++++++++++++++++++++++-
 1 file changed, 243 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index b4711bf823e8..e7fe6dbf69c2 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -90,6 +90,8 @@ MODULE_FIRMWARE("amd/amd_sev_fam19h_model1xh.sbin"); /* 4th gen EPYC */
 
 static bool psp_dead;
 static int psp_timeout;
+static bool dlfwex_wants_rollback;
+static bool sev_firmware_needs_reinit;
 
 enum snp_hv_fixed_pages_state {
 	ALLOCATED,
@@ -2046,11 +2048,251 @@ static enum fw_upload_err sev_fw_upload_prepare(struct fw_upload *fw_upload,
 	return FW_UPLOAD_ERR_NONE;
 }
 
+static int sev_download_firmware_ex(struct sev_device *sev, const u8 *data,
+				    u32 size)
+{
+	struct sev_data_download_firmware_ex sev_data = {0};
+	int ret, error = 0, order;
+	struct page *p;
+	void *fw_blob;
+
+	order = get_order(size);
+	p = alloc_pages(GFP_KERNEL, order);
+	if (!p)
+		return -ENOMEM;
+
+	fw_blob = page_address(p);
+	memcpy(fw_blob, data, size);
+
+	sev_data.len = sizeof(sev_data);
+	sev_data.fw_paddr = __psp_pa(fw_blob);
+	sev_data.fw_len = size;
+	sev_data.commit = 0;
+
+	ret = __sev_do_cmd_locked(SEV_CMD_SNP_DOWNLOAD_FIRMWARE_EX, &sev_data, &error);
+
+	/*
+	 * Quirk: firmware upgrades across 1.58.03 give ETIMEDOUT for
+	 * DLFWEX, even though the command actually succeeds. If we're
+	 * in this case, test that we can do SNP_PLATFORM_STATUS, and
+	 * if so, continue as normal.
+	 */
+	if (ret == -ETIMEDOUT) {
+		struct sev_user_data_snp_status status;
+
+		dev_info(sev->dev, "Firmware update timed out, checking status for quirk...\n");
+		psp_dead = false;
+
+		ret = __sev_do_snp_platform_status(&status, &error);
+		if (ret) {
+			dev_err(sev->dev, "SNP STATUS failed after firmware upgrade, ret = %d, error = %#x\n",
+				ret, error);
+			psp_dead = true;
+			goto out;
+		}
+	}
+
+	if (ret < 0 && error != 0)
+		ret = error;
+
+out:
+	__free_pages(p, order);
+	return ret;
+}
+
+static int sev_firmware_shutdown_if_sev_initialized(struct sev_device *sev)
+{
+	int rc, error, sev_plat_state;
+
+	lockdep_assert_held(&sev_cmd_mutex);
+
+	error = 0;
+	rc = sev_get_platform_state(&sev_plat_state, &error);
+	if (rc < 0) {
+		if (error)
+			rc = error;
+		dev_dbg(sev->dev, "SEV get platform state failed %d\n", rc);
+		return rc;
+	}
+
+	switch (sev_plat_state) {
+	case SEV_STATE_UNINIT:
+		return 0;
+	case SEV_STATE_INIT:
+		error = 0;
+		rc = __sev_platform_shutdown_locked(&error);
+		if (rc) {
+			if (error)
+				rc = error;
+			dev_err(sev->dev, "SEV platform shutdown failed %d\n", rc);
+			return rc;
+		}
+		sev_firmware_needs_reinit = true;
+		return 0;
+	case SEV_STATE_WORKING:
+		return -EBUSY;
+	default:
+		dev_err(sev->dev, "Unknown SEV firmware state: %d\n", sev_plat_state);
+		return -EINVAL;
+	}
+}
+
+static void sev_firmware_reinit_if_shutdown(struct sev_device *sev)
+{
+	int rc, error;
+
+	guard(mutex)(&sev_cmd_mutex);
+
+	if (!sev_firmware_needs_reinit)
+		return;
+
+	sev_firmware_needs_reinit = false;
+	error = 0;
+	rc = __sev_platform_init_locked(&error);
+	if (rc) {
+		if (error)
+			rc = error;
+		dev_err(sev->dev, "SEV platform re-init failed %d\n", rc);
+	}
+}
+
 static enum fw_upload_err sev_fw_upload_write(struct fw_upload *fw_upload,
 					      const u8 *data, u32 offset,
 					      u32 size, u32 *written)
 {
-	return FW_UPLOAD_ERR_BUSY;
+	struct sev_device *sev = fw_upload->dd_handle;
+	u8 old_major, old_minor, old_build;
+	int rc, error = 0;
+	enum fw_upload_err ret;
+
+	if (offset != 0)
+		return FW_UPLOAD_ERR_INVALID_SIZE;
+
+	old_major = sev->api_major;
+	old_minor = sev->api_minor;
+	old_build = sev->build;
+
+	mutex_lock(&sev_cmd_mutex);
+
+	/*
+	 * If the last firmware update returned RESTORE_REQUIRED, allow only
+	 * this DLFWEX command so the admin can restore the previous FW
+	 * version. If we are in this state the legacy firmware has previously
+	 * been shut down, so no need to do it again.
+	 */
+	if (dlfwex_wants_rollback && psp_dead) {
+		dlfwex_wants_rollback = false;
+		psp_dead = false;
+	} else {
+		rc = sev_firmware_shutdown_if_sev_initialized(sev);
+		if (rc) {
+			ret = FW_UPLOAD_ERR_BUSY;
+			goto unlock;
+		}
+	}
+
+	rc = sev_download_firmware_ex(sev, data, size);
+	if (rc) {
+		ret = FW_UPLOAD_ERR_FW_INVALID;
+		switch (rc) {
+		case SEV_RET_INVALID_PLATFORM_STATE:
+			fallthrough;
+		case SEV_RET_INVALID_ADDRESS:
+			/* these are probably kernel bugs */
+			WARN_ON_ONCE(true);
+			ret = FW_UPLOAD_ERR_BUSY;
+			goto unlock;
+		case SEV_RET_INVALID_LEN:
+			ret = FW_UPLOAD_ERR_INVALID_SIZE;
+			goto unlock;
+		case SEV_RET_INVALID_PARAM:
+			dev_err(sev->dev, "SEV firmware image is not well formed\n");
+			goto unlock;
+		case SEV_RET_SHUTDOWN_REQUIRED:
+			dev_err(sev->dev, "SEV firmware too far, shutdown required\n");
+			goto unlock;
+		case SEV_RET_INVALID_CONFIG:
+			dev_err(sev->dev, "SEV firmware upgrade would rollback SVN\n");
+			goto unlock;
+		case SEV_RET_BAD_SIGNATURE:
+			dev_err(sev->dev, "SEV firmware upgrade bad signature\n");
+			goto unlock;
+		case SEV_RET_BAD_VERSION:
+			dev_err(sev->dev, "SEV firmware upgrade less than CommittedVersion\n");
+			goto unlock;
+		case SEV_RET_UNSUPPORTED:
+			dev_err(sev->dev, "SEV firmware required feature not supported\n");
+			goto unlock;
+		case SEV_RET_UPDATE_FAILED:
+			/*
+			 * Update failed but fw rolled back on its own,
+			 * operation can continue normally.
+			 */
+			dev_err(sev->dev, "SEV firmware update failed\n");
+			ret = FW_UPLOAD_ERR_HW_ERROR;
+			goto unlock;
+		case SEV_RET_HWSEV_RET_UNSAFE:
+			/*
+			 * "Following a return of HARDWARE_UNSAFE, operation of
+			 * the SEV firmware is indeterminate
+			 * and the recommendation is to reboot the platform."
+			 */
+			dev_err(sev->dev, "SEV firmware no longer safe to operate\n");
+			psp_dead = true;
+			ret = FW_UPLOAD_ERR_HW_ERROR;
+			goto unlock;
+		case SEV_RET_RESTORE_REQUIRED:
+			/*
+			 * FW asked us to roll back; we don't hold onto the
+			 * last FW image, so we can't. We can set a flag to
+			 * allow the admin to roll back if they happen to have
+			 * the old firmware image handy.
+			 */
+			dev_err(sev->dev, "SEV firmware update failed, please roll back\n");
+			psp_dead = true;
+			dlfwex_wants_rollback = true;
+			ret = FW_UPLOAD_ERR_HW_ERROR;
+			goto unlock;
+		default:
+			dev_err(sev->dev, "Unknown SEV firmware err %d\n", rc);
+			ret = FW_UPLOAD_ERR_HW_ERROR;
+			goto unlock;
+		}
+	}
+
+	*written = size;
+	ret = FW_UPLOAD_ERR_NONE;
+
+unlock:
+	mutex_unlock(&sev_cmd_mutex);
+
+	/*
+	 * sev_get_api_version() updates the SEV and SNP statuses, SNP feature
+	 * info if available, build numbers, etc. cached in struct sev_device.
+	 * Update these if they may have changed for new firmware.
+	 */
+	if (ret == FW_UPLOAD_ERR_NONE) {
+		error = 0;
+
+		rc = sev_get_api_version();
+		if (rc) {
+			dev_warn(sev->dev,
+				 "SNP platform data refresh after firmware update failed %d\n",
+				 rc);
+		} else if (sev->api_major != old_major ||
+			   sev->api_minor != old_minor ||
+			   sev->build != old_build) {
+			dev_info(sev->dev, "SEV firmware updated to %d.%d build %d\n",
+				 sev->api_major, sev->api_minor, sev->build);
+		} else {
+			dev_info(sev->dev, "SEV firmware not updated\n");
+		}
+	}
+
+	if (!dlfwex_wants_rollback)
+		sev_firmware_reinit_if_shutdown(sev);
+
+	return ret;
 }
 
 static enum fw_upload_err sev_fw_upload_poll_complete(struct fw_upload *fw_upload)
-- 
2.54.0

