From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 014B32F3B; Fri, 28 Mar 2025 08:44:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743151471; cv=none; b=noMjuZtUJ0ueMneFM0TK3VOOaHsKkEeFI5E7tVCSxcUeGCc7T+l+pb0xof6lx8faEOa2mbNruhTx+k7WM42RvQX88PU767aZLanECyuC8TCcnj/jOEiZAhxmOvxxRJh98UDVeO4RtRtCK/t1dRXVEvZT2Q2kP219eBQlsoJYBcw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743151471; c=relaxed/simple; bh=7wWpNyqwEltt65jw58fWm4t4+DezGpJMXEf4es9UZBc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EGRXHX85iBOtapWOuynHkRUPQdovIEPFQ0XcrIgNZWs/kz6zSUQE2daCHPjhurKTMldyhTo9rMBKEMCPZEyAbYXXEkfilAY2q0Uht6QYQu3B7SHrs0vqLWDsd7kWSjS9qmc7VnqEhdTJcx1ljukHT6NF2aTkUiKUGRVjDKNMKZw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tvTNG3yn; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tvTNG3yn" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D9A74C4CEE4; Fri, 28 Mar 2025 08:44:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743151470; bh=7wWpNyqwEltt65jw58fWm4t4+DezGpJMXEf4es9UZBc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=tvTNG3ynFho3DH3Rp+RztQH5CP6x6aaH8VICHj0go9wqjxsHobhsynnpkpPy9WyRt brtW0jv8aWTkISaBem9nA3B/VxeLrGW5BlfjCgFfnIhHgf4A+rixw8/VAqAaiJYIUG bzbgSggRp6XRxuKKNjxcHr1yA9g837AW05R3Wk+aQDZMhZ0DMVSkfPqyLKjoUfeiQs 2QzqwyxmTqVMDkqKK/bKpKaouPi0kjmhc3UFv0yzvso7Euz8c+4HpSyCik9ZxYvAYc /zDcn4l75GL5HD13etXK4nh5tWqove7kr5dMUTXUr78SVy3OTLIVON43SjScQZYX+W pevz5T/lNZMHw== Date: Fri, 28 Mar 2025 10:44:25 +0200 From: Jarkko Sakkinen To: "Reshetova, Elena" Cc: "Hansen, Dave" , "linux-sgx@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "x86@kernel.org" , "Mallick, Asit K" , "Scarlata, Vincent R" , "Cai, Chong" , "Aktas, Erdem" , "Annapurve, Vishal" , "dionnaglaze@google.com" , "bondarn@google.com" , "Raynor, Scott" , "Zhang, Cathy" Subject: Re: [PATCH 4/4] x86/sgx: Implement ENCLS[EUPDATESVN] and opportunistically call it during first EPC page alloc Message-ID: References: <20250321123938.802763-1-elena.reshetova@intel.com> <20250321123938.802763-5-elena.reshetova@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Fri, Mar 28, 2025 at 08:27:51AM +0000, Reshetova, Elena wrote: > > > On Thu, Mar 27, 2025 at 03:42:30PM +0000, Reshetova, Elena wrote: > > > > > > > + case SGX_NO_UPDATE: > > > > > > > + pr_debug("EUPDATESVN was successful, but CPUSVN > > was not > > > > > > updated, " > > > > > > > + "because current SVN was not newer than > > > > > > CPUSVN.\n"); > > > > > > > + break; > > > > > > > + case SGX_EPC_NOT_READY: > > > > > > > + pr_debug("EPC is not ready for SVN update."); > > > > > > > + break; > > > > > > > + case SGX_INSUFFICIENT_ENTROPY: > > > > > > > + pr_debug("CPUSVN update is failed due to Insufficient > > > > > > entropy in RNG, " > > > > > > > + "please try it later.\n"); > > > > > > > + break; > > > > > > > + case SGX_EPC_PAGE_CONFLICT: > > > > > > > + pr_debug("CPUSVN update is failed due to > > concurrency > > > > > > violation, please " > > > > > > > + "stop running any other ENCLS leaf and try it > > > > > > later.\n"); > > > > > > > + break; > > > > > > > + default: > > > > > > > + break; > > > > > > > > > > > > Remove pr_debug() statements. > > > > > > > > > > This I am not sure it is good idea. I think it would be useful for system > > > > > admins to have a way to see that update either happened or not. > > > > > It is true that you can find this out by requesting a new SGX attestation > > > > > quote (and see if newer SVN is used), but it is not the faster way. > > > > > > > > Maybe pr_debug() is them wrong level if they are meant for sysadmins? > > > > > > > > I mean these should not happen in normal behavior like ever? As > > > > pr_debug() I don't really grab this. > > > > > > SGX_NO_UPDATE will absolutely happen normally all the time. > > > Since EUPDATESVN is executed every time EPC is empty, this is the > > > most common code you will get back (because microcode updates are rare). > > > Others yes, that would indicate some error condition. > > > So, what is the pr_level that you would suggest? > > > > Right, got it. That changes my conclusions: > > > > So I'd reformulate it like: > > > > switch (ret) { > > case 0: > > pr_info("EUPDATESVN: success\n); > > break; > > case SGX_EPC_NOT_READY: > > case SGX_INSUFFICIENT_ENTROPY: > > case SGX_EPC_PAGE_CONFLICT: > > pr_err("EUPDATESVN: error %d\n", ret); > > /* TODO: block/teardown driver? */ > > break; > > case SGX_NO_UPDATE: > > break; > > default: > > pr_err("EUPDATESVN: unknown error %d\n", ret); > > /* TODO: block/teardown driver? */ > > break; > > } > > > > Since when this is executed EPC usage is zero error cases should block > > or teardown SGX driver, presuming that they are because of either > > incorrect driver state or spurious error code. > > I agree with the above, but not sure at all about the blocking/teardown the > driver. They are all potentially temporal things and SGX_INSUFFICIENT_ENTROPY > is even outside of SGX driver control and *does not* indicate any error > condition on the driver side itself. SGX_EPC_NOT_READY and SGX_EPC_PAGE_CONFLICT > would mean we have a bug somewhere because we thought we could go > do EUDPATESVN on empty EPC and prevented anyone from creating > pages in meanwhile but looks like we missed smth. That said, I dont know if we > want to fail the whole system in case we have such a code bug, this is very > aggressive (in case it is some rare edge condition that no one knew about or > guessed). So, I would propose to print the pr_err() as you have above but > avoid destroying the driver. > Would this work? I think now is the time that you should really roll out a new version in the way you see fit and we will revisit that. I already grabbed from your example that I got some of the error codes horribly wrong :-) Still I think the draft of error planning I put is at least towards right direction. > > Best Regards, > Elena. > > > > > > If this happens, we definitely do not want service, right? > > > > I'm not sure of all error codes how serious they are, or are all of them > > consequence of incorrectly working driver. > > > > BR, Jarkko BR, Jarkko