From mboxrd@z Thu Jan 1 00:00:00 1970
From: sashiko-bot@kernel.org
Subject: Re: [PATCH v5] x86/sgx: Report RCU-Tasks quiescent state in EPC sanitization loop
Reply-To: sashiko-reviews@lists.linux.dev
To: "Jun Miao"
Cc: bpf@vger.kernel.org
In-Reply-To: <20260703084810.145567-1-jun.miao@intel.com>
References: <20260703084810.145567-1-jun.miao@intel.com>
Content-Type: text/plain; charset=utf-8
Date: Fri, 03 Jul 2026 09:00:34 +0000
Message-Id: <20260703090034.8EC611F000E9@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Thank you for your contribution! Sashiko AI review found 2 potential issue(s)
to consider:

Pre-existing issues:
- [High] The cond_resched() call in the main loop of ksgxd() was not updated
  to explicitly report RCU-tasks quiescent states, leaving it vulnerable to
  stalls.
- [High] SGX initialization error path frees EPC page arrays without removing
  their physical addresses from the global sgx_epc_address_space XArray,
  leading to Use-After-Free on hardware memory errors.
--

commit 71875f4f4ed3656160e4c510add3ac4a68c4cd86
Author: Jun Miao

x86/sgx: Report RCU-Tasks quiescent state in EPC sanitization loop

When the kernel boots from kexec, the EPC pages may have a stale state.
The kernel sanitizes all EPC pages to reset them to a clean state before
their first use in any enclave. The EPC size could be several GBs and
resetting them could take a significant amount of time.

Link: https://github.com/systemd/systemd/issues/40423

> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index dc73194416ac8..545415ee98177 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -103,7 +103,7 @@ static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list)
>  			left_dirty++;
>  		}
>  
> -		cond_resched();
> +		cond_resched_tasks_rcu_qs();
>  	}
>  
>  	list_splice(&dirty, dirty_page_list);

[Severity: High]
This isn't a bug introduced by this patch, but looking at the main loop in
ksgxd() in the same file:

arch/x86/kernel/cpu/sgx/main.c:ksgxd() {
    ...
    if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
        sgx_reclaim_pages();

    cond_resched();
    ...
}

Could this cond_resched() also starve RCU-tasks grace periods if
sgx_should_reclaim() remains continuously true under heavy SGX memory
pressure?

Since wait_event_freezable() won't sleep in this case, would it make sense
to update this to cond_resched_tasks_rcu_qs() as well?

[Severity: High]
This is a pre-existing issue, but does the error path in sgx_init() leave
dangling pointers in the global sgx_epc_address_space XArray?

arch/x86/kernel/cpu/sgx/main.c:sgx_init() {
    ...
err_page_cache:
    for (i = 0; i < sgx_nr_epc_sections; i++) {
        vfree(sgx_epc_sections[i].pages);
        memunmap(sgx_epc_sections[i].virt_addr);
    }

    return ret;
}

If sgx_page_cache_init() succeeds (which populates the XArray) but a later
initialization step fails, the section->pages arrays are freed here without
being removed from the XArray.

If a hardware memory error occurs later, arch_memory_failure() looks up the
PFN via sgx_paddr_to_page():

arch/x86/kernel/cpu/sgx/main.c:sgx_paddr_to_page() {
    ...
    section = xa_load(&sgx_epc_address_space, paddr);
    if (!section)
        return NULL;

    return &section->pages[PFN_DOWN(paddr - section->phys_addr)];
}

Could this lead to a use-after-free in arch_memory_failure() when it
dereferences the returned page pointer?

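To illustrate the ordering concern in the second finding, here is a minimal
userspace sketch, not kernel code. It assumes a small fixed-size table as a
stand-in for the sgx_epc_address_space XArray, and every model_* name is
hypothetical and does not exist in the kernel. The point it tries to show is
that a lookup map should stop pointing at a section before that section's
page array is freed, so a later lookup returns NULL instead of a dangling
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_MAX_SECTIONS 8

struct model_page {
	int flags;
};

struct model_section {
	uint64_t phys_addr;
	size_t nr_pages;
	struct model_page *pages;
};

/* Stand-in for the global address-to-section map (assumption). */
static struct model_section *model_map[MODEL_MAX_SECTIONS];

/* Allocate the page array, then publish the section in the map. */
static int model_section_init(struct model_section *s, int slot,
			      uint64_t phys_addr, size_t nr_pages)
{
	assert(slot >= 0 && slot < MODEL_MAX_SECTIONS);

	s->pages = calloc(nr_pages, sizeof(*s->pages));
	if (!s->pages)
		return -1;

	s->phys_addr = phys_addr;
	s->nr_pages = nr_pages;
	model_map[slot] = s;
	return 0;
}

/* Look up the page that covers paddr, or NULL if no section is published. */
static struct model_page *model_paddr_to_page(uint64_t paddr)
{
	for (int i = 0; i < MODEL_MAX_SECTIONS; i++) {
		struct model_section *s = model_map[i];
		uint64_t size;

		if (!s)
			continue;

		size = (uint64_t)s->nr_pages << MODEL_PAGE_SHIFT;
		if (paddr >= s->phys_addr && paddr < s->phys_addr + size)
			return &s->pages[(paddr - s->phys_addr) >> MODEL_PAGE_SHIFT];
	}
	return NULL;
}

/* Error-path teardown: unpublish first, free second. */
static void model_section_teardown(struct model_section *s, int slot)
{
	assert(slot >= 0 && slot < MODEL_MAX_SECTIONS);

	model_map[slot] = NULL;	/* lookups can no longer reach s->pages */
	free(s->pages);
	s->pages = NULL;
	s->nr_pages = 0;
}
```

In this model, calling model_paddr_to_page() after model_section_teardown()
returns NULL for addresses in the torn-down section. Whether the equivalent
cleanup in sgx_init() is best done per section or for the whole XArray is a
decision for the patch author.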
-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703084810.145567-1-jun.miao@intel.com?part=1