From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 36F1C3382C7; Mon, 20 Apr 2026 15:45:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776699945; cv=none; b=TnT3I3hji7un7gv4zddwVQaQJlMx1M8VJJqzSCmDMBdhf0te3/U0PblngyEDlV24/MK5GjMTNHU/vURLKfBXg3ATlkey0LL7f2NekQkJOkQ8HkQiFrlwNxO97sYDIjj0ZUFAW1uLz05X696HsUIPmHQa+cjC/R94U5axvYDzJvg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776699945; c=relaxed/simple; bh=uKnQUw33ggBCLzscrat35dAJtlyk6M00CeLGefWQC3c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=byid73UNZfVIS63+2AR/mbfaUgCuAQhHLCDyckdmr32pYuDwTtV6TZnwqr4nvDrYVeINIEbVUqxZ1GrAmybEctYv4MVUU/L3fP6NNnltkfccHOUNUhs91aLMRv2MlKEiKdZuLtwZPLo+sosanQ+WVqiDGSbCkxCGQUakfPZKFjs= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=ejijzBfG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="ejijzBfG" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 842DCC2BCB4; Mon, 20 Apr 2026 15:45:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1776699944; bh=uKnQUw33ggBCLzscrat35dAJtlyk6M00CeLGefWQC3c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ejijzBfGI6tX1yo2hCGkfNtVwrtqrHdZYqp/038xP/2Xhs0RbJcXghj6F1G3VPySl 3yoIpN58WXt3Y4hGSKuJwV1Li+Ea4WYnxBAbNT9pu1q1ZKfjdYV5FwPlSmKckSNciU /CWUr0diPQxfGXkUu+Etc9Nrew9dq5ZCS7sVw18w= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Linus Torvalds Subject: [PATCH 7.0 58/76] x86-64: rename misleadingly named __copy_user_nocache() function Date: Mon, 20 Apr 2026 17:42:09 +0200 Message-ID: <20260420153912.931958705@linuxfoundation.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260420153910.810034134@linuxfoundation.org> References: <20260420153910.810034134@linuxfoundation.org> User-Agent: quilt/0.69 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 7.0-stable review patch. If anyone has any objections, please let me know. ------------------ From: Linus Torvalds commit d187a86de793f84766ea40b9ade7ac60aabbb4fe upstream. This function was a masterclass in bad naming, for various historical reasons. It claimed to be a non-cached user copy. It is literally _neither_ of those things. It's a specialty memory copy routine that uses non-temporal stores for the destination (but not the source), and that does exception handling for both source and destination accesses. Also note that while it works for unaligned targets, any unaligned parts (whether at beginning or end) will not use non-temporal stores, since only words and quadwords can be non-temporal on x86. The exception handling means that it _can_ be used for user space accesses, but not on its own - it needs all the normal "start user space access" logic around it. But typically the user space access would be the source, not the non-temporal destination. That was the original intention of this, where the destination was some fragile persistent memory target that needed non-temporal stores in order to catch machine check exceptions synchronously and deal with them gracefully. Thus that non-descriptive name: one use case was to copy from user space into a non-cached kernel buffer. However, the existing users are a mix of that intended use-case, and a couple of random drivers that just did this as a performance tweak. Some of those random drivers then actively misused the user copying version (with STAC/CLAC and all) to do kernel copies without ever even caring about the exception handling, _just_ for the non-temporal destination. Rename it as a first small step to actually make it halfway sane, and change the prototype to be more normal: it doesn't take a user pointer unless the caller has done the proper conversion, and the argument size is the full size_t (it still won't actually copy more than 4GB in one go, but there's also no reason to silently truncate the size argument in the caller). Finally, use this now sanely named function in the NTB code, which mis-used a user copy version (with STAC/CLAC and all) of this interface despite it not actually being a user copy at all. Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/uaccess_64.h | 5 +++-- arch/x86/lib/copy_user_uncached_64.S | 6 +++--- arch/x86/lib/usercopy_64.c | 4 ++-- drivers/infiniband/sw/rdmavt/qp.c | 8 +++----- drivers/ntb/ntb_transport.c | 7 ++++--- tools/objtool/check.c | 2 +- 6 files changed, 16 insertions(+), 16 deletions(-) --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -147,7 +147,8 @@ raw_copy_to_user(void __user *dst, const return copy_user_generic((__force void *)dst, src, size); } -extern long __copy_user_nocache(void *dst, const void __user *src, unsigned size); +#define copy_to_nontemporal copy_to_nontemporal +extern size_t copy_to_nontemporal(void *dst, const void *src, size_t size); extern long __copy_user_flushcache(void *dst, const void __user *src, unsigned size); static inline int @@ -157,7 +158,7 @@ __copy_from_user_inatomic_nocache(void * long ret; kasan_check_write(dst, size); stac(); - ret = __copy_user_nocache(dst, src, size); + ret = copy_to_nontemporal(dst, (__force const void *)src, size); clac(); return ret; } --- a/arch/x86/lib/copy_user_uncached_64.S +++ b/arch/x86/lib/copy_user_uncached_64.S @@ -27,7 +27,7 @@ * Output: * rax uncopied bytes or 0 if successful. */ -SYM_FUNC_START(__copy_user_nocache) +SYM_FUNC_START(copy_to_nontemporal) ANNOTATE_NOENDBR /* If destination is not 7-byte aligned, we'll have to align it */ testb $7,%dil @@ -240,5 +240,5 @@ _ASM_EXTABLE_UA(95b, .Ldone) _ASM_EXTABLE_UA(52b, .Ldone0) _ASM_EXTABLE_UA(53b, .Ldone0) -SYM_FUNC_END(__copy_user_nocache) -EXPORT_SYMBOL(__copy_user_nocache) +SYM_FUNC_END(copy_to_nontemporal) +EXPORT_SYMBOL(copy_to_nontemporal) --- a/arch/x86/lib/usercopy_64.c +++ b/arch/x86/lib/usercopy_64.c @@ -49,11 +49,11 @@ long __copy_user_flushcache(void *dst, c long rc; stac(); - rc = __copy_user_nocache(dst, src, size); + rc = copy_to_nontemporal(dst, (__force const void *)src, size); clac(); /* - * __copy_user_nocache() uses non-temporal stores for the bulk + * copy_to_nontemporal() uses non-temporal stores for the bulk * of the transfer, but we need to manually flush if the * transfer is unaligned. A cached memory copy is used when * destination or size is not naturally aligned. That is: --- a/drivers/infiniband/sw/rdmavt/qp.c +++ b/drivers/infiniband/sw/rdmavt/qp.c @@ -92,12 +92,10 @@ static int rvt_wss_llc_size(void) static void cacheless_memcpy(void *dst, void *src, size_t n) { /* - * Use the only available X64 cacheless copy. Add a __user cast - * to quiet sparse. The src agument is already in the kernel so - * there are no security issues. The extra fault recovery machinery - * is not invoked. + * Use the only available X64 cacheless copy. + * The extra fault recovery machinery is not invoked. */ - __copy_user_nocache(dst, (void __user *)src, n); + copy_to_nontemporal(dst, src, n); } void rvt_wss_exit(struct rvt_dev_info *rdi) --- a/drivers/ntb/ntb_transport.c +++ b/drivers/ntb/ntb_transport.c @@ -1779,12 +1779,13 @@ static void ntb_tx_copy_callback(void *d static void ntb_memcpy_tx_on_stack(struct ntb_queue_entry *entry, void __iomem *offset) { -#ifdef ARCH_HAS_NOCACHE_UACCESS +#ifdef copy_to_nontemporal /* * Using non-temporal mov to improve performance on non-cached - * writes, even though we aren't actually copying from user space. + * writes. This only works if __iomem is strictly memory-like, + * but that is the case on x86-64 */ - __copy_from_user_inatomic_nocache(offset, entry->buf, entry->len); + copy_to_nontemporal(offset, entry->buf, entry->len); #else memcpy_toio(offset, entry->buf, entry->len); #endif --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1301,7 +1301,7 @@ static const char *uaccess_safe_builtin[ "copy_mc_enhanced_fast_string", "rep_stos_alternative", "rep_movs_alternative", - "__copy_user_nocache", + "copy_to_nontemporal", NULL };