Re: [RFC PATCH] iov: Bypass usercopy hardening for kernel iterators

public inbox for linux-fsdevel@vger.kernel.org

From: Chuck Lever <cel@kernel.org>
To: Kees Cook <kees@kernel.org>
Cc: viro@zeniv.linux.org.uk, gustavoars@kernel.org,
	linux-hardening@vger.kernel.org, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, netdev@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [RFC PATCH] iov: Bypass usercopy hardening for kernel iterators
Date: Wed, 25 Mar 2026 17:29:51 -0400
Message-ID: <a65dee68-6b8b-44ae-9296-7fde63322083@kernel.org>
In-Reply-To: <202603251421.20D29E29@keescook>

On 3/25/26 5:27 PM, Kees Cook wrote:
> On Tue, Mar 03, 2026 at 11:29:32AM -0500, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> Profiling NFSD under an iozone workload showed that hardened
>> usercopy checks consume roughly 1.3% of CPU in the TCP receive
>> path. The runtime check in check_object_size() validates that
>> copy buffers reside in expected slab regions, which is
>> meaningful when data crosses the user/kernel boundary but adds
>> no value when both source and destination are kernel addresses.
>>
>> Split check_copy_size() so that copy_to_iter() can bypass the
>> runtime check_object_size() call for kernel-only iterators
>> (ITER_BVEC, ITER_KVEC). Existing callers of check_copy_size()
>> are unaffected; user-backed iterators still receive the full
>> usercopy validation.
>>
>> This benefits all kernel consumers of copy_to_iter(), including
>> the TCP receive path used by the NFS client and server,
>> NVMe-TCP, and any other subsystem that uses ITER_BVEC or
>> ITER_KVEC receive buffers.
> 
> So, I'm not a big fan of this just because the whole point is to catch
> unexpected conditions, but there is a reasonable point to be made that
> kernel/kernel copies shouldn't be covered by this check.
> 
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  include/linux/ucopysize.h | 10 +++++++++-
>>  include/linux/uio.h       |  9 +++++++--
>>  2 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/ucopysize.h b/include/linux/ucopysize.h
>> index 41c2d9720466..b3eacb4869a8 100644
>> --- a/include/linux/ucopysize.h
>> +++ b/include/linux/ucopysize.h
>> @@ -42,7 +42,7 @@ static inline void copy_overflow(int size, unsigned long count)
>>  }
>>  
>>  static __always_inline __must_check bool
>> -check_copy_size(const void *addr, size_t bytes, bool is_source)
>> +check_copy_size_nosec(const void *addr, size_t bytes, bool is_source)
> 
> "nosec" is kind of ambiguous. Since this is doing the compile-time
> checks, how about naming this __compiletime_check_copy_size() or so?

No problem.


>>  {
>>  	int sz = __builtin_object_size(addr, 0);
>>  	if (unlikely(sz >= 0 && sz < bytes)) {
>> @@ -56,6 +56,14 @@ check_copy_size(const void *addr, size_t bytes, bool is_source)
>>  	}
>>  	if (WARN_ON_ONCE(bytes > INT_MAX))
>>  		return false;
>> +	return true;
>> +}
>> +
>> +static __always_inline __must_check bool
>> +check_copy_size(const void *addr, size_t bytes, bool is_source)
>> +{
>> +	if (!check_copy_size_nosec(addr, bytes, is_source))
>> +		return false;
>>  	check_object_size(addr, bytes, is_source);
>>  	return true;
>>  }
>> diff --git a/include/linux/uio.h b/include/linux/uio.h
>> index a9bc5b3067e3..f860529abfbe 100644
>> --- a/include/linux/uio.h
>> +++ b/include/linux/uio.h
>> @@ -216,8 +216,13 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset,
>>  static __always_inline __must_check
>>  size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>>  {
>> -	if (check_copy_size(addr, bytes, true))
>> -		return _copy_to_iter(addr, bytes, i);
>> +	if (user_backed_iter(i)) {
>> +		if (check_copy_size(addr, bytes, true))
>> +			return _copy_to_iter(addr, bytes, i);
>> +	} else {
>> +		if (check_copy_size_nosec(addr, bytes, true))
>> +			return _copy_to_iter(addr, bytes, i);
>> +	}
>>  	return 0;
>>  }
> 
> This seems reasonable with the renaming, though I might come back some
> day and ask that this get a boot param or something (we have a big
> hammer boot param for usercopy checking already, but I like this more
> focused check).
> 

Thanks for having a look. An additional question is whether the
"copy from" direction needs similar treatment. Performance analysis
found "copy to" was an issue for my particular workload (NFSD) but
it's plausible that "copy from" should be handled similarly.
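
To make the "copy from" question concrete, here is a minimal userspace
sketch of the same dispatch applied to the receive-into-kernel-buffer
direction. It is illustrative only and not part of the posted patch: it
assumes copy_from_iter() is structured like copy_to_iter(), and every
model_* name below is a hypothetical stand-in for the kernel helper it
imitates (user_backed_iter(), check_copy_size(), and the size-only
helper that the review suggests naming __compiletime_check_copy_size()).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these are kernel definitions. */
enum model_iter_type { MODEL_ITER_UBUF, MODEL_ITER_KVEC };

struct model_iter {
	enum model_iter_type type;
	const char *src;	/* bytes the iterator will yield */
	size_t count;		/* bytes remaining */
};

/* Counts how often the modeled runtime (check_object_size-like) path runs. */
static unsigned int model_runtime_checks;

static bool model_user_backed(const struct model_iter *i)
{
	return i->type == MODEL_ITER_UBUF;
}

/* Stand-in for the size-only half of the split check. */
static bool model_size_check(size_t bytes)
{
	return bytes <= (size_t)INT_MAX;
}

/* Stand-in for the full check: size check plus the runtime check. */
static bool model_full_check(size_t bytes)
{
	if (!model_size_check(bytes))
		return false;
	model_runtime_checks++;	/* models the check_object_size() call */
	return true;
}

/* Mirrors the copy_to_iter() change in the patch, for the other direction. */
static size_t model_copy_from_iter(void *addr, size_t bytes,
				   struct model_iter *i)
{
	bool ok;

	assert(addr && i);
	ok = model_user_backed(i) ? model_full_check(bytes)
				  : model_size_check(bytes);
	if (!ok)
		return 0;
	if (bytes > i->count)
		bytes = i->count;	/* short copy when the iterator runs dry */
	memcpy(addr, i->src, bytes);
	i->src += bytes;
	i->count -= bytes;
	return bytes;
}
```

In this model a kernel-backed iterator never reaches the runtime check,
while a user-backed one always does; whether that trade-off is worth
making for "copy from" would still need profiling data.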

-- 
Chuck Lever

Thread overview: 7+ messages
2026-03-03 16:29 [RFC PATCH] iov: Bypass usercopy hardening for kernel iterators Chuck Lever
2026-03-03 18:00 ` Matthew Wilcox
2026-03-03 19:41   ` Chuck Lever
2026-03-03 19:59     ` Matthew Wilcox
2026-03-25 17:26 ` Chuck Lever
2026-03-25 21:27 ` Kees Cook
2026-03-25 21:29   ` Chuck Lever [this message]