Re: [PATCH] x86: only use ERMS for user copies for larger sizes

linux-kernel.vger.kernel.org archive mirror

From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: pabeni@redhat.com, Jens Axboe <axboe@kernel.dk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	bp@alien8.de, Peter Anvin <hpa@zytor.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lutomirski <luto@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	dvlasenk@redhat.com, brgerst@gmail.com,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: only use ERMS for user copies for larger sizes
Date: Thu, 22 Nov 2018 11:32:31 +0100
Message-ID: <20181122103231.GA102790@gmail.com>
In-Reply-To: <CAHk-=wiDLq5CVnTJhFNEEduqYP_VSsQPSNvi=w6a8Y9figuh7g@mail.gmail.com>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, Nov 21, 2018 at 10:16 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > It might be interesting to just change raw_copy_to/from_user() to
> > handle a lot more cases (in particular, handle cases where 'size' is
> > 8-byte aligned). The special cases we *do* have may not be the right
> > ones (the 10-byte case in particular looks odd).
> >
> > For example, instead of having a "if constant size is 8 bytes, do one
> > get/put_user()" case, we might have a "if constant size is < 64 just
> > unroll it into get/put_user()" calls.
> 
> Actually, x86 doesn't even set INLINE_COPY_TO_USER, so I don't think
> the constant size cases ever trigger at all the way they are set up
> now.

Side note, there's one artifact the patch exposes: some of the
__builtin_constant_p() checks are imprecise and don't trigger at the
early stage where GCC checks them, even though the length is actually
known to the compiler at later optimization stages.
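
To illustrate the pattern, here is a minimal userspace sketch (not code
from the patch). copy_sketch() and the two counters are hypothetical
names, and memcpy() stands in for the real user copy. Which branch is
taken for a given call depends on the compiler and optimization level,
which is exactly the imprecision described above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counters, only here so the chosen path is observable. */
static unsigned long fast_path_hits, slow_path_hits;

/*
 * Userspace stand-in for a copy helper that special-cases lengths the
 * compiler can prove constant. memcpy() replaces the real user copy.
 */
static inline __attribute__((always_inline))
void copy_sketch(void *dst, const void *src, size_t len)
{
	if (__builtin_constant_p(len) && len == 8) {
		/* Length seen as a compile-time constant: fixed 8-byte copy. */
		fast_path_hits++;
		memcpy(dst, src, 8);
		return;
	}
	/* Length not (or not yet) known to be constant: generic path. */
	slow_path_hits++;
	memcpy(dst, src, len);
}
```

Both branches copy the same bytes, so correctness never depends on what
__builtin_constant_p() returns; only the generated code differs.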

This means that with Jens's patch some of the length checks go away. I
checked x86-64 defconfig and a distro config, and the numbers were ~7%
and 10%, so not a big effect.

The kernel text size reduction with Jens's patch is small but real
(226 bytes of text):

 text		data		bss		dec		hex	filename
 19572694	11516934	19873888	50963516	309a43c	vmlinux.before
 19572468	11516934	19873888	50963290	309a35a	vmlinux.after

But I checked the disassembly, and it's not a real win: the new code is
actually more complex than the old one, as expected, but GCC (7.3.0) also
does some particularly stupid things that bloat the generated code.

> I do have a random patch that makes "unsafe_put_user()" actually use
> "asm goto" for the error case, and that, together with the attached
> patch seems to generate fairly nice code, but even then it would
> depend on gcc actually unrolling things (which we do *not* want in
> general).
> 
> But for a 32-byte user copy (cp_old_stat), and that
> INLINE_COPY_TO_USER, it generates this:
> 
>         stac
>         movl    $32, %edx       #, size
>         movq    %rsp, %rax      #, src
> .L201:
>         movq    (%rax), %rcx    # MEM[base: src_155, offset: 0B],
> MEM[base: src_155, offset: 0B]
> 1:      movq %rcx,0(%rbp)       # MEM[base: src_155, offset: 0B],
> MEM[(struct __large_struct *)dst_156]
> ASM_EXTABLE_HANDLE from=1b to=.L200 handler="ex_handler_uaccess"        #
> 
>         addq    $8, %rax        #, src
>         addq    $8, %rbp        #, statbuf
>         subq    $8, %rdx        #, size
>         jne     .L201   #,
>         clac
> 
> which is actually fairly close to "optimal".

Yeah, that looks pretty sweet!

> Random patch (with my "asm goto" hack included) attached, in case
> people want to play with it.

Doesn't even look all that hacky to me. Any hack in it that I didn't 
notice? :-)

The only question is the inlining overhead - will try to measure that.

> Impressively, it actually removes more lines of code than it adds. But
> I didn't actually check whether the end result *works*, so hey..

Most of the linecount reduction appears to come from the simplification 
of the unroll loop and moving it into C, from a 6-way hard-coded copy 
routine:

> -	switch (size) {
> -	case 1:
> -	case 2:
> -	case 4:
> -	case 8:
> -	case 10:
> -	case 16:

to a more flexible 4-way loop unrolling:

> +	while (size >= sizeof(unsigned long)) {
> +	while (size >= sizeof(unsigned int)) {
> +	while (size >= sizeof(unsigned short)) {
> +	while (size >= sizeof(unsigned char)) {

Which is a nice improvement in itself.
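
A rough userspace sketch of that loop structure, for illustration only:
COPY_CHUNK() and copy_by_width() are hypothetical names, and memcpy()
stands in for the per-word unsafe_put_user() store the real patch would
use. Each loop drains as many units of its width as still fit, then
hands the remainder to the next narrower width:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy objects of 'type' while enough bytes
 * remain, advancing both pointers. In the kernel this step would be a
 * user-access store with a fault label, not a plain memcpy().
 */
#define COPY_CHUNK(dst, src, size, type)		\
	while ((size) >= sizeof(type)) {		\
		memcpy((dst), (src), sizeof(type));	\
		(dst) += sizeof(type);			\
		(src) += sizeof(type);			\
		(size) -= sizeof(type);			\
	}

static void copy_by_width(void *to, const void *from, size_t size)
{
	unsigned char *dst = to;
	const unsigned char *src = from;

	/* Widest unit first, then progressively narrower tails. */
	COPY_CHUNK(dst, src, size, unsigned long);
	COPY_CHUNK(dst, src, size, unsigned int);
	COPY_CHUNK(dst, src, size, unsigned short);
	COPY_CHUNK(dst, src, size, unsigned char);
}
```

With a compile-time constant size the compiler can in principle unroll
these loops fully, which is what would replace the hard-coded switch
cases; whether it actually does so is up to the optimizer.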

> +	user_access_begin();
> +	if (dirent)
> +		unsafe_put_user(offset, &dirent->d_off, efault_end);
>  	dirent = buf->current_dir;
> +	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
> +	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
> +	unsafe_put_user(0, dirent->d_name + namlen, efault_end);
> +	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
> +	user_access_end();
> +
>  	if (copy_to_user(dirent->d_name, name, namlen))
>  		goto efault;
>  	buf->previous = dirent;
>  	dirent = (void __user *)dirent + reclen;
>  	buf->current_dir = dirent;
>  	buf->count -= reclen;
>  	return 0;
> +efault_end:
> +	user_access_end();
>  efault:
>  	buf->error = -EFAULT;
>  	return -EFAULT;

In terms of high level APIs, could we perhaps use the opportunity to 
introduce unsafe_write_user() instead, which would allow us to write it 
as:

	unsafe_write_user(&dirent->d_ino, d_ino, efault_end);
	unsafe_write_user(&dirent->d_reclen, reclen, efault_end);
	unsafe_write_user(dirent->d_name + namlen, 0, efault_end);
	unsafe_write_user((char __user *)dirent + reclen - 1, d_type, efault_end);

	if (copy_to_user(dirent->d_name, name, namlen))
		goto efault;

This gives it the regular 'VAR = VAL;' notation of C assignments, instead
of the weird historical reverse notation that put_user()/get_user() use.

Note how this newfangled ordering now matches the natural C-assignment
parameter order of the copy_to_user() call that comes straight
afterwards, and makes it obvious that the 'dirent->d_name + namlen'
write was storing the delimiter at the end.
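
For the sake of discussion, a minimal userspace mock of what such a
wrapper could look like. Everything here is an assumption for
illustration: unsafe_write_user() does not exist today, the mocked
unsafe_put_user() treats a NULL pointer as a "fault" instead of using
exception tables, and struct dirent_sketch / fill_sketch() are made-up
names:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace mock of unsafe_put_user(value, pointer, label): a NULL
 * pointer plays the role of a faulting user address. The real kernel
 * macro relies on exception handling instead of an explicit check.
 */
#define unsafe_put_user(x, ptr, err_label)	\
	do {					\
		if (!(ptr))			\
			goto err_label;		\
		*(ptr) = (x);			\
	} while (0)

/* Hypothetical wrapper: destination first, value second. */
#define unsafe_write_user(ptr, x, err_label)	\
	unsafe_put_user(x, ptr, err_label)

struct dirent_sketch {
	unsigned long d_ino;
	unsigned short d_reclen;
};

/* Returns 0 on success, -1 when a store "faults". */
static int fill_sketch(struct dirent_sketch *d, unsigned long ino,
		       unsigned short reclen)
{
	unsafe_write_user(d ? &d->d_ino : NULL, ino, efault);
	unsafe_write_user(d ? &d->d_reclen : NULL, reclen, efault);
	return 0;
efault:
	return -1;
}
```

The wrapper only swaps the first two arguments, so it could sit on top
of the existing macro without touching the arch code.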

I think we even had bugs from put_user() ordering mixups?

Or is it too late to try to fix this particular mistake?

Thanks,

	Ingo
