From: Ingo Molnar <mingo@kernel.org>
To: Herton Krzesinski <hkrzesin@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	x86@kernel.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	linux-kernel@vger.kernel.org, olichtne@redhat.com,
	atomasov@redhat.com, aokuliar@redhat.com
Subject: Re: [PATCH] x86: add back the alignment of the destination to 8 bytes in copy_user_generic()
Date: Sun, 16 Mar 2025 11:58:35 +0100
Message-ID: <Z9au20vtMSXCbdXu@gmail.com>
In-Reply-To: <CAJmZWFFVL++yU1XJLkXSck=GRQXiim16xVSvdxjq1k=c=Aaiqg@mail.gmail.com>


* Herton Krzesinski <hkrzesin@redhat.com> wrote:

> On Fri, Mar 14, 2025 at 4:06 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Fri, 14 Mar 2025 at 07:53, Herton R. Krzesinski <herton@redhat.com> wrote:
> > >
> > > --- a/arch/x86/include/asm/uaccess_64.h
> > > +++ b/arch/x86/include/asm/uaccess_64.h
> > > @@ -130,7 +130,7 @@ copy_user_generic(void *to, const void *from, unsigned long len)
> > >                 "2:\n"
> > >                 _ASM_EXTABLE_UA(1b, 2b)
> > >                 :"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
> > > -               : : "memory", "rax");
> > > +               : : "memory", "rax", "rdx", "r8");
> >
> > Please don't penalize the caller with the extra clobbers.
> >
> > Maybe it doesn't matter - these functions are marked always_inline,
> > but they aren't inlined in very many places and maybe those places
> > have registers to spare - but let's not penalize the FSRM case anyway.
> >
> > And we do call it "rep_movs_alternative", so let's keep it close to
> > "rep movs" semantics (yes, we already clobber %rax, but let's not make
> > it worse).
> >
> > As to the actual change to rep_movs - that should be done differently
> > too. In particular, I doubt it makes any sense to try to align the
> > destination for small writes or for the ERMS case when we use 'rep
> > movsb', so I think this should all go into just the ".Llarge_movsq"
> > case.
> >
> > .. and then the patch can be further optimized to just do the first -
> > possibly unaligned - destination word unconditionally, and then
> > updating the addresses and counts to make the rest be aligned.
> >
> > Something ENTIRELY UNTESTED like this, in other words. And I wrote it
> > so that it doesn't need any new temporary registers, so no need for
> > clobbers or for some save/restore code.
> >
> > NOTE! The patch below is very intentionally whitespace-damaged.
> > Anybody who applies this needs to look at it very carefully, because I
> > just threw this together with zero testing and only very limited
> > thought.
> >
> > But if it works, and if it actually improves performance, I think it
> > might be a fairly minimal approach to "align destination".
> 
> It does look good in my testing here. I built the same kernel I
> was using to test the original patch (based on 6.14-rc6), and
> this is the result from one of the runs on the same machine:
> 
>              CPU      RATE          SYS          TIME     sender-receiver
> Server bind   19: 20.8Gbits/sec 14.832313000 20.863476111 75.4%-89.2%
> Server bind   21: 18.0Gbits/sec 18.705221000 23.996913032 80.8%-89.7%
> Server bind   23: 20.1Gbits/sec 15.331761000 21.536657212 75.0%-89.7%
> Server bind none: 24.1Gbits/sec 14.164226000 18.043132731 82.3%-87.1%
> 
> There is still some variation between runs, which is expected:
> I saw the same when testing my patch and in the unaligned
> case. But it is consistently better/higher than the no-align
> case. It really looks like aligning only for copies of 64 bytes
> or more is sufficient.

Mind sending a v2 patch with a changelog and these benchmark numbers 
added in, and perhaps a Co-developed-by tag with Linus or so?

Thanks,

	Ingo
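
For readers following the quoted discussion, the "do the first, possibly
unaligned, destination word unconditionally, then update the addresses
and counts" idea can be sketched in plain user-space C. This is only an
illustration under stated assumptions, not the patch from the thread:
the function name copy_align_dst() is hypothetical, memcpy() stands in
for both the 8-byte move and the aligned bulk copy ('rep movsq' plus
tail in the kernel), and the caller is assumed to pass at least 8 bytes
(the thread discusses doing this only for copies of 64 bytes or more).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space sketch, NOT kernel code.
 * Copies len bytes from src to dst so that the bulk of the copy
 * writes to an 8-byte-aligned destination. Assumes len >= 8.
 */
void copy_align_dst(unsigned char *dst, const unsigned char *src, size_t len)
{
    uint64_t first;
    size_t skip;

    assert(len >= 8);

    /* 1. Copy the first 8 bytes unconditionally; dst may be unaligned. */
    memcpy(&first, src, sizeof(first));
    memcpy(dst, &first, sizeof(first));

    /*
     * 2. Distance from dst to the next 8-byte boundary strictly above
     *    it: a value in 1..8. Every byte in that range was already
     *    written by step 1, so it is safe to skip ahead.
     */
    skip = 8 - ((uintptr_t)dst & 7);

    /* 3. Adjust both pointers and the count by the same amount. */
    dst += skip;
    src += skip;
    len -= skip;

    /* 4. dst is now 8-byte aligned; copy the remainder in bulk. */
    memcpy(dst, src, len);
}
```

When dst is already aligned, skip is 8 and step 1 has simply copied the
first full word. Otherwise some bytes are written twice with identical
data, which is harmless and avoids a branch. No extra temporaries are
needed beyond one scratch word, which matches the stated goal of not
adding clobbers.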
