linux-kernel.vger.kernel.org archive mirror
From: Paolo Abeni <pabeni@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: x86@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Kees Cook <keescook@chromium.org>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [PATCH] x86/uaccess: use unrolled string copy for short strings
Date: Thu, 22 Jun 2017 19:02:47 +0200
Message-ID: <1498150967.2503.4.camel@redhat.com>
In-Reply-To: <20170622084732.xjnd5gx77ftaozem@gmail.com>

On Thu, 2017-06-22 at 10:47 +0200, Ingo Molnar wrote:
> * Paolo Abeni <pabeni@redhat.com> wrote:
> 
> > The 'rep' prefix incurs a significant "setup cost"; as a result,
> > for short strings, copies with unrolled loops are faster than even
> > the optimized string copy using the 'rep' variant.
> > 
> > This change updates __copy_user_generic() to use the unrolled
> > version for short strings. The threshold length for short
> > strings - 64 - was selected empirically as the
> > largest value that still ensures a measurable gain.
> > 
> > A micro-benchmark of __copy_from_user() with different lengths shows
> > the following:
> > 
> > string len	vanilla		patched 	delta
> > bytes		ticks		ticks		tick(%)
> > 
> > 0		58		26		32(55%)
> > 1		49		29		20(40%)
> > 2		49		31		18(36%)
> > 3		49		32		17(34%)
> > 4		50		34		16(32%)
> > 5		49		35		14(28%)
> > 6		49		36		13(26%)
> > 7		49		38		11(22%)
> > 8		50		31		19(38%)
> > 9		51		33		18(35%)
> > 10		52		36		16(30%)
> > 11		52		37		15(28%)
> > 12		52		38		14(26%)
> > 13		52		40		12(23%)
> > 14		52		41		11(21%)
> > 15		52		42		10(19%)
> > 16		51		34		17(33%)
> > 17		51		35		16(31%)
> > 18		52		37		15(28%)
> > 19		51		38		13(25%)
> > 20		52		39		13(25%)
> > 21		52		40		12(23%)
> > 22		51		42		9(17%)
> > 23		51		46		5(9%)
> > 24		52		35		17(32%)
> > 25		52		37		15(28%)
> > 26		52		38		14(26%)
> > 27		52		39		13(25%)
> > 28		52		40		12(23%)
> > 29		53		42		11(20%)
> > 30		52		43		9(17%)
> > 31		52		44		8(15%)
> > 32		51		36		15(29%)
> > 33		51		38		13(25%)
> > 34		51		39		12(23%)
> > 35		51		41		10(19%)
> > 36		52		41		11(21%)
> > 37		52		43		9(17%)
> > 38		51		44		7(13%)
> > 39		52		46		6(11%)
> > 40		51		37		14(27%)
> > 41		50		38		12(24%)
> > 42		50		39		11(22%)
> > 43		50		40		10(20%)
> > 44		50		42		8(16%)
> > 45		50		43		7(14%)
> > 46		50		43		7(14%)
> > 47		50		45		5(10%)
> > 48		50		37		13(26%)
> > 49		49		38		11(22%)
> > 50		50		40		10(20%)
> > 51		50		42		8(16%)
> > 52		50		42		8(16%)
> > 53		49		46		3(6%)
> > 54		50		46		4(8%)
> > 55		49		48		1(2%)
> > 56		50		39		11(22%)
> > 57		50		40		10(20%)
> > 58		49		42		7(14%)
> > 59		50		42		8(16%)
> > 60		50		46		4(8%)
> > 61		50		47		3(6%)
> > 62		50		48		2(4%)
> > 63		50		48		2(4%)
> > 64		51		38		13(25%)
> > 
> > Above 64 bytes the gain fades away.
> > 
> > Very similar values are collected for __copy_to_user().
> > UDP receive performance under flood with small packets using recvfrom()
> > increases by ~5%.
> 
> What CPU model(s) were used for the performance testing and was it performance 
> tested on several different types of CPUs?
> 
> Please add a comment here:
> 
> +       if (len <= 64)
> +               return copy_user_generic_unrolled(to, from, len);
> +
> 
> ... because it's not obvious at all that this is a performance optimization, not a 
> correctness issue. Also explain that '64' is a number that we got from performance 
> measurements.
> 
> But in general I like the change - as long as it was measured on reasonably modern 
> x86 CPUs. I.e. it should not regress on modern Intel or AMD CPUs.

Thank you for reviewing this.

I'll add a (hopefully) descriptive comment in v2.
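
As a rough illustration of the kind of comment being asked for, here is
a user-space sketch, not the v2 patch itself. The names
SHORT_COPY_THRESHOLD, copy_unrolled_sketch(), copy_rep_sketch() and
copy_dispatch_sketch() are hypothetical stand-ins for the kernel
routines, and plain memcpy() takes the place of the real 'rep'-based
copy. Like the kernel helpers, the functions return the number of bytes
left uncopied, which is always 0 here since nothing can fault.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; 64 is the threshold discussed in this thread. */
#define SHORT_COPY_THRESHOLD 64

/*
 * User-space stand-in for copy_user_generic_unrolled(): copies 8 bytes
 * per iteration, then the remaining tail byte by byte.
 * Returns the number of bytes not copied (always 0 in this sketch).
 */
static unsigned long copy_unrolled_sketch(void *to, const void *from,
					  unsigned long len)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (len >= 8) {
		memcpy(d, s, 8);
		d += 8;
		s += 8;
		len -= 8;
	}
	while (len > 0) {
		*d++ = *s++;
		len--;
	}
	return 0;
}

/* Stand-in for the 'rep'-based variant; memcpy() is an assumption. */
static unsigned long copy_rep_sketch(void *to, const void *from,
				     unsigned long len)
{
	memcpy(to, from, len);
	return 0;
}

static unsigned long copy_dispatch_sketch(void *to, const void *from,
					  unsigned long len)
{
	/*
	 * Performance optimization, not a correctness fix: the 'rep'
	 * prefix has a noticeable setup cost, so for short copies the
	 * unrolled loop wins. The 64-byte threshold comes from
	 * measurements, as the largest value that still showed a
	 * measurable gain.
	 */
	if (len <= SHORT_COPY_THRESHOLD)
		return copy_unrolled_sketch(to, from, len);

	return copy_rep_sketch(to, from, len);
}
```

Both paths produce the same bytes in the destination; only the cost
differs, which is the point the comment has to make.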

The above figures are for an Intel Xeon E5-2690 v4.

I see similar data points with an i7-6500U CPU, while an i7-4810MQ
shows slightly better improvements. 
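
For anyone wanting to get a rough feel for the per-length cost on other
CPUs, a user-space sketch of this kind of measurement follows. It is
not the harness used for the figures above: it times plain memcpy()
with clock() and reports nanoseconds per copy rather than ticks of
__copy_from_user(). The names bench_copy_ns() and report_lengths() are
made up for illustration, and the empty asm statement is a GCC/Clang
compiler barrier, so a GCC-compatible compiler is assumed.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Average cost, in nanoseconds of CPU time, of one len-byte memcpy()
 * over 'iters' iterations. Lengths above the buffer size are clamped.
 */
static double bench_copy_ns(size_t len, unsigned long iters)
{
	static unsigned char src[256], dst[256];
	clock_t start, end;
	unsigned long i;

	if (iters == 0)
		return 0.0;
	if (len > sizeof(src))
		len = sizeof(src);

	start = clock();
	for (i = 0; i < iters; i++) {
		memcpy(dst, src, len);
		/* Compiler barrier: keep the copy from being optimized away. */
		__asm__ __volatile__("" : : "r"(dst) : "memory");
	}
	end = clock();

	return (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC /
	       (double)iters;
}

/* Print one line per length, from 0 to max_len inclusive. */
static void report_lengths(size_t max_len, unsigned long iters)
{
	size_t len;

	printf("len\tns/copy\n");
	for (len = 0; len <= max_len; len++)
		printf("%zu\t%.2f\n", len, bench_copy_ns(len, iters));
}
```

Calling report_lengths(64, 10000000) covers the same range of lengths
as the table above; the absolute numbers are not comparable with the
tick counts, only the shape of the curve is.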

I'm in the process of collecting more figures for AMD processors, which
I don't have at hand, so it may take some time.

Thanks,

Paolo