linux-arch.vger.kernel.org archive mirror
From: Russell King - ARM Linux <linux@armlinux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Richard Henderson <rth@twiddle.net>,
	Will Deacon <will.deacon@arm.com>,
	Haavard Skinnemoen <hskinnemoen@gmail.com>,
	Steven Miao <realmz6@gmail.com>,
	Jesper Nilsson <jesper.nilsson@axis.com>,
	Mark Salter <msalter@redhat.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Richard Kuo <rkuo@codeaurora.org>,
	Tony Luck <tony.luck@intel.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	James Hogan <james.hogan@imgtec.com>,
	Michal Simek <monstr@monstr.eu>,
	David Howells <dhowells@redhat.com>,
	Ley Foon Tan <lftan@altera.com>,
	Jonas Bonn <Jonas.Nilsson@synopsys.com>
Subject: Re: [RFC][CFT][PATCHSET v1] uaccess unification
Date: Fri, 31 Mar 2017 00:21:47 +0100
Message-ID: <20170330232147.GL7909@n2100.armlinux.org.uk>
In-Reply-To: <CA+55aFyQL75SOyx=zn1zWvy+TS-Ockv=O9Q59b_ZQwSeCh7WnQ@mail.gmail.com>

On Thu, Mar 30, 2017 at 01:59:58PM -0700, Linus Torvalds wrote:
> On Thu, Mar 30, 2017 at 1:40 PM, Vineet Gupta
> <Vineet.Gupta1@synopsys.com> wrote:
> >
> > So it's a mix bag really. Maybe we need some better directed test to really drill
> > it down.
> 
> As mentioned in the discussion about ARM, I seriously doubt that the
> inlining will even be noticeable compared to other effects here.

(Sorry to switch sub-threads.)

I'm running tests on that point, concentrating on hdparm -T and
profiling it with perf.  You're right insofar as perf identifies the
hotspot as the copy_to_user() function for that workload, rather than
the inlined bits.  The top hits in perf for hdparm -T are:

+   66.52%  hdparm  [k] __copy_to_user_std
+    8.49%  hdparm  [k] generic_file_read_iter
+    3.82%  hdparm  [k] lock_acquire
+    2.80%  hdparm  [k] copy_page_to_iter
+    2.49%  hdparm  [k] find_get_entry
+    1.19%  hdparm  [k] lock_release

Note: perf on ARM is affected by IRQ-disabled regions, so hotspots
can be off.

The generic_file_read_iter() entry is definitely affected by an
IRQ-disabled region in there.

Here are the average hdparm -T transfer rates and standard deviations
over 20 samples:

Unpatched:        Average=320.42 MB/s sigma=0.878657
Uaccess+inline:   Average=318.77 MB/s sigma=1.003332
Uaccess+noinline: Average=319.40 MB/s sigma=1.088354

The noinline version sitting between the inlined version and the
unpatched version seems to be a pattern in all the measurements I've
done so far, and it points to inlining that code having a slight
detrimental effect.  What we don't know is whether uninlining the code
without Al's patch would see a slight boost, but I'm not about to go
there.

However, this all points towards there being a very slight advantage
to dropping the INLINE_COPY_TO_USER and INLINE_COPY_FROM_USER for
ARM, but I'd say it's really down in the noise - I'm not concerned.
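
To put the three averages above on a common scale, here is a minimal Python sketch that expresses each configuration as a relative slowdown against the unpatched kernel. The means and sigmas are copied from the table above; the names `results` and `slowdown_percent` are made up for this illustration and are not part of any tool used in the thread.

```python
# Figures copied from the measurements above (MB/s, 20 samples each).
# Each entry is (mean, standard deviation).
results = {
    "Unpatched":        (320.42, 0.878657),
    "Uaccess+inline":   (318.77, 1.003332),
    "Uaccess+noinline": (319.40, 1.088354),
}

base_mean, base_sigma = results["Unpatched"]


def slowdown_percent(mean, baseline=base_mean):
    """Relative drop in throughput versus the baseline, in percent."""
    return (baseline - mean) / baseline * 100.0


# Hypothetical helper table: slowdown of every configuration vs. unpatched.
slowdowns = {name: slowdown_percent(mean) for name, (mean, _) in results.items()}

for name, (mean, sigma) in results.items():
    print(f"{name:17s} mean={mean:.2f} MB/s  sigma={sigma:.2f}  "
          f"slowdown={slowdowns[name]:.2f}%")
```

With these numbers, both patched variants come out within roughly half a percent of the unpatched kernel, and the noinline variant loses less than the inlined one.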

> (On ARM, hopefully the UAO bit is faster to set, but it's still
> "another instruction before and after", so even if it's not as
> expensive as clac/stac are on current x86 chips, it's an argument
> against inlining)

The UAO set/clear does show up as a hotspot within copy_page_to_iter(),
but as we can see, overall it's about 3% of the workload.  Within
copy_page_to_iter(), it's the __put_user() based loop inside
fault_in_pages_writeable() which has the hotspot, due to the repeated
enable+disable sequence (or rather, the instruction barriers that we
need).

Perf reports that the barriers account for 8.33% and 17.59% of the
time spent within that function, so we're actually talking about
maybe 0.25% and 0.5% of this workload spent doing the UAO thing.
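
The estimate above is a product of two percentages. A minimal Python sketch of that arithmetic follows, assuming the barrier percentages are relative to the 2.80% that perf attributes to copy_page_to_iter() (the message rounds this to "about 3%"). The function name `workload_share` is made up for this illustration.

```python
def workload_share(function_percent, within_function_percent):
    """Convert a percentage within one function into a percentage of the whole workload."""
    return function_percent * within_function_percent / 100.0


# Assumption: the barrier figures are relative to copy_page_to_iter()'s share.
COPY_PAGE_TO_ITER = 2.80        # percent of the workload, from the perf output above
BARRIER_SHARES = (8.33, 17.59)  # percent within that function, as reported by perf

shares = [workload_share(COPY_PAGE_TO_ITER, b) for b in BARRIER_SHARES]

for within, share in zip(BARRIER_SHARES, shares):
    print(f"{within:.2f}% of {COPY_PAGE_TO_ITER:.2f}% = {share:.2f}% of the workload")
```

Using the unrounded 2.80% gives slightly smaller values than the rounded 3% figure, but the same order of magnitude as the estimate in the message.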

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
