From: Al Viro <viro@ZenIV.linux.org.uk>
To: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Richard Henderson <rth@twiddle.net>,
	Will Deacon <will.deacon@arm.com>,
	Haavard Skinnemoen <hskinnemoen@gmail.com>,
	Vineet Gupta <vgupta@synopsys.com>,
	Steven Miao <realmz6@gmail.com>,
	Jesper Nilsson <jesper.nilsson@axis.com>,
	Mark Salter <msalter@redhat.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Richard Kuo <rkuo@codeaurora.org>,
	Tony Luck <tony.luck@intel.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	James Hogan <james.hogan@imgtec.com>,
	Michal Simek <monstr@monstr.eu>,
	David Howells <dhowells@redhat.com>,
	Ley Foon Tan <lftan@altera.com>,
	Jonas Bonn <jonas@southpole.se>,
	Helge Deller <deller@gmx.de>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Ralf Baechle <ralf@linux-mips.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Chen Liqin <liqin.linux@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Richard Weinberger <richard@nod.at>,
	Guan Xuetao <gxt@mprc.pku.edu.cn>,
	Thomas Gleixner <tglx@linutronix.de>,
	Chris Zankel <chris@zankel.net>
Subject: Re: [RFC][CFT][PATCHSET v1] uaccess unification
Date: Thu, 30 Mar 2017 17:43:42 +0100
Message-ID: <20170330164342.GR29622@ZenIV.linux.org.uk>
In-Reply-To: <20170330162241.GG7909@n2100.armlinux.org.uk>

On Thu, Mar 30, 2017 at 05:22:41PM +0100, Russell King - ARM Linux wrote:
> On Wed, Mar 29, 2017 at 06:57:06AM +0100, Al Viro wrote:
> > Comments, review, testing, replacement patches, etc. are very welcome.
>
> I've given this a spin, and it appears to work (in that the box boots).
>
> Kernel size wise:
>
>    text    data      bss      dec     hex filename
> 8020229 3014220 10243276 21277725 144ac1d vmlinux.orig
> 8034741 3014388 10243276 21292405 144e575 vmlinux.uaccess
> 7976719 3014324 10243276 21234319 144028f vmlinux.noinline
>
> Performance using hdparm -T (cached reads) to evaluate against a SSD
> gives me the following results:
>
> * original:
>  Timing cached reads: 580 MB in 2.00 seconds = 289.64 MB/sec
>  Timing cached reads: 580 MB in 2.00 seconds = 290.06 MB/sec
>  Timing cached reads: 580 MB in 2.00 seconds = 289.65 MB/sec
>  Timing cached reads: 582 MB in 2.00 seconds = 290.82 MB/sec
>  Timing cached reads: 578 MB in 2.00 seconds = 289.07 MB/sec
>
>  Average = 289.85MB/s
>
> * uaccess:
>  Timing cached reads: 578 MB in 2.00 seconds = 288.36 MB/sec
>  Timing cached reads: 534 MB in 2.00 seconds = 266.68 MB/sec
>  Timing cached reads: 534 MB in 2.00 seconds = 267.07 MB/sec
>  Timing cached reads: 552 MB in 2.00 seconds = 275.45 MB/sec
>  Timing cached reads: 532 MB in 2.00 seconds = 266.08 MB/sec
>
>  Average = 272.73 MB/sec
>
> * noinline:
>  Timing cached reads: 548 MB in 2.00 seconds = 274.16 MB/sec
>  Timing cached reads: 574 MB in 2.00 seconds = 287.19 MB/sec
>  Timing cached reads: 574 MB in 2.00 seconds = 286.47 MB/sec
>  Timing cached reads: 572 MB in 2.00 seconds = 286.20 MB/sec
>  Timing cached reads: 578 MB in 2.00 seconds = 288.86 MB/sec
>
>  Average = 284.58 MB/sec
>
> I've run the test twice, and there's definitely a reproducable drop in
> performance for some reason when switching between current and Al's
> uaccess patches, which is partly recovered by switching to the out of
> line versions.
>
> The only difference that I can identify that could explain this are
> the extra might_fault() checks in Al's version but which are missing
> from the ARM version.

How would the following affect things?

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index e68604ae3ced..d24d338f0682 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -184,7 +184,7 @@ static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t b
 
 	kaddr = kmap(page);
 	from = kaddr + offset;
-	left = __copy_to_user(buf, from, copy);
+	left = __copy_to_user_inatomic(buf, from, copy);
 	copy -= left;
 	skip += copy;
 	from += copy;
@@ -193,7 +193,7 @@ static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t b
 		iov++;
 		buf = iov->iov_base;
 		copy = min(bytes, iov->iov_len);
-		left = __copy_to_user(buf, from, copy);
+		left = __copy_to_user_inatomic(buf, from, copy);
 		copy -= left;
 		skip = copy;
 		from += copy;
@@ -267,7 +267,7 @@ static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t
 
 	kaddr = kmap(page);
 	to = kaddr + offset;
-	left = __copy_from_user(to, buf, copy);
+	left = __copy_from_user_inatomic(to, buf, copy);
 	copy -= left;
 	skip += copy;
 	to += copy;
@@ -276,7 +276,7 @@ static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t
 		iov++;
 		buf = iov->iov_base;
 		copy = min(bytes, iov->iov_len);
-		left = __copy_from_user(to, buf, copy);
+		left = __copy_from_user_inatomic(to, buf, copy);
 		copy -= left;
 		skip = copy;
 		to += copy;
@@ -541,7 +541,7 @@ size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 	if (unlikely(i->type & ITER_PIPE))
 		return copy_pipe_to_iter(addr, bytes, i);
 	iterate_and_advance(i, bytes, v,
-		__copy_to_user(v.iov_base, (from += v.iov_len) - v.iov_len,
+		__copy_to_user_inatomic(v.iov_base, (from += v.iov_len) - v.iov_len,
 			       v.iov_len),
 		memcpy_to_page(v.bv_page, v.bv_offset,
 			       (from += v.bv_len) - v.bv_len, v.bv_len),
@@ -560,7 +560,7 @@ size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v,
-		__copy_from_user((to += v.iov_len) - v.iov_len, v.iov_base,
+		__copy_from_user_inatomic((to += v.iov_len) - v.iov_len, v.iov_base,
 				 v.iov_len),
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
 				 v.bv_offset, v.bv_len),
@@ -582,7 +582,7 @@ bool copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 		return false;
 
 	iterate_all_kinds(i, bytes, v, ({
-		if (__copy_from_user((to += v.iov_len) - v.iov_len,
+		if (__copy_from_user_inatomic((to += v.iov_len) - v.iov_len,
 				      v.iov_base, v.iov_len))
 			return false;
 		0;}),
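For readers following along, the idea behind the patch above can be sketched in plain userspace C. This is only a model, not kernel code: every model_* name is hypothetical, the counter merely stands in for whatever might_fault() costs on a given config, and the assumption (taken from the discussion above, not checked against any particular tree) is that the non-atomic helper differs from the _inatomic one only by that extra check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the work done by might_fault(). */
static unsigned long model_might_fault_calls;

static void model_might_fault(void)
{
	model_might_fault_calls++;
}

/*
 * Models the _inatomic flavour: only the raw copy.
 * Returns the number of bytes NOT copied, as the real helpers do;
 * this userspace model can never fault, so that is always 0.
 */
static size_t model_copy_to_user_inatomic(void *to, const void *from, size_t n)
{
	memcpy(to, from, n);
	return 0;
}

/*
 * Models the ordinary flavour under the assumption stated above:
 * the same copy, preceded by the might_fault() check.
 */
static size_t model_copy_to_user(void *to, const void *from, size_t n)
{
	model_might_fault();
	return model_copy_to_user_inatomic(to, from, n);
}
```

Under that assumption, switching the iov_iter callers to the _inatomic variants removes one check per copy and leaves the copy itself unchanged, so rerunning the hdparm test with the patch applied would show whether the checks account for the slowdown.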