* [PATCH] x86/copy_user.S: Remove zero byte check before copy user buffer.
@ 2013-11-16 20:37 Fenghua Yu
From: Fenghua Yu @ 2013-11-16 20:37 UTC (permalink / raw)
To: Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Fenghua Yu
From: Fenghua Yu <fenghua.yu@intel.com>
The rep movsb instruction handles a zero byte count on its own. As Linus
pointed out, there is no need to check for a zero size in the kernel.
Removing this redundant check saves a few cycles in the copy user
functions.
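
For illustration, a minimal stand-alone sketch (not the kernel code
itself; the label is made up): rep movsb iterates %rcx times, so a zero
count simply falls through without touching memory, and the explicit
test-and-branch buys nothing.

	copy_sketch:			# hypothetical label, SysV AMD64 convention:
					# copy %edx bytes from (%rsi) to (%rdi)
		movl	%edx,%ecx	# count -> %ecx; zero-extends into %rcx
		rep
		movsb			# executes %rcx times; %rcx == 0 is a no-op
		xorl	%eax,%eax	# return 0: nothing left to copy
		ret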
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
arch/x86/lib/copy_user_64.S | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index a30ca15..ffe4eb9 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -236,8 +236,6 @@ ENDPROC(copy_user_generic_unrolled)
 ENTRY(copy_user_generic_string)
 	CFI_STARTPROC
 	ASM_STAC
-	andl %edx,%edx
-	jz 4f
 	cmpl $8,%edx
 	jb 2f /* less than 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -249,7 +247,7 @@ ENTRY(copy_user_generic_string)
 2:	movl %edx,%ecx
 3:	rep
 	movsb
-4:	xorl %eax,%eax
+	xorl %eax,%eax
 	ASM_CLAC
 	ret
@@ -279,12 +277,10 @@ ENDPROC(copy_user_generic_string)
 ENTRY(copy_user_enhanced_fast_string)
 	CFI_STARTPROC
 	ASM_STAC
-	andl %edx,%edx
-	jz 2f
 	movl %edx,%ecx
 1:	rep
 	movsb
-2:	xorl %eax,%eax
+	xorl %eax,%eax
 	ASM_CLAC
 	ret
--
1.8.0.1
* [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: tip-bot for Fenghua Yu @ 2013-11-17  6:18 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, torvalds, fenghua.yu, tglx, hpa

Commit-ID:  f4cb1cc18f364d761d5614eb6293cccc6647f259
Gitweb:     http://git.kernel.org/tip/f4cb1cc18f364d761d5614eb6293cccc6647f259
Author:     Fenghua Yu <fenghua.yu@intel.com>
AuthorDate: Sat, 16 Nov 2013 12:37:01 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Sat, 16 Nov 2013 18:00:58 -0800

x86-64, copy_user: Remove zero byte check before copy user buffer.

The rep movsb instruction handles a zero byte count on its own. As
Linus pointed out, there is no need to check for a zero size in the
kernel. Removing this redundant check saves a few cycles in the copy
user functions.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/lib/copy_user_64.S | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index a30ca15..ffe4eb9 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -236,8 +236,6 @@ ENDPROC(copy_user_generic_unrolled)
 ENTRY(copy_user_generic_string)
 	CFI_STARTPROC
 	ASM_STAC
-	andl %edx,%edx
-	jz 4f
 	cmpl $8,%edx
 	jb 2f /* less than 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -249,7 +247,7 @@ ENTRY(copy_user_generic_string)
 2:	movl %edx,%ecx
 3:	rep
 	movsb
-4:	xorl %eax,%eax
+	xorl %eax,%eax
 	ASM_CLAC
 	ret
@@ -279,12 +277,10 @@ ENDPROC(copy_user_generic_string)
 ENTRY(copy_user_enhanced_fast_string)
 	CFI_STARTPROC
 	ASM_STAC
-	andl %edx,%edx
-	jz 2f
 	movl %edx,%ecx
 1:	rep
 	movsb
-2:	xorl %eax,%eax
+	xorl %eax,%eax
 	ASM_CLAC
 	ret
[parent not found: <CA+55aFyFz_6oy8-1jb5Jzk+4VC5MLA5d0KbqhZxki0=+DmggBg@mail.gmail.com>]
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: H. Peter Anvin @ 2013-11-17  6:51 UTC (permalink / raw)
To: Linus Torvalds, Ingo Molnar
Cc: Peter Anvin, tglx, linux-tip-commits, fenghua.yu, linux-kernel

On 11/16/2013 10:44 PM, Linus Torvalds wrote:
> So this doesn't do the 32-bit truncation in the error path of the generic
> string copy. Oversight?
>
>              Linus

Indeed... although in the kernel it seems to be taken as an invariant
that copy lengths over 4G are simply prohibited. There are places all
over the kernel which will fail in a massive way if we ever ended up
with a copy over 4G in size.

As such, I would argue that the code with the patch is actually no more
broken than with the truncation in place; if anything it is *more*
correct than the modified one, since for a (very small) subset of >=4G
copies it will actually do the right thing, albeit slowly.

The truncations do make me twitch a little inside, I have to admit.

	-hpa
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: H. Peter Anvin @ 2013-11-19  4:37 UTC (permalink / raw)
To: Linus Torvalds, Ingo Molnar
Cc: Peter Anvin, tglx, linux-tip-commits, fenghua.yu, linux-kernel

On 11/16/2013 10:44 PM, Linus Torvalds wrote:
> So this doesn't do the 32-bit truncation in the error path of the generic
> string copy. Oversight?
>
>              Linus

Hi Linus,

Do you have a preference:

1. Consider the 32-bit truncation incidental (take it or leave it);
2. Require the 32-bit truncation; or
3. Get rid of it completely?

	-hpa
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: Linus Torvalds @ 2013-11-19 19:38 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Ingo Molnar, Peter Anvin, Thomas Gleixner, linux-tip-commits,
	Fenghua Yu, Linux Kernel Mailing List

On Mon, Nov 18, 2013 at 8:37 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Do you have a preference:
>
> 1. Consider the 32-bit truncation incidental (take it or leave it);
> 2. Require the 32-bit truncation; or
> 3. Get rid of it completely?

I don't have a huge preference, but I hate the current situation (with
Fenghua's patch) where it's not consistent. One path uses just 32 bits
of the count (thanks to the "mov %edx,%ecx") while another path uses
64 bits.

One or the other, but not a mixture of both.

And only tangentially related to this: I do think that we could be
stricter about the count. Make it oops if the high bits are set, rather
than overwrite a lot of memory. So I would not be averse to limiting
the count to 31 bits (or even less) explicitly, and thus making the
whole 32-vs-64 bit issue moot.

               Linus
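
To make that suggestion concrete, a hedged sketch of what an explicit
31-bit limit could look like; this guard was never posted or merged in
the thread, and the .Lbad_count label and its fault path are made up
for illustration:

	# Hypothetical guard ahead of the copy; %rdx holds the count.
	movq	%rdx,%rax
	shrq	$31,%rax	# any bits at or above bit 31 set?
	jnz	.Lbad_count	# yes: reject/oops instead of copying
	movl	%edx,%ecx	# safe: count fits in 31 bits
	rep
	movsb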
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: H. Peter Anvin @ 2013-11-20 19:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Peter Anvin, Thomas Gleixner, linux-tip-commits,
	Fenghua Yu, Linux Kernel Mailing List

On 11/19/2013 11:38 AM, Linus Torvalds wrote:
> On Mon, Nov 18, 2013 at 8:37 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> Do you have a preference:
>>
>> 1. Consider the 32-bit truncation incidental (take it or leave it);
>> 2. Require the 32-bit truncation; or
>> 3. Get rid of it completely?
>
> I don't have a huge preference, but I hate the current situation (with
> Fenghua's patch) where it's not consistent. One path uses just 32 bits
> of the count (thanks to the "mov %edx,%ecx") while another path uses
> 64 bits.
>
> One or the other, but not a mixture of both.
>
> And only tangentially related to this: I do think that we could be
> stricter about the count. Make it oops if the high bits are set, rather
> than overwrite a lot of memory. So I would not be averse to limiting
> the count to 31 bits (or even less) explicitly, and thus making the
> whole 32-vs-64 bit issue moot.

I guess the question is if we want to spend the extra cycles on a test
and branch. For now I suggest that we just put back the truncation in
the form of a movl instruction.

	-hpa
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: H. Peter Anvin @ 2013-11-20 19:28 UTC (permalink / raw)
To: Linus Torvalds, Ingo Molnar
Cc: Peter Anvin, tglx, linux-tip-commits, fenghua.yu, linux-kernel

On 11/16/2013 10:44 PM, Linus Torvalds wrote:
> So this doesn't do the 32-bit truncation in the error path of the
> generic string copy. Oversight?
>
>              Linus

I looked at the code again, and it turns out to be a false alarm. We
*do* do 32-bit truncation in every path, still:

> ENTRY(copy_user_generic_string)
> 	CFI_STARTPROC
> 	ASM_STAC
> 	cmpl $8,%edx
> 	jb 2f /* less than 8 bytes, go to byte copy loop */

-> If we jump here, we will truncate at 2:

> 	ALIGN_DESTINATION
> 	movl %edx,%ecx

-> If we don't jb 2f, we end up doing the 32-bit truncation here...

> 	shrl $3,%ecx
> 	andl $7,%edx
> 1:	rep
> 	movsq
> 2:	movl %edx,%ecx

-> 32-bit truncation here...

> 3:	rep
> 	movsb
> 	xorl %eax,%eax
> 	ASM_CLAC
> 	ret
>
> 	.section .fixup,"ax"
> 11:	lea (%rdx,%rcx,8),%rcx
> 12:	movl %ecx,%edx /* ecx is zerorest also */

-> Even if %rdx+%rcx*8 > 2^32 we end up truncating at 12: -- not that
it matters, since both arguments are prototyped as "unsigned" and
therefore the C compiler is supposed to guarantee that the upper 32
bits are ignored.

So I think Fenghua's patch is fine as-is.

	-hpa
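
The walkthrough above rests on a basic x86-64 rule worth spelling out:
writing any 32-bit register implicitly zeroes bits 63:32 of the full
64-bit register. A tiny stand-alone sketch of that behavior, with a
made-up input value:

	# Suppose %rdx = 0x100000005, i.e. a stray bit above bit 31.
	movl	%edx,%ecx	# %rcx becomes 0x5: the 32-bit write
				# clears the upper half of %rcx
	rep
	movsb			# copies 5 bytes, not 4 GiB + 5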
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: Linus Torvalds @ 2013-11-20 20:13 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Ingo Molnar, Peter Anvin, Thomas Gleixner, linux-tip-commits,
	Fenghua Yu, Linux Kernel Mailing List

On Wed, Nov 20, 2013 at 11:28 AM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>>
>> 	.section .fixup,"ax"
>> 11:	lea (%rdx,%rcx,8),%rcx
>> 12:	movl %ecx,%edx /* ecx is zerorest also */
>
> -> Even if %rdx+%rcx*8 > 2^32 we end up truncating at 12: -- not that it
> matters, since both arguments are prototyped as "unsigned" and therefore
> the C compiler is supposed to guarantee the upper 32 bits are ignored.

Ahh. That was the one I thought was broken, but yes, while the upper
bits of %rcx are calculated and not zeroed, they end up not actually
getting used. So yeah, I'll believe it's correct.

             Linus
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: H. Peter Anvin @ 2013-11-20 20:36 UTC (permalink / raw)
To: Linus Torvalds, H. Peter Anvin
Cc: Ingo Molnar, Thomas Gleixner, linux-tip-commits, Fenghua Yu,
	Linux Kernel Mailing List

On 11/20/2013 12:13 PM, Linus Torvalds wrote:
> On Wed, Nov 20, 2013 at 11:28 AM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>>> 	.section .fixup,"ax"
>>> 11:	lea (%rdx,%rcx,8),%rcx
>>> 12:	movl %ecx,%edx /* ecx is zerorest also */
>>
>> -> Even if %rdx+%rcx*8 > 2^32 we end up truncating at 12: -- not that it
>> matters, since both arguments are prototyped as "unsigned" and therefore
>> the C compiler is supposed to guarantee the upper 32 bits are ignored.
>
> Ahh. That was the one I thought was broken, but yes, while the upper
> bits of %rcx are calculated and not zeroed, they end up not actually
> getting used. So yeah, I'll believe it's correct.

That being said, "lea (%rdx,%rcx,8),%ecx" (leal, as opposed to leaq) is
a perfectly legitimate instruction and actually one byte shorter. The
big question is if some broken version of gas will choke on it.

	-hpa
* Re: [tip:x86/asm] x86-64, copy_user: Remove zero byte check before copy user buffer.
From: Linus Torvalds @ 2013-11-20 21:44 UTC (permalink / raw)
To: H. Peter Anvin
Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, linux-tip-commits,
	Fenghua Yu, Linux Kernel Mailing List

On Wed, Nov 20, 2013 at 12:36 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> That being said, "lea (%rdx,%rcx,8),%ecx" (leal, as opposed to leaq) is
> a perfectly legitimate instruction and actually one byte shorter. The
> big question is if some broken version of gas will choke on it.

At least gcc-4.8.2 generates that instruction (try multiplying an
integer value by 9), so I guess gas will be happy.

Except gcc uses "leal", so to be safe..

             Linus
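
For reference, the multiply-by-9 check Linus describes: compiling
int mul9(int x) { return x * 9; } at -O2 with gcc of that era produces
exactly this leal form (the function name is made up; the codegen
pattern is the standard scale-index trick):

	mul9:
		leal	(%rdi,%rdi,8), %eax	# x + 8*x = 9*x, 32-bit result
		ret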
* [tip:x86/asm] x86-64, copy_user: Use leal to produce 32-bit results
From: tip-bot for H. Peter Anvin @ 2013-11-20 20:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, torvalds, fenghua.yu, tglx, hpa

Commit-ID:  372474e12a858807f03f73cb30e830a76fd1ae07
Gitweb:     http://git.kernel.org/tip/372474e12a858807f03f73cb30e830a76fd1ae07
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Wed, 20 Nov 2013 12:50:51 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 20 Nov 2013 12:50:51 -0800

x86-64, copy_user: Use leal to produce 32-bit results

When we are using lea to produce a 32-bit result, we can use the leal
form, rather than using leaq and worrying about truncation elsewhere.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/lib/copy_user_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index ffe4eb9..f00a87c 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -186,7 +186,7 @@ ENTRY(copy_user_generic_unrolled)
 30:	shll $6,%ecx
 	addl %ecx,%edx
 	jmp 60f
-40:	lea (%rdx,%rcx,8),%rdx
+40:	lea (%rdx,%rcx,8),%edx
 	jmp 60f
 50:	movl %ecx,%edx
 60:	jmp copy_user_handle_tail /* ecx is zerorest also */
@@ -252,7 +252,7 @@ ENTRY(copy_user_generic_string)
 	ret
 
 	.section .fixup,"ax"
-11:	lea (%rdx,%rcx,8),%rcx
+11:	lea (%rdx,%rcx,8),%ecx
 12:	movl %ecx,%edx /* ecx is zerorest also */
 	jmp copy_user_handle_tail
 	.previous
* [tip:x86/asm] x86-64, copy_user: Use leal to produce 32-bit results
From: tip-bot for H. Peter Anvin @ 2013-11-20 22:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, torvalds, fenghua.yu, tglx, hpa

Commit-ID:  661c80192d21269c7fc566f1d547510b0c867677
Gitweb:     http://git.kernel.org/tip/661c80192d21269c7fc566f1d547510b0c867677
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Wed, 20 Nov 2013 12:50:51 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 20 Nov 2013 13:57:07 -0800

x86-64, copy_user: Use leal to produce 32-bit results

When we are using lea to produce a 32-bit result, we can use the leal
form, rather than using leaq and worrying about truncation elsewhere.

Make the leal explicit, both to be more obvious and since that is what
gcc generates and thus is less likely to trigger obscure gas bugs.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/lib/copy_user_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index ffe4eb9..dee945d 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -186,7 +186,7 @@ ENTRY(copy_user_generic_unrolled)
 30:	shll $6,%ecx
 	addl %ecx,%edx
 	jmp 60f
-40:	lea (%rdx,%rcx,8),%rdx
+40:	leal (%rdx,%rcx,8),%edx
 	jmp 60f
 50:	movl %ecx,%edx
 60:	jmp copy_user_handle_tail /* ecx is zerorest also */
@@ -252,7 +252,7 @@ ENTRY(copy_user_generic_string)
 	ret
 
 	.section .fixup,"ax"
-11:	lea (%rdx,%rcx,8),%rcx
+11:	leal (%rdx,%rcx,8),%ecx
 12:	movl %ecx,%edx /* ecx is zerorest also */
 	jmp copy_user_handle_tail
 	.previous