From: Borislav Petkov <bp@alien8.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Hemment <markhemm@googlemail.com>,
Andrew Morton <akpm@linux-foundation.org>,
the arch/x86 maintainers <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
patrice.chotard@foss.st.com,
Mikulas Patocka <mpatocka@redhat.com>,
Lukas Czerner <lczerner@redhat.com>,
Christoph Hellwig <hch@lst.de>,
"Darrick J. Wong" <djwong@kernel.org>,
Chuck Lever <chuck.lever@oracle.com>,
Hugh Dickins <hughd@google.com>,
patches@lists.linux.dev, Linux-MM <linux-mm@kvack.org>,
mm-commits@vger.kernel.org
Subject: Re: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE
Date: Wed, 27 Apr 2022 02:14:36 +0200
Message-ID: <YmiK7Bos+zLAvL0t@zn.tnic>
In-Reply-To: <CAHk-=wgFnTbbeR0NAsGGsoBBThXt9Zh5_acN47r4CF0PdgSNeA@mail.gmail.com>
On Sun, Apr 24, 2022 at 12:54:57PM -0700, Linus Torvalds wrote:
> I suspect it's a %rax vs %rcx confusion again, but with your "patch on
> top of earlier patch" I didn't go and sort it out.
Finally had some quiet time to stare at this.
So when we enter the function, we shift %rcx to get the number of
qword-sized quantities to zero:
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords	<---
and we zero in qword quantities merrily:
	# do the qwords first
	.p2align 4
0:	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz 0b
but when we encounter the fault here, we return *%rcx*, i.e., the number
of qwords left, and not %rcx << 3, the latter being the *bytes* left over,
which is what we *actually* need to return when we encounter the #PF.
So, we need to shift back when we fault during the qword-sized zeroing,
i.e., full function below, see label 3 there.
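To make the qwords-vs-bytes mixup concrete, here is a rough Python model of
just the qword loop. It is only a sketch: the function name, the "writable"
parameter (how many bytes of the destination can be written before the
fault) and the numbers 4096 and 16 are made up for illustration, and the
tail bytes in %rax are not modelled, so the example uses a count that is a
multiple of 8.

```python
def qword_loop_fault(count, writable, shift_back):
    """Model the qword loop when the destination stops being writable
    after `writable` bytes. Returns the value left in %rcx, which the
    caller treats as the number of uncleared bytes."""
    rcx = count >> 3          # shr $3,%rcx: number of qwords to clear
    offset = 0                # how far %rdi has advanced, in bytes
    while rcx:
        if offset + 8 > writable:
            # movq faults here; the fixup either shifts back or not
            return rcx << 3 if shift_back else rcx
        offset += 8           # lea 8(%rdi),%rdi
        rcx -= 1              # dec %rcx
    return 0                  # no fault in the qword loop


# Hypothetical numbers: 4096 bytes requested, only 16 of them writable.
count, writable = 4096, 16
buggy = qword_loop_fault(count, writable, shift_back=False)
fixed = qword_loop_fault(count, writable, shift_back=True)
print("without shift: caller sees", count - buggy, "bytes cleared")
print("with shift:    caller sees", count - fixed, "bytes cleared")
```

Without the shift the caller is told that far more was cleared than the 16
bytes which were actually written; with the shift the numbers agree.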
With that, strace looks good too:
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7dc5000
munmap(0x7ffff7dd5000, 65536) = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 16
exit_group(16) = ?
+++ exited with 16 +++
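For anyone reading the trace without the test program at hand, the layout
can be worked out from the addresses alone. The small Python sketch below
does that arithmetic. Note that strace does not print the buffer address
passed to read(), so the "buf" value here is an assumption inferred from
the return value of 16.

```python
base = 0x7ffff7dc5000        # address returned by mmap() in the trace
length = 196608              # mmap() length
hole_start = 0x7ffff7dd5000  # munmap() address
hole_len = 65536             # munmap() length
request = 65536              # read() count

# Bytes still mapped on either side of the unmapped hole.
before = hole_start - base
after = (base + length) - (hole_start + hole_len)

# Assumption: the read buffer starts 16 bytes before the hole.
buf = hole_start - 16
expected = min(request, hole_start - buf)

print(before, after, expected)
```

So the mapping ends up as 64K mapped, 64K unmapped, 64K mapped, and a read
into a buffer ending 16 bytes later can only report 16 bytes.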
As to the byte-exact deal, I'll put it on my TODO to play with it later
and see how much asm we can shed from this simplification so thanks for
the pointers!
/*
 * Default clear user-space.
 * Input:
 * rdi destination
 * rcx count
 *
 * Output:
 * rcx uncleared bytes or 0 if successful.
 */
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords
	and $7,%rax		# rest bytes
	test %rcx,%rcx
	jz 1f

	# do the qwords first
	.p2align 4
0:	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz 0b

1:	test %rax,%rax
	jz 3f

	# now do the rest bytes
2:	movb $0,(%rdi)
	inc %rdi
	decl %eax
	jnz 2b

3:
	# convert qwords back into bytes to return to caller
	shl $3, %rcx

4:
	xorl %eax,%eax
	RET

	_ASM_EXTABLE_UA(0b, 3b)

	/*
	 * The %rcx value gets fixed up with EX_TYPE_UCOPY_LEN (which basically ends
	 * up doing "%rcx = %rcx*8 + %rax" in ex_handler_ucopy_len() for the exception
	 * case). That is, we use %rax above at label 2: for simpler asm but the number
	 * of uncleared bytes will land in %rcx, as expected by the caller.
	 *
	 * %rax at label 4: still needs to be cleared in the exception case because this
	 * is called from inline asm and the compiler expects %rax to be zero when exiting
	 * the inline asm, in case it might reuse it somewhere.
	 */
	_ASM_EXTABLE_TYPE_REG(2b, 4b, EX_TYPE_UCOPY_LEN8, %rax)
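The fixup arithmetic for a fault in the byte loop can be sketched the same
way. The Python below is only a model built from the formula quoted in the
comment above ("%rcx = %rcx*8 + %rax"); the function names, the "writable"
parameter and the example numbers are made up, and it assumes the qword
loop has already completed so that %rcx is 0 when the byte loop runs.

```python
def ucopy_len8_fixup(rcx, rax):
    """The arithmetic from the comment: %rcx = %rcx*8 + %rax."""
    return 8 * rcx + rax


def byte_loop_fault(count, writable):
    """Model a fault in the byte loop (label 2). Returns the %rcx value
    the caller sees as the number of uncleared bytes."""
    rax = count & 7           # and $7,%rax: rest bytes
    offset = (count >> 3) * 8 # assume the qword loop finished cleanly
    rcx = 0                   # dec %rcx counted it down to zero
    while rax:
        if offset + 1 > writable:
            # movb faults; the extable entry applies the fixup and the
            # code resumes at label 4
            return ucopy_len8_fixup(rcx, rax)
        offset += 1           # inc %rdi
        rax -= 1              # decl %eax
    return 0                  # everything cleared


# Hypothetical numbers: 21 bytes requested, 18 of them writable.
left = byte_loop_fault(21, 18)
print("uncleared bytes:", left)
```

Since %rcx is 0 by the time the byte loop runs, the fixup boils down to
copying the remaining byte count from %rax into %rcx.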
Btw, I'm wondering if using descriptive label names would make this function even more
understandable:
/*
 * Default clear user-space.
 * Input:
 * rdi destination
 * rcx count
 *
 * Output:
 * rcx uncleared bytes or 0 if successful.
 */
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords
	and $7,%rax		# rest bytes
	test %rcx,%rcx
	jz .Lrest_bytes

	# do the qwords first
	.p2align 4
.Lqwords:
	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz .Lqwords

.Lrest_bytes:
	test %rax,%rax
	jz .Lexit

	# now do the rest bytes
.Lbytes:
	movb $0,(%rdi)
	inc %rdi
	decl %eax
	jnz .Lbytes

.Lqwords_exit:
	# convert qwords back into bytes to return to caller
	shl $3, %rcx

.Lexit:
	xorl %eax,%eax
	RET

	_ASM_EXTABLE_UA(.Lqwords, .Lqwords_exit)

	/*
	 * The %rcx value gets fixed up with EX_TYPE_UCOPY_LEN (which basically ends
	 * up doing "%rcx = %rcx*8 + %rax" in ex_handler_ucopy_len() for the exception
	 * case). That is, we use %rax above at .Lbytes for simpler asm but the number
	 * of uncleared bytes will land in %rcx, as expected by the caller.
	 *
	 * %rax at .Lexit still needs to be cleared in the exception case because this
	 * is called from inline asm and the compiler expects %rax to be zero when exiting
	 * the inline asm, in case it might reuse it somewhere.
	 */
	_ASM_EXTABLE_TYPE_REG(.Lbytes, .Lexit, EX_TYPE_UCOPY_LEN8, %rax)
SYM_FUNC_END(clear_user_original)
EXPORT_SYMBOL(clear_user_original)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette