From: Mark Hemment
Date: Sat, 16 Apr 2022 15:07:47 +0100 (BST)
To: Borislav Petkov
cc: Linus Torvalds, Andrew Morton, the arch/x86 maintainers,
    Peter Zijlstra, patrice.chotard@foss.st.com, Mikulas Patocka,
    markhemm@googlemail.com, Lukas Czerner, Christoph Hellwig,
    "Darrick J. Wong", Chuck Lever, Hugh Dickins, patches@lists.linux.dev,
    Linux-MM, mm-commits@vger.kernel.org
Subject: Re: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE
Message-ID: <29b9ef95-1226-73b4-b4d1-6e8d164fb17d@gmail.com>
References: <20220414191240.9f86d15a3e3afd848a9839a6@linux-foundation.org>
 <20220415021328.7D31EC385A1@smtp.kernel.org>

On Sat, 16 Apr 2022, Borislav Petkov wrote:

> On Fri, Apr 15, 2022 at 03:10:51PM -0700, Linus Torvalds wrote:
> > Adding PeterZ and Borislav (who seem to be the last ones to have
> > worked on the copy and clear_page stuff respectively) and the x86
> > maintainers in case somebody gets the urge to just fix this.
>
> I guess if enough people ask and keep asking, some people at least try
> to move...
>
> > Because memory clearing should be faster than copying, and the thing
> > that makes copying fast is that FSRM and ERMS logic (the whole
> > "manually unrolled copy" is hopefully mostly a thing of the past and
> > we can consider it legacy)
>
> So I did give it a look and it seems to me, if we want to do the
> alternatives thing here, it will have to look something like
> arch/x86/lib/copy_user_64.S.
>
> I.e., the current __clear_user() will have to become the "handle_tail"
> thing there which deals with uncopied rest-bytes at the end and the new
> fsrm/erms/rep_good variants will then be alternative_call_2 or _3.
>
> The fsrm thing will have only the handle_tail thing at the end when size
> != 0.
>
> The others - erms and rep_good - will have to check for sizes smaller
> than, say a cacheline, and for those call the handle_tail thing directly
> instead of going into a REP loop. The current __clear_user() is still a
> lot better than that copy_user_generic_unrolled() abomination. And it's
> not like old CPUs would get any perf penalty - they'll simply use the
> same code.
>
> And then you need the labels for _ASM_EXTABLE_UA() exception handling.
>
> Anyway, something along those lines.
>
> And then we'll need to benchmark this on a bunch of current machines to
> make sure there's no funny surprises, perf-wise.
>
> I can get cracking on this but I would advise people not to hold their
> breaths. :)
>
> Unless someone has a better idea or is itching to get hands dirty
> her-/himself.
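(On the small-size cut-off mentioned above: it is not handled in the skeleton
below.  In plain C the idea is roughly the following sketch; the threshold
value and helper names are invented for illustration and are not kernel
interfaces.)

  #include <stddef.h>
  #include <string.h>

  /* Below roughly a cacheline, skip the REP-string setup cost and go
   * straight to the byte-at-a-time "handle_tail" loop.
   */
  #define CLEAR_REP_CUTOFF	64

  static size_t clear_tail(unsigned char *dst, size_t len)
  {
  	while (len--)			/* the "handle_tail" role */
  		*dst++ = 0;
  	return 0;			/* 0 == nothing left to zero */
  }

  static size_t clear_region(unsigned char *dst, size_t len)
  {
  	if (len < CLEAR_REP_CUTOFF)
  		return clear_tail(dst, len);
  	memset(dst, 0, len);		/* stands in for 'rep stosq/stosb' */
  	return 0;
  }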
I've done a skeleton implementation of alternative __clear_user() based on
CPU features.

It has three versions of __clear_user():
o __clear_user_original() - similar to the 'standard' __clear_user()
o __clear_user_rep_good() - uses 'rep stos{q,b}' when the CPU has 'rep_good'
o __clear_user_erms()     - uses 'rep stosb' when the CPU has 'erms'

Not claiming the implementation is ideal, but it might be a useful starting
point for someone.
Patch is against 5.18.0-rc2.
Only basic sanity testing done.

Simple performance testing done for large sizes, on a system (Intel E8400)
which has rep_good but not erms:
# dd if=/dev/zero of=/dev/null bs=16384 count=10000
o *_original() - ~14.2GB/s.  Same as the 'standard' __clear_user().
o *_rep_good() - same throughput as *_original().
o *_erms()     - ~12.2GB/s (expected on a system without erms).

No performance testing done for zeroing small sizes.

Cheers,
Mark

Signed-off-by: Mark Hemment
---
 arch/x86/include/asm/asm.h        |  39 +++++++++++++++
 arch/x86/include/asm/uaccess_64.h |  36 ++++++++++++++
 arch/x86/lib/clear_page_64.S      | 100 ++++++++++++++++++++++++++++++++++++++
 arch/x86/lib/usercopy_64.c        |  32 ------------
 4 files changed, 175 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index fbcfec4dc4cc..373ed6be7a8d 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -132,6 +132,35 @@
 /* Exception table entry */
 #ifdef __ASSEMBLY__
 
+# define UNDEFINE_EXTABLE_TYPE_REG				\
+	.purgem extable_type_reg ;
+
+# define DEFINE_EXTABLE_TYPE_REG				\
+	.macro extable_type_reg type:req reg:req ;		\
+	.set .Lfound, 0 ;					\
+	.set .Lregnr, 0 ;					\
+	.irp rs,rax,rcx,rdx,rbx,rsp,rbp,rsi,rdi,r8,r9,r10,r11,r12,r13,	\
+	    r14,r15 ;						\
+	.ifc \reg, %\rs ;					\
+	.set .Lfound, .Lfound+1 ;				\
+	.long \type + (.Lregnr << 8) ;				\
+	.endif ;						\
+	.set .Lregnr, .Lregnr+1 ;				\
+	.endr ;							\
+	.set .Lregnr, 0 ;					\
+	.irp rs,eax,ecx,edx,ebx,esp,ebp,esi,edi,r8d,r9d,r10d,r11d,r12d, \
+	    r13d,r14d,r15d ;					\
+	.ifc \reg, %\rs ;					\
+	.set .Lfound, .Lfound+1 ;				\
+	.long \type + (.Lregnr << 8) ;				\
+	.endif ;						\
+	.set .Lregnr, .Lregnr+1 ;				\
+	.endr ;							\
+	.if (.Lfound != 1) ;					\
+	.error "extable_type_reg: bad register argument" ;	\
+	.endif ;						\
+	.endm ;
+
 # define _ASM_EXTABLE_TYPE(from, to, type)			\
 	.pushsection "__ex_table","a" ;				\
 	.balign 4 ;						\
@@ -140,6 +169,16 @@
 	.long type ;						\
 	.popsection
 
+# define _ASM_EXTABLE_TYPE_REG(from, to, type1, reg1)		\
+	.pushsection "__ex_table","a" ;				\
+	.balign 4 ;						\
+	.long (from) - . ;					\
+	.long (to) - . ;					\
+	DEFINE_EXTABLE_TYPE_REG					\
+	extable_type_reg reg=reg1, type=type1 ;			\
+	UNDEFINE_EXTABLE_TYPE_REG				\
+	.popsection
+
 # ifdef CONFIG_KPROBES
 #  define _ASM_NOKPROBE(entry)					\
 	.pushsection "_kprobe_blacklist","aw" ;			\
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 45697e04d771..6a4995e4cfae 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -79,4 +79,40 @@ __copy_from_user_flushcache(void *dst, const void __user *src, unsigned size)
 	kasan_check_write(dst, size);
 	return __copy_user_flushcache(dst, src, size);
 }
+
+/*
+ * Zero Userspace.
+ */
+
+__must_check unsigned long
+clear_user_original(void __user *addr, unsigned long len);
+__must_check unsigned long
+clear_user_rep_good(void __user *addr, unsigned long len);
+__must_check unsigned long
+clear_user_erms(void __user *addr, unsigned long len);
+
+static __always_inline __must_check unsigned long
+___clear_user(void __user *addr, unsigned long len)
+{
+	unsigned long	ret;
+
+	/*
+	 * No memory constraint because it doesn't change any memory gcc
+	 * knows about.
+	 */
+
+	might_fault();
+	alternative_call_2(
+		clear_user_original,
+		clear_user_rep_good,
+		X86_FEATURE_REP_GOOD,
+		clear_user_erms,
+		X86_FEATURE_ERMS,
+		ASM_OUTPUT2("=a" (ret), "=D" (addr), "=c" (len)),
+		"1" (addr), "2" (len)
+		: "%rdx", "cc");
+	return ret;
+}
+
+#define __clear_user(d, n)	___clear_user(d, n)
 #endif /* _ASM_X86_UACCESS_64_H */
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index fe59b8ac4fcc..abe1f44ea422 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 #include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm/smap.h>
 #include <asm/export.h>
 
 /*
@@ -50,3 +52,101 @@ SYM_FUNC_START(clear_page_erms)
 	RET
 SYM_FUNC_END(clear_page_erms)
 EXPORT_SYMBOL_GPL(clear_page_erms)
+
+/*
+ * Default clear user-space.
+ * Input:
+ *   rdi destination
+ *   rcx count
+ *
+ * Output:
+ *   rax uncopied bytes or 0 if successful.
+ */
+
+SYM_FUNC_START(clear_user_original)
+	ASM_STAC
+	movq %rcx,%rax
+	shrq $3,%rcx
+	andq $7,%rax
+	testq %rcx,%rcx
+	jz 1f
+
+	.p2align 4
+0:	movq $0,(%rdi)
+	leaq 8(%rdi),%rdi
+	decq %rcx
+	jnz 0b
+
+1:	movq %rax,%rcx
+	testq %rcx,%rcx
+	jz 3f
+
+2:	movb $0,(%rdi)
+	incq %rdi
+	decl %ecx
+	jnz 2b
+
+3:	ASM_CLAC
+	movq %rcx,%rax
+	RET
+
+	_ASM_EXTABLE_TYPE_REG(0b, 3b, EX_TYPE_UCOPY_LEN8, %rax)
+	_ASM_EXTABLE_UA(2b, 3b)
+SYM_FUNC_END(clear_user_original)
+EXPORT_SYMBOL(clear_user_original)
+
+/*
+ * Alternative clear user-space when CPU feature X86_FEATURE_REP_GOOD is
+ * present.
+ * Input:
+ *   rdi destination
+ *   rcx count
+ *
+ * Output:
+ *   rax uncopied bytes or 0 if successful.
+ */
+
+SYM_FUNC_START(clear_user_rep_good)
+	ASM_STAC
+	movq %rcx,%rdx
+	xorq %rax,%rax
+	shrq $3,%rcx
+	andq $7,%rdx
+
+0:	rep stosq
+	movq %rdx,%rcx
+
+1:	rep stosb
+
+3:	ASM_CLAC
+	movq %rcx,%rax
+	RET
+
+	_ASM_EXTABLE_TYPE_REG(0b, 3b, EX_TYPE_UCOPY_LEN8, %rdx)
+	_ASM_EXTABLE_UA(1b, 3b)
+SYM_FUNC_END(clear_user_rep_good)
+EXPORT_SYMBOL(clear_user_rep_good)
+
+/*
+ * Alternative clear user-space when CPU feature X86_FEATURE_ERMS is present.
+ * Input:
+ *   rdi destination
+ *   rcx count
+ *
+ * Output:
+ *   rax uncopied bytes or 0 if successful.
+ */
+
+SYM_FUNC_START(clear_user_erms)
+	xorq %rax,%rax
+	ASM_STAC
+
+0:	rep stosb
+
+3:	ASM_CLAC
+	movq %rcx,%rax
+	RET
+
+	_ASM_EXTABLE_UA(0b, 3b)
+SYM_FUNC_END(clear_user_erms)
+EXPORT_SYMBOL(clear_user_erms)
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index 0402a749f3a0..3a2872c9c4a9 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -14,38 +14,6 @@
  * Zero Userspace
  */
 
-unsigned long __clear_user(void __user *addr, unsigned long size)
-{
-	long __d0;
-	might_fault();
-	/* no memory constraint because it doesn't change any memory gcc knows
-	   about */
-	stac();
-	asm volatile(
-		"	testq  %[size8],%[size8]\n"
-		"	jz     4f\n"
-		"	.align 16\n"
-		"0:	movq $0,(%[dst])\n"
-		"	addq   $8,%[dst]\n"
-		"	decl %%ecx ; jnz   0b\n"
-		"4:	movq  %[size1],%%rcx\n"
-		"	testl %%ecx,%%ecx\n"
-		"	jz     2f\n"
-		"1:	movb   $0,(%[dst])\n"
-		"	incq   %[dst]\n"
-		"	decl %%ecx ; jnz  1b\n"
-		"2:\n"
-
-		_ASM_EXTABLE_TYPE_REG(0b, 2b, EX_TYPE_UCOPY_LEN8, %[size1])
-		_ASM_EXTABLE_UA(1b, 2b)
-
-		: [size8] "=&c"(size), [dst] "=&D" (__d0)
-		: [size1] "r"(size & 7), "[size8]" (size / 8), "[dst]"(addr));
-	clac();
-	return size;
-}
-EXPORT_SYMBOL(__clear_user);
-
 unsigned long clear_user(void __user *to, unsigned long n)
 {
 	if (access_ok(to, n))
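For anyone wanting to reproduce the dd numbers above from a small program
rather than dd itself, a rough userspace harness along these lines should
exercise the same path (reads of /dev/zero are expected to end up in
__clear_user() via iov_iter_zero()); the buffer size, loop count and the
exact in-kernel path are assumptions of this sketch, not guarantees:

  /* zeroread.c - crude throughput check for zeroing to userspace.
   * Build: cc -O2 -o zeroread zeroread.c
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
  	const size_t bufsz = 16384;	/* matches the dd bs= above */
  	const long loops = 100000;
  	char *buf = malloc(bufsz);
  	int fd = open("/dev/zero", O_RDONLY);
  	struct timespec t0, t1;
  	double secs;
  	long i;

  	if (!buf || fd < 0)
  		return 1;

  	clock_gettime(CLOCK_MONOTONIC, &t0);
  	for (i = 0; i < loops; i++)
  		if (read(fd, buf, bufsz) != (ssize_t)bufsz)
  			return 1;
  	clock_gettime(CLOCK_MONOTONIC, &t1);

  	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
  	printf("~%.2f GB/s\n", (double)bufsz * loops / secs / 1e9);

  	close(fd);
  	free(buf);
  	return 0;
  }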