From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-21.2 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 848CAC4338F for ; Tue, 17 Aug 2021 09:03:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6C0E160EBD for ; Tue, 17 Aug 2021 09:03:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239296AbhHQJEN (ORCPT ); Tue, 17 Aug 2021 05:04:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235289AbhHQJED (ORCPT ); Tue, 17 Aug 2021 05:04:03 -0400 Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC732C0613C1 for ; Tue, 17 Aug 2021 02:03:24 -0700 (PDT) Received: by mail-pj1-x1036.google.com with SMTP id n5so10579460pjt.4 for ; Tue, 17 Aug 2021 02:03:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=cc:subject:to:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=PhWxmc8derdQF3ciSQTXL4gicsn8XaD0Xw+UiOkSNfE=; b=dhHi8o3ngIQUg9/6xFAUsFR6MH1zfyTk5yl+TmU+QvURYS7z3XTjs6kV6XwjY7NEHg xN8QipriwivaRm+bBNcJqkLhFWPPPRkC6v/AnoeyU4yT/cgG28ccgeu2NXYhLiSoOmpp yEpdKPuKRHjHRFyl0zrnCgCd5bh15ui9XEOXfJsD2hP58QkL6eieRKJF3tmZXGpJU2N0 prwSzOzUmTwRu9PSv/6BeqsBjEwQBYTsxifqjukhfM+QoH3m3rdxV/BA1ofvDXp6Jj3+ JN/VE20+OzG/qKHxlgoVUowuq/43tq3LbAtN4q78Okn5ULVC+RnzLkKp6DrIof6UkrQG lfBA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:cc:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; 
bh=PhWxmc8derdQF3ciSQTXL4gicsn8XaD0Xw+UiOkSNfE=; b=gG31R/Mm6Jc48cY3mitMTTJ54GX0bg7paoLP4kA+35yMJG3hEmQT5bJV9mWK3jcGdN 4WE9F7dmmvY2oJ38KEfluMsfvHNg04UAZRZeOzrre85bvJFyVaE5faqwUFJfOVT02NXW UpnCtPsLrKm95SC5N+CizWxpBstcYvgZOAffs1vfTUSW23rGhSlBn8HlroEi9LCWde5R Trph1fjM2fv2JqVZ1rQLG5zJdaoFiL3DOg60tW0WE+foQzOTpuV3m6pVzeGMWnXNCVpB PwTD4Erv2I0SA7CYQWEzh+2u8DFT3FYB8+gVdFUQFW2QbfNUFyK3bnapHY2kXk4KnsF5 fxAw== X-Gm-Message-State: AOAM531JRAa4qGaIrlOE/LjEvMgbSbNL+nl6/B+ct9AUQVBRdM23gRq8 QM0BXNwh5H7oc8enqnZuvp8QrthntEE= X-Google-Smtp-Source: ABdhPJy0c3WEiR8hPLWgXGJDEd84TGY7ELR4uHSfQ8aWHNpBDXhLAAZRZQyXyzM/SfSCAriJWQDXPw== X-Received: by 2002:a63:1460:: with SMTP id 32mr2578900pgu.323.1629191003883; Tue, 17 Aug 2021 02:03:23 -0700 (PDT) Received: from [10.252.0.198] (ec2-54-250-108-108.ap-northeast-1.compute.amazonaws.com. [54.250.108.108]) by smtp.gmail.com with ESMTPSA id d5sm1566461pju.28.2021.08.17.02.03.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 17 Aug 2021 02:03:23 -0700 (PDT)
Cc: akira.tsukamoto@gmail.com, Paul Walmsley , linux@roeck-us.net, geert@linux-m68k.org, qiuwenbo@kylinos.com.cn, aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] riscv: __asm_copy_to-from_user: Improve using word copy if size < 9*SZREG
To: Palmer Dabbelt
References:
From: Akira Tsukamoto
Message-ID: <468725dc-d110-5ede-e290-c7a97feacd43@gmail.com>
Date: Tue, 17 Aug 2021 18:03:19 +0900
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/17/2021 3:09 AM, Palmer Dabbelt wrote:
> On Fri, 30 Jul 2021 06:52:44 PDT (-0700), akira.tsukamoto@gmail.com wrote:
>> Reduce the number of slow byte_copy when the size is in between
>> 2*SZREG to 9*SZREG by using none unrolled word_copy.
>>
>> Without it any size smaller than 9*SZREG will be using slow byte_copy
>> instead of none unrolled word_copy.
>>
>> Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
>> ---
>>  arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 42 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
>> index 63bc691cff91..6a80d5517afc 100644
>> --- a/arch/riscv/lib/uaccess.S
>> +++ b/arch/riscv/lib/uaccess.S
>> @@ -34,8 +34,10 @@ ENTRY(__asm_copy_from_user)
>>      /*
>>       * Use byte copy only if too small.
>>       * SZREG holds 4 for RV32 and 8 for RV64
>> +     * a3 - 2*SZREG is minimum size for word_copy
>> +     *      1*SZREG for aligning dst + 1*SZREG for word_copy
>>       */
>> -    li    a3, 9*SZREG /* size must be larger than size in word_copy */
>> +    li    a3, 2*SZREG
>>      bltu    a2, a3, .Lbyte_copy_tail
>>
>>      /*
>> @@ -66,9 +68,40 @@ ENTRY(__asm_copy_from_user)
>>      andi    a3, a1, SZREG-1
>>      bnez    a3, .Lshift_copy
>>
>> +.Lcheck_size_bulk:
>> +    /*
>> +     * Evaluate the size if possible to use unrolled.
>> +     * The word_copy_unlrolled requires larger than 8*SZREG
>> +     */
>> +    li    a3, 8*SZREG
>> +    add    a4, a0, a3
>> +    bltu    a4, t0, .Lword_copy_unlrolled
>> +
>>  .Lword_copy:
>> -        /*
>> -     * Both src and dst are aligned, unrolled word copy
>> +    /*
>> +     * Both src and dst are aligned
>> +     * None unrolled word copy with every 1*SZREG iteration
>> +     *
>> +     * a0 - start of aligned dst
>> +     * a1 - start of aligned src
>> +     * t0 - end of aligned dst
>> +     */
>> +    bgeu    a0, t0, .Lbyte_copy_tail /* check if end of copy */
>> +    addi    t0, t0, -(SZREG) /* not to over run */
>> +1:
>> +    REG_L    a5, 0(a1)
>> +    addi    a1, a1, SZREG
>> +    REG_S    a5, 0(a0)
>> +    addi    a0, a0, SZREG
>> +    bltu    a0, t0, 1b
>> +
>> +    addi    t0, t0, SZREG /* revert to original value */
>> +    j    .Lbyte_copy_tail
>> +
>> +.Lword_copy_unlrolled:
>> +    /*
>> +     * Both src and dst are aligned
>> +     * Unrolled word copy with every 8*SZREG iteration
>>       *
>>       * a0 - start of aligned dst
>>       * a1 - start of aligned src
>> @@ -97,7 +130,12 @@ ENTRY(__asm_copy_from_user)
>>      bltu    a0, t0, 2b
>>
>>      addi    t0, t0, 8*SZREG /* revert to original value */
>> -    j    .Lbyte_copy_tail
>> +
>> +    /*
>> +     * Remaining might large enough for word_copy to reduce slow byte
>> +     * copy
>> +     */
>> +    j    .Lcheck_size_bulk
>>
>>  .Lshift_copy:
>
> I'm still not convinced that going all the way to such a large unrolling factor is a net win, but this at least provides a much smoother cost curve.

I would like to meet and discuss the unrolling factor at some event.
The assembler version of memset in arch/riscv/lib/memset.S had thirty-two consecutive unrolled loads and stores, and initially I also thought that was too much unrolling and crazy.

However, I could not beat its speed with any of my customizations that reduced the unrolling.
I never thought such a large unrolling factor would be beneficial; my initial thought was that a factor of two or three would be enough for in-order, single-issue cores with five or so pipeline stages.

At the same time, I had seen in the past that some in-order x86 cores benefit from large unrolling, so I decided to go with whichever measured faster.
The speed of memset is critical for clearing an entire 4KiB page.

The biggest downside is that large unrolling increases the binary size, and most out-of-order cores can compensate without large unrolling by reordering instructions internally. So once I am able to rewrite the function with inline assembler, I would like to use #ifdef to choose the amount of unrolling for in-order versus out-of-order cores. Currently all physical RISC-V cores are in-order designs, but out-of-order cores will probably arrive at some point and could benefit from a smaller binary and a lower memory bandwidth requirement.

>
> That said, this is causing my 32-bit configs to hang.  There were a few conflicts so I may have messed something up, but nothing is jumping out at me.  I've put what I ended up with on a branch, if you have time to look that'd be great but if not then I'll take another shot at this when I get back around to it.
>
>    https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=wip-word_user_copy
>
> Here's the backtrace, though that's probably not all that useful:
>
> [    0.703694] Unable to handle kernel NULL pointer dereference at virtual address 000005a8
> [    0.704194] Oops [#1]
> [    0.704301] Modules linked in:
> [    0.704463] CPU: 2 PID: 1 Comm: init Not tainted 5.14.0-rc1-00016-g59461ddb9dbd #5
> [    0.704660] Hardware name: riscv-virtio,qemu (DT)
> [    0.704802] epc : walk_stackframe+0xac/0xc2
> [    0.704941]  ra : dump_backtrace+0x1a/0x22
> [    0.705074] epc : c0004558 ra : c0004588 sp : c1c5fe10
> [    0.705216]  gp : c18b41c8 tp : c1cd8000 t0 : 00000000
> [    0.705357]  t1 : ffffffff t2 : 00000000 s0 : c1c5fe40
> [    0.705506]  s1 : c11313dc a0 : 00000000 a1 : 00000000
> [    0.705647]  a2 : c06fd2c2 a3 : c11313dc a4 : c084292d
> [    0.705787]  a5 : 00000000 a6 : c1864cb8 a7 : 3fffffff
> [    0.705926]  s2 : 00000000 s3 : c1123e88 s4 : 00000000
> [    0.706066]  s5 : c11313dc s6 : c06fd2c2 s7 : 00000001
> [    0.706206]  s8 : 00000000 s9 : 95af6e28 s10: 00000000
> [    0.706345]  s11: 00000001 t3 : 00000000 t4 : 00000000
> [    0.706482]  t5 : 00000001 t6 : 00000000
> [    0.706594] status: 00000100 badaddr: 000005a8 cause: 0000000d
> [    0.706809] [<c0004558>] walk_stackframe+0xac/0xc2
> [    0.707019] [<c0004588>] dump_backtrace+0x1a/0x22
> [    0.707149] [<c06fd312>] show_stack+0x2c/0x38
> [    0.707271] [<c06ffba4>] dump_stack_lvl+0x40/0x58
> [    0.707400] [<c06ffbce>] dump_stack+0x12/0x1a
> [    0.707521] [<c06fd4f6>] panic+0xfa/0x2a6
> [    0.707632] [<c000e2f4>] do_exit+0x7a8/0x7ac
> [    0.707749] [<c000eefa>] do_group_exit+0x2a/0x7e
> [    0.707872] [<c000ef60>] __wake_up_parent+0x0/0x20
> [    0.707999] [<c0003020>] ret_from_syscall+0x0/0x2
> [    0.708385] ---[ end trace 260976561a3770d1 ]---

I suspect the error above has the same cause as the one Qiu mentioned in the other thread.

Akira
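
For readers following the size thresholds in the quoted patch, here is a rough C sketch of the dispatch order as it reads from the diff: byte copy below 2*SZREG, byte copy until dst is aligned, 8-word unrolled blocks while more than 8*SZREG remains, single-word copy for what is left, then a byte tail. It is illustration only and not the kernel code. The names sketch_copy and struct copy_stats are hypothetical, SZREG is assumed to equal sizeof(unsigned long), memcpy stands in for the REG_L/REG_S pairs, and the .Lshift_copy path for a misaligned src is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: register width equals sizeof(unsigned long). */
#define SZREG  (sizeof(unsigned long))
#define UNROLL 8  /* words per unrolled iteration in the quoted patch */

/* Hypothetical counters, only here to make the chosen path visible. */
struct copy_stats {
    size_t bytes;            /* single-byte copies (head and tail) */
    size_t words;            /* single-word copies */
    size_t unrolled_blocks;  /* UNROLL-word blocks */
};

static void sketch_copy(unsigned char *dst, const unsigned char *src,
                        size_t n, struct copy_stats *st)
{
    unsigned char *end = dst + n;

    assert(st != NULL);

    /* Below 2*SZREG everything goes through the byte loop at the bottom. */
    if (n >= 2 * SZREG) {
        /* Byte copy until dst is word aligned (at most SZREG-1 bytes). */
        while ((uintptr_t)dst % SZREG != 0) {
            *dst++ = *src++;
            st->bytes++;
        }

        /* End of dst rounded down to a word boundary (t0 in the patch). */
        unsigned char *aligned_end = end - ((uintptr_t)end % SZREG);

        /* Unrolled path: strict comparison, as with bltu in the patch. */
        while ((size_t)(aligned_end - dst) > UNROLL * SZREG) {
            memcpy(dst, src, UNROLL * SZREG); /* stands in for 8 load/store pairs */
            dst += UNROLL * SZREG;
            src += UNROLL * SZREG;
            st->unrolled_blocks++;
        }

        /* Non-unrolled word copy for what the unrolled loop left over. */
        while (dst < aligned_end) {
            unsigned long w;
            memcpy(&w, src, SZREG); /* safe even if src is not aligned */
            memcpy(dst, &w, SZREG);
            dst += SZREG;
            src += SZREG;
            st->words++;
        }
    }

    /* Byte tail: fewer than SZREG bytes when the word paths ran. */
    while (dst < end) {
        *dst++ = *src++;
        st->bytes++;
    }
}
```

With a word-aligned dst and n = 20*SZREG, this sketch takes two 8-word blocks and then four single-word copies. With n = 2*SZREG - 1 it copies every byte individually, which is the case the old 9*SZREG threshold applied to a much wider range of sizes.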