Date: Tue, 23 Mar 2021 15:03:59 +0000
From: Catalin Marinas
To: Will Deacon
Cc: Robin Murphy, Yang Yingliang, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, guohanjun@huawei.com
Subject: Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes
Message-ID: <20210323150358.GA10576@arm.com>
References: <20210323073432.3422227-1-yangyingliang@huawei.com>
 <20210323073432.3422227-3-yangyingliang@huawei.com>
 <03ac41af-c433-cd66-8195-afbf9c49554c@arm.com>
 <20210323133217.GA11802@willie-the-truck>
In-Reply-To: <20210323133217.GA11802@willie-the-truck>

On Tue, Mar 23, 2021 at 01:32:18PM +0000, Will Deacon wrote:
> On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote:
> > On 2021-03-23 07:34, Yang Yingliang wrote:
> > > When copy over 128 bytes, src/dst is added after
> > > each ldp/stp instruction, it will cost more time.
> > > To improve this, we only add src/dst after load
> > > or store 64 bytes.
> >
> > This breaks the required behaviour for copy_*_user(), since the fault
> > handler expects the base address to be up-to-date at all times. Say you're
> > copying 128 bytes and fault on the 4th store, it should return 80 bytes not
> > copied; the code below would return 128 bytes not copied, even though 48
> > bytes have actually been written to the destination.
> >
> > We've had a couple of tries at updating this code (because the whole
> > template is frankly a bit terrible, and a long way from the well-optimised
> > code it was derived from), but getting the fault-handling behaviour right
> > without making the handler itself ludicrously complex has proven tricky. And
> > then it got bumped down the priority list while the uaccess behaviour in
> > general was in flux - now that the dust has largely settled on that I should
> > probably try to find time to pick this up again...
>
> I think the v5 from Oli was pretty close, but it didn't get any review:
>
> https://lore.kernel.org/r/20200914151800.2270-1-oli.swede@arm.com

These are still unread in my inbox as I was planning to look at them
again. However, I think we discussed a few options on how to proceed but
made no concrete plans:

1. Merge Oli's patches as they are, with some potential complexity
   issues, as fixing the user copy accuracy was non-trivial. I think the
   latest version uses a two-stage approach: when taking a fault, it
   falls back to byte-by-byte copying with the expectation that it
   faults again and we can then report the correct fault address.

2. Only use Cortex Strings for in-kernel memcpy() while the uaccess
   routines are some simple loops that align the uaccess part only
   (unlike Cortex Strings, which usually tries to align the source).

3. Similar to 2 but with Cortex Strings imported automatically with some
   script to make it easier to keep the routines up to date.
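To make the accounting above concrete, here is a rough user-space sketch
in C. It is not the kernel code or Oli's implementation: the fault
model (store_faults() and its fault_off parameter), the function names
and the CHUNK/BLOCK constants are all made up for illustration, and n is
assumed to be a multiple of 64. It contrasts updating the base after
every 16-byte store, updating it once per 64 bytes (the stale case Robin
describes), and a two-stage fallback along the lines of option 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16 /* bytes moved by one ldp/stp pair */
#define BLOCK 64 /* bytes per unrolled loop iteration */

/* Hypothetical fault model: a store touching dst[fault_off] or beyond faults. */
static int store_faults(size_t off, size_t len, size_t fault_off)
{
	return off + len > fault_off;
}

/* Base advanced after every 16-byte store: the fixup sees an exact base. */
static size_t copy_update_each(char *dst, const char *src, size_t n,
			       size_t fault_off)
{
	size_t base = 0;

	assert(n % BLOCK == 0);
	while (base < n) {
		if (store_faults(base, CHUNK, fault_off))
			return n - base;
		memcpy(dst + base, src + base, CHUNK);
		base += CHUNK;
	}
	return 0;
}

/* Base advanced once per 64 bytes: on a fault the fixup sees a stale base. */
static size_t copy_update_per_block(char *dst, const char *src, size_t n,
				    size_t fault_off)
{
	size_t base = 0;

	assert(n % BLOCK == 0);
	while (base < n) {
		for (size_t i = 0; i < BLOCK; i += CHUNK) {
			if (store_faults(base + i, CHUNK, fault_off))
				return n - base; /* ignores stores done in this block */
			memcpy(dst + base + i, src + base + i, CHUNK);
		}
		base += BLOCK;
	}
	return 0;
}

/* Two-stage: after a fault, redo from the stale base one byte at a time. */
static size_t copy_two_stage(char *dst, const char *src, size_t n,
			     size_t fault_off)
{
	size_t stale = copy_update_per_block(dst, src, n, fault_off);
	size_t base = n - stale;

	if (stale == 0)
		return 0;
	/* Expected to fault again, this time at the exact byte. */
	while (base < n && !store_faults(base, 1, fault_off)) {
		dst[base] = src[base];
		base++;
	}
	return n - base;
}
```

With n = 128 and the fault placed at offset 48 (the 4th store),
copy_update_each() reports 80 bytes not copied, copy_update_per_block()
reports 128 even though 48 bytes were written, and copy_two_stage()
recovers the 80. The price of the fallback is that the bytes after the
stale base get written twice on the fault path.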
If having non-optimal (but good enough) uaccess routines is acceptable,
I'd go for (2) with a plan to move to (3) at the next Cortex Strings
update.

I also need to look again at option (1) to see how complex it is but,
given the time one spends on importing a new Cortex Strings library, I
don't think (1) scales well in the long term. We could, however, go for
(1) now and look at (3) with the next update.

-- 
Catalin