Date: Fri, 1 Sep 2023 13:46:44 +0200
From: Willy Tarreau
To: David Laight
Cc: Ammar Faizi, Thomas Weißschuh, Nicholas Rosenberg, Alviro Iskandar Setiawan, Michael William Jonathan, GNU/Weeb Mailing List, Linux Kernel Mailing List
Subject: Re: [RFC PATCH v1 0/5] nolibc x86-64 string functions

On Fri, Sep 01, 2023 at 11:34:18AM +0000, David Laight wrote:
> From: Ammar Faizi
> > Sent: 30 August 2023 14:57
> >
> > This is an RFC patchset for nolibc x86-64 string functions. There are 5
> > patches in this series.
> >
> > ## Patch 1-3: Use `rep movsb`, `rep stosb`, and `rep cmpsb` for:
> >   - memcpy() and memmove()
> >   - memset()
> >   - memcmp()
> > respectively.
> > They can simplify the generated ASM code.
> >
> ...
> > After this series:
> > ```
> > 000000000000140a :
> > 140a: 48 89 f8             mov    %rdi,%rax
> > 140d: 48 89 d1             mov    %rdx,%rcx
> > 1410: 48 8d 7c 0f ff       lea    -0x1(%rdi,%rcx,1),%rdi
> > 1415: 48 8d 74 0e ff       lea    -0x1(%rsi,%rcx,1),%rsi
> > 141a: fd                   std
> > 141b: f3 a4                rep movsb %ds:(%rsi),%es:(%rdi)
> > 141d: fc                   cld
> > 141e: c3                   ret
>
> Isn't that completely broken?
>
> You need to select between forwards and backwards moves.
> Since forwards moves are preferred it is best to do
>	if (dst - src < len)
>		backwards_copy()
>	else
>		forwards_copy()
>
>	David

You're completely right indeed, reminds me about the copy_up/copy_down
that were not used anymore :-)

Willy
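
For reference, the direction test David suggests can be sketched in plain C roughly as below. This is only an illustration under assumptions, not the nolibc code nor what the series ended up with: `sketch_memmove` is a hypothetical name, byte loops stand in for the `rep movsb` forms, and the pointers are converted to `uintptr_t` so that the subtraction is an unsigned one that wraps when dst is below src.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; the name is not taken from nolibc. */
void *sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	/*
	 * Unsigned difference: if dst is below src the subtraction wraps
	 * to a huge value, so the test is true only when dst lies inside
	 * [src, src + len), the one case where a forward copy would
	 * overwrite source bytes before they have been read.
	 */
	if ((uintptr_t)dst - (uintptr_t)src < len) {
		/* Overlap with dst at or above src: copy from the end. */
		while (len--)
			d[len] = s[len];
	} else {
		/* No harmful overlap: preferred forward copy. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
	}
	return dst;
}
```

With a single unsigned comparison the backward path is taken only when it is really needed, and every other call, including all non-overlapping ones, goes through the forward copy.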