Date: Fri, 1 Sep 2023 05:35:08 +0200
From: Willy Tarreau
To: Ammar Faizi
Cc: Thomas Weißschuh, Nicholas Rosenberg, Alviro Iskandar Setiawan,
    Michael William Jonathan, GNU/Weeb Mailing List,
    Linux Kernel Mailing List
Subject: Re: [RFC PATCH v1 3/5] tools/nolibc: x86-64: Use `rep cmpsb` for `memcmp()`

On Fri, Sep 01, 2023 at 10:24:42AM +0700, Ammar Faizi wrote:
> On Wed, Aug 30, 2023 at 11:26:57PM +0200, Willy Tarreau wrote:
> > Out of curiosity, given that you implemented the 3 other ones directly
> > in an asm statement, is there a particular reason this one mixes a bit
> > of C and asm?
>
> Because this one may be unused. The others are explicitly exported.

Makes sense, indeed.

> > It would probably be something around this, in the same vein:
> >
> >   memcmp:
> >     xchg  %esi,%eax   // source1
> >     mov   %rdx,%rcx   // count
> >     rep   cmpsb       // source2 in rdi; sets ZF on equal, CF if src1 < src2
> >     seta  %al         // 0 if src2 <= src1, 1 if src2 > src1
> >     sbb   $0, %al     // 0 if src2 == src1, -1 if src2 < src1, 1 if src2 > src1
> >     movsx %al, %eax   // sign extend to %eax
> >     ret
> >
> > Note that the output logic could have to be revisited, I'm not certain but
> > at first glance it looks valid.
>
> After thinking about this more, I think I'll drop the memcmp() patch
> because it will prevent optimization when comparing a small value.
>
> For example, without __asm__:
>
>     memcmp(var, "abcd", 4);
>
> may compile to:
>
>     cmpl    $0x64636261, %reg
>     ...something...
>
> But with __asm__, the compiler can't do that. Thus, it's not worth
> optimizing the memcmp() in this case.

Ah you're totally right!

Willy
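
For illustration, here is a minimal sketch of the point above, not code from the patch series. The names c_memcmp() and is_abcd() are hypothetical and chosen only for this example. Because the comparison is written in plain C, the compiler can see through it and, depending on the compiler and optimization level, may reduce the constant-length call to a single 32-bit compare. An __asm__ body is opaque to it.

```c
#include <stddef.h>

/*
 * Hypothetical plain-C byte comparison. Named c_memcmp (an assumption
 * for this example) so that it does not clash with the libc memcmp().
 */
static int c_memcmp(const void *s1, const void *s2, size_t n)
{
	const unsigned char *a = s1;
	const unsigned char *b = s2;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Stop at the first differing byte and return its signed difference. */
		if (a[i] != b[i])
			return (int)a[i] - (int)b[i];
	}
	return 0;
}

/*
 * Hypothetical caller: the length and the second operand are both
 * compile-time constants, so an optimizing compiler is free to inline
 * the loop and fold it into one comparison against 0x64636261 on a
 * little-endian target. Whether it actually does so is up to the compiler.
 */
static int is_abcd(const char *var)
{
	return c_memcmp(var, "abcd", 4) == 0;
}
```

Building something like this with optimization enabled and inspecting the generated assembly is one way to check what a given compiler does with the call.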