Date: Tue, 25 Oct 2022 12:03:40 +0300
From: Andy Shevchenko
To: Nathan Moinvaziri
Cc: "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 25, 2022 at 11:00:36AM +0300, Andy Shevchenko wrote:
> On Tue, Oct 25, 2022 at 4:46 AM Nathan Moinvaziri wrote:

...

> > When running tests using Quick Benchmark with two matching 256 character
> > strings these changes result in anywhere between ~6-9x speed improvement.
> >
> > * We use unsigned char instead of int similar to strncasecmp.
> > * We only subtract c1 - c2 when they are not equal.

...

> You tell us that this is more preformant, but have not provided the
> numbers. Can we see those, please?

So, I have now read the message carefully and see the reference to some
Quick Benchmark that I have no idea about. What I meant is numbers produced
by an open-source tool (maybe even an in-kernel test case) that anybody can
run on their own machine. You also left out the details of how you ran the
test, which data set was used, etc.

> Note, that you basically trash CPU cache lines when characters are not
> equal, and before doing that you have a branching. I'm unsure that
> your way is more performant than the original one.

-- 
With Best Regards,
Andy Shevchenko
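
For illustration only, here is a minimal userspace sketch of the kind of measurement anybody could rerun on their own machine. It is not the patch under review and not an in-kernel test case. The function names, the timing helper and the BENCH_MAIN switch are all hypothetical; the baseline is only modelled on the shape of the lib/string.c loop, and the variant follows the two bullet points quoted above (unsigned char, lower only on mismatch).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Baseline (assumed shape): lower both characters on every iteration. */
static int strcasecmp_always_lower(const char *s1, const char *s2)
{
	int c1, c2;

	do {
		c1 = tolower((unsigned char)*s1++);
		c2 = tolower((unsigned char)*s2++);
	} while (c1 == c2 && c1 != 0);
	return c1 - c2;
}

/* Variant (assumed shape): compare raw bytes, lower only when they differ. */
static int strcasecmp_lower_on_mismatch(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		if (c1 != c2) {
			c1 = (unsigned char)tolower(c1);
			c2 = (unsigned char)tolower(c2);
			if (c1 != c2)
				return (int)c1 - (int)c2;
		}
	} while (c1);
	return 0;
}

typedef int (*cmp_fn)(const char *, const char *);

/*
 * Hypothetical timing helper: CPU seconds for `iters` calls of `fn`.
 * The volatile pointers and sink keep the compiler from hoisting the
 * call out of the loop; clock() resolution is coarse, so use many
 * iterations and repeat the run several times.
 */
static double seconds_for(cmp_fn fn, const char *a, const char *b, long iters)
{
	const char *volatile pa = a;
	const char *volatile pb = b;
	volatile int sink = 0;
	clock_t start = clock();

	for (long i = 0; i < iters; i++)
		sink = fn(pa, pb);
	(void)sink;
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}

#ifdef BENCH_MAIN
int main(void)
{
	char same[257], upper[257];
	const long iters = 5000000;

	/* Data set: 256 characters, once identical and once differing only in case. */
	memset(same, 'a', 256);
	same[256] = '\0';
	memset(upper, 'A', 256);
	upper[256] = '\0';

	printf("identical, always lower:      %.3f s\n",
	       seconds_for(strcasecmp_always_lower, same, same, iters));
	printf("identical, lower on mismatch: %.3f s\n",
	       seconds_for(strcasecmp_lower_on_mismatch, same, same, iters));
	printf("case differs, always lower:      %.3f s\n",
	       seconds_for(strcasecmp_always_lower, same, upper, iters));
	printf("case differs, lower on mismatch: %.3f s\n",
	       seconds_for(strcasecmp_lower_on_mismatch, same, upper, iters));
	return 0;
}
#endif
```

Built with something like `cc -O2 -DBENCH_MAIN bench.c`, this prints four timings. The second pair is there on purpose: two identical strings are the best case for the variant, while strings that differ only in case take the extra branch on every character, which is the scenario the cache and branching concern above is about. Any numbers reported from such a run should come with the compiler, flags, CPU and data set used.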