From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S934886AbcA1WXC (ORCPT ); Thu, 28 Jan 2016 17:23:02 -0500
Received: from smtprelay0214.hostedemail.com ([216.40.44.214]:35285
        "EHLO smtprelay.hostedemail.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S934719AbcA1WXB (ORCPT );
        Thu, 28 Jan 2016 17:23:01 -0500
Message-ID: <1454019777.10099.56.camel@perches.com>
Subject: Re: [PATCH] Optimize int_sqrt for small values for faster idle
From: Joe Perches
To: Andi Kleen, akpm@linux-foundation.org, Anshul Garg
Cc: linux-kernel@vger.kernel.org, rafael.j.wysocki@intel.com,
        lenb@kernel.org, Andi Kleen, Davidlohr Bueso
Date: Thu, 28 Jan 2016 14:22:57 -0800
In-Reply-To: <1454017365-8509-1-git-send-email-andi@firstfloor.org>
References: <1454017365-8509-1-git-send-email-andi@firstfloor.org>
Content-Type: text/plain; charset="ISO-8859-1"
X-Mailer: Evolution 3.18.3-1ubuntu1
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

(resending with email addresses that shouldn't bounce)
(adding Anshul Garg)
(fixed Davidlohr Bueso's address)

On Thu, 2016-01-28 at 13:42 -0800, Andi Kleen wrote:
> From: Andi Kleen
>
> The menu cpuidle governor does at least two int_sqrt() calls each time
> we go into idle, in get_typical_interval(), to compute the stddev.
>
> Each int_sqrt() takes 100-120 cycles. Short idle latency is important
> for many workloads.
>
> I instrumented the function on my workstation and most values are
> 16bit only and most others 32bit (50th percentile is 122094,
> 75th is 3699533).
>
> sqrt is implemented by starting with an initial estimate,
> and then iterating. int_sqrt currently only uses a fixed
> estimate which is good for 64 bits' worth of input.
>
> This patch adds some checks at the beginning to start with
> a better estimate for values fitting in 8, 16 and 32 bits.
> This makes int_sqrt 60+% faster for values in 16bit,
> and still somewhat faster (between 10 and 30%) for larger values
> up to 32bit. Full 64bit is slightly slower.
>
> This optimizes the short idle calls and does not hurt the
> long sleeps (which probably do not care) much.
>
> An alternative would be a fully table-driven approach, or
> trying some inverse sqrt optimization, but this simple change
> already seems to have a good payoff.

This thread might be relevant:

https://lkml.org/lkml/2015/2/2/600

and perhaps using fls() might still be a good approach.
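
To make the fls() suggestion concrete, here is a rough userspace sketch. It
is not the code from either thread. It assumes the usual shift-and-subtract
int_sqrt loop and only changes where the loop starts: instead of a fixed
1 << 62, the first trial bit is the largest power of four that does not
exceed x, found from the position of the highest set bit. The names
sketch_top_bit() and sketch_int_sqrt() are made up for this example, and
__builtin_clzll() (a GCC/Clang builtin) stands in for the kernel's fls
helpers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 0-based index of the highest set bit.
 * x must be non-zero. Stands in for the kernel's fls-style helpers.
 */
static inline unsigned int sketch_top_bit(uint64_t x)
{
	return 63u - (unsigned int)__builtin_clzll(x);
}

/* Hypothetical name: integer square root, rounded down. */
static uint64_t sketch_int_sqrt(uint64_t x)
{
	uint64_t b, m, y = 0;

	if (x <= 1)
		return x;

	/*
	 * Start at the largest power of four <= x (top bit index rounded
	 * down to an even number) rather than at a fixed 1 << 62.
	 */
	m = 1ULL << (sketch_top_bit(x) & ~1u);

	/* Unchanged shift-and-subtract loop, two input bits per step. */
	while (m != 0) {
		b = y + m;
		y >>= 1;
		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}
	return y;
}
```

With this starting point the loop count scales with the size of the input
rather than with the word size: a 16-bit value needs at most 8 iterations
instead of 32, without a chain of range checks up front. Whether that beats
the explicit 8/16/32-bit checks in the patch would have to be measured.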