From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755478AbbDONYf (ORCPT ); Wed, 15 Apr 2015 09:24:35 -0400
Received: from cantor2.suse.de ([195.135.220.15]:51867 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753659AbbDONY2 (ORCPT ); Wed, 15 Apr 2015 09:24:28 -0400
Message-ID: <552E668A.7090707@suse.cz>
Date: Wed, 15 Apr 2015 15:24:26 +0200
From: Michal Marek
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Alexey Dobriyan
CC: akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] tags: much faster, parallel "make tags"
References: <20150414172047.GA5641@p183.telecom.by>
In-Reply-To: <20150414172047.GA5641@p183.telecom.by>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015-04-14 19:20, Alexey Dobriyan wrote:
> ctags is single-threaded program. Split list of files to be tagged into
> equal parts, 1 part for each CPU and then merge the results.
>
> Speedup on one 2-way box I have is ~143 s => ~99 s (-31%).
> On another 4-way box: ~120 s => ~65 s (-46%!).
>
> Resulting "tags" files aren't byte-for-byte identical because ctags
> program numbers anon struct and enum declarations with "__anonNNN"
> symbols. If those lines are removed, "tags" file becomes byte-for-byte
> identical with those generated with current code.
>
> Signed-off-by: Alexey Dobriyan
> ---
>
>  scripts/tags.sh | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
>
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -152,7 +152,24 @@ dogtags()
>
>  exuberant()
>  {
> -	all_target_sources | xargs $1 -a \
> +	NR_CPUS=1
> +	if [ -e /proc/cpuinfo ]; then
> +		NR_CPUS=$(grep -e '^processor : ' /proc/cpuinfo | wc -l)
> +	fi

I wonder if we should rather respect the -j option to make here. But
then most people probably won't realize that make tags is parallel and
will not use -j when generating tags. So let's leave it as is.

> +
> +	rm -f .make-tags.src.* .make-tags.*

.make-tags.src.* is a subset of .make-tags.*

> +
> +	all_target_sources >.make-tags.src
> +	# seems like Useless Use of cat(1) but not really
> +	NR_LINES=$(cat .make-tags.src | wc -l)
> +	NR_LINES=$((($NR_LINES + $NR_CPUS - 1) / $NR_CPUS))
> +
> +	split -a 6 -d -l $NR_LINES .make-tags.src .make-tags.src.
> +
> +	for i in .make-tags.src.*; do
> +		N=$(echo $i | sed -e 's/.*\.//')
> +		# -u: don't sort now, sort later
> +		cat $i | xargs $1 -a -f .make-tags.$N -u \

xargs <$i $1 ... if you are concerned about uses of cat(1) ;)
and the -a option is not necessary since we are creating the tmp files.

> +	# write header
> +	$1 -f tags /dev/null
> +	# remove header
> +	for i in .make-tags.*; do
> +		sed -i -e '/^!/d' $i
> +	done
> +	sort .make-tags.* >>tags

The hardcoded "tags" filename will break 'make TAGS' when using
exuberant ctags via an 'etags' symlink.

Michal
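
For illustration only, here is a minimal sketch of the split-and-merge
pattern with the comments above applied: input redirection instead of
cat(1), no append mode for freshly created temporary files, and the
output name passed in rather than hardcoded. The function name
split_and_merge, its three parameters, and the use of a mktemp
directory are assumptions made for this sketch, not part of the patch.
printf stands in for the real "$1 -f <file> -u" ctags invocation so
that the sketch runs without ctags, and running the chunks as
background jobs followed by wait is likewise an assumption, since the
quoted hunk is cut before that point.

```shell
#!/bin/sh
# Sketch only: names and parameters below are hypothetical.
#   $1 = file holding one source path per line
#   $2 = number of chunks (e.g. the CPU count)
#   $3 = output file name (passed in, not hardcoded to "tags")
split_and_merge() {
	list=$1
	nr_cpus=$2
	out=$3

	nr_lines=$(wc -l <"$list")
	if [ "$nr_lines" -eq 0 ]; then
		# Nothing to split: produce an empty result.
		: >"$out"
		return 0
	fi

	tmp=$(mktemp -d) || return 1

	# Ceiling division, so that at most nr_cpus chunks are created.
	per_chunk=$(( (nr_lines + nr_cpus - 1) / nr_cpus ))

	split -l "$per_chunk" "$list" "$tmp/src."

	for chunk in "$tmp"/src.*; do
		n=${chunk##*.}
		# Redirect instead of "cat $chunk | xargs ...".
		# printf is a stand-in for the per-chunk ctags run; each
		# worker writes its own new file, so nothing is appended.
		xargs <"$chunk" printf '%s\n' >"$tmp/out.$n" &
	done
	wait

	# Merge the unsorted per-chunk results into one sorted file.
	sort "$tmp"/out.* >"$out"
	rm -rf "$tmp"
}
```

Called as "split_and_merge files.list 4 tags", this writes the merged,
sorted result to whichever name the caller chooses, so the same code
path could serve both 'make tags' and 'make TAGS'.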