From: Thomas Monjalon <thomas@monjalon.net>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: dev@dpdk.org
Subject: Re: [PATCH 01/19] devtools: add simple script to find duplicate includes
Date: Fri, 14 Jul 2017 17:54:18 +0200
Message-ID: <62513523.FtO1afHum7@xps>
In-Reply-To: <12511927.qublii9NrP@xps>

14/07/2017 17:39, Thomas Monjalon:
> 13/07/2017 08:56, Thomas Monjalon:
> > 12/07/2017 23:59, Stephen Hemminger:
> > > On Tue, 11 Jul 2017 22:33:55 +0200
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > Thank you for this script, but... it is written in Perl!
> > > > I don't think it is a good idea to add yet another language to DPDK.
> > > > We already have shell and python scripts.
> > > > And I am not sure a lot of (young) people are able to parse it ;)
> > > > 
> > > > I would like to propose this shell script:
> [...]
> > 
> > > plus shell is 7x slower.
> > > 
> > > $ time bash -c "find . -name '*.c' | xargs /tmp/dupinc.sh"
> > > real	0m0.765s
> > > user	0m1.220s
> > > sys	0m0.155s
> > > $ time bash -c "find . -name '*.c' | xargs ~/bin/dup_inc.pl"
> > > real	0m0.131s
> > > user	0m0.118s
> > > sys	0m0.014s
> > 
> > I don't think speed is really relevant here :)
> 
> I did my own benchmark (recreation time):
> 
> # time sh -c 'for file in $(git ls-files app buildtools drivers examples lib test) ; do devtools/dup_include.pl $file ; done'
> 4,41s user 1,32s system 101% cpu 5,667 total
> # time devtools/check-duplicate-includes.sh
> 5,48s user 1,00s system 153% cpu 4,222 total
> 
> The shell version is reported as faster on my computer!
> 
> It is faster when filtering only .c and .h files:
> 
> for file in $(git ls-files '*.[ch]') ; do
>     dups=$(sed -rn "s,$pattern,\1,p" $file | sort | uniq -d)
>     [ -z "$dups" ] || echo "$dups" | sed "s,^,$file: duplicated include: ,"
> done
> 
> # time sh -c 'for file in $(git ls-files "*.[ch]") ; do devtools/dup_include.pl $file ; done'
> 3,65s user 1,05s system 100% cpu 4,668 total
> # time devtools/check-duplicate-includes.sh
> 4,72s user 0,80s system 153% cpu 3,603 total
> 
> I prefer this version using only pipes, which is well parallelized:
> 
> for file in $(git ls-files '*.[ch]') ; do
>     sed -rn "s,$pattern,\1,p" $file | sort | uniq -d |
>     sed "s,^,$file: duplicated include: ,"
> done
> 
> 7,40s user 1,49s system 231% cpu 3,847 total

And now, the big shell optimization:
	export LC_ALL=C
The result is impressive:
	2,99s user 0,72s system 258% cpu 1,436 total

I'm sure you will agree to integrate my version now :)
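
For reference, the pipe-only loop and the LC_ALL=C setting can be combined into
one self-contained sketch. The value of $pattern is elided in this thread, so
the regex below is only an assumption about what it might look like, and
dup_includes is a hypothetical helper name, not the script that was submitted.

```shell
#!/bin/sh
# Sketch: report headers that a C source file includes more than once.

# Byte-wise collation avoids locale-aware comparisons in sort and uniq.
export LC_ALL=C

# ASSUMPTION: the thread does not show $pattern. This regex captures the
# header name from lines such as '#include <x.h>' or '#include "x.h"'.
pattern='^[[:space:]]*#[[:space:]]*include[[:space:]]*[<"]([^>"]*)[>"].*'

# Hypothetical helper: print one line per duplicated include in each file.
# Uses sed -E instead of the thread's -r so it also runs on BSD sed.
# File names containing ',' or '&' would break the last sed expression.
dup_includes() {
    for file in "$@"; do
        sed -En "s,$pattern,\1,p" "$file" | sort | uniq -d |
        sed "s,^,$file: duplicated include: ,"
    done
}
```

Inside a git checkout it could be called as dup_includes $(git ls-files '*.[ch]'),
which matches the file selection used in the loops quoted above.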

