dash.vger.kernel.org archive mirror
* dash performance regression with [ in latest github code
@ 2025-02-25 15:04 Jan Pechanec
  2025-02-25 15:13 ` Jan Pechanec
  2025-03-09  9:42 ` [PATCH] expand: Add bypass for literal "]" in expandmeta Herbert Xu
  0 siblings, 2 replies; 4+ messages in thread
From: Jan Pechanec @ 2025-02-25 15:04 UTC
  To: dash

Hi,

thank you for working on dash.  I was testing it recently and it worked
really well.

However, I noticed that the dash code from github performs filename
pattern matching even for code like "[ x = x ] && echo ok".  I believe a
'[' followed by an unquoted space should not trigger pattern matching
but should only invoke the test/[ utility, as before.  The result is
still correct, though; dash only does some extra, unneeded work, which
may not be immediately noticeable.

dash installed on my Oracle Linux 9:

janp:len49:~/_INST/dash$ strings /usr/bin/dash | grep dash
dash-0.5.11.5-4.el9.x86_64.debug
janp:len49:~/_INST/dash$ time dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'

real    0m0.752s
user    0m0.748s
sys     0m0.002s

dash from github (commit b3e38adf6718801e7f06267b438c45caec9523bb) takes
far more time to do the same thing:

janp:len49:~/_INST/dash$ time ./src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'

real    0m4.202s
user    0m1.361s
sys     0m2.804s

For the latter, strace shows open, fstat, getdents*, and close system
calls on every iteration, and the run time depends on the number of
files in the current directory.  With more files, it takes longer:

janp:len49:/etc$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real    0m15.591s
user    0m5.704s
sys     0m9.828s
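
The system calls seen in strace are what a single directory scan looks
like from user space.  The following is a minimal sketch, assuming a
POSIX system; count_dir_entries() is a hypothetical helper written for
illustration and is not dash code.  Repeating such a scan once per loop
iteration would make the cost grow with both the iteration count and the
number of directory entries, which is consistent with the timings above.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Hypothetical helper, not dash code: walk a directory once and count
 * its entries (including "." and ".." where the filesystem reports
 * them).  Returns -1 if the directory cannot be opened.
 */
long count_dir_entries(const char *path)
{
	DIR *dir;
	long count = 0;

	assert(path != NULL);

	/* Opens the directory; on Linux this typically shows up as an
	 * open/openat call followed by an fstat-style call. */
	dir = opendir(path);
	if (dir == NULL)
		return -1;

	/* Each readdir() returns one entry; the C library fetches the
	 * entries from the kernel in batches via getdents*. */
	while (readdir(dir) != NULL)
		count++;

	/* Releases the descriptor (close). */
	closedir(dir);
	return count;
}
```

Calling count_dir_entries(".") in a directory with many files does
proportionally more work than in a nearly empty one.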

If I change [ to test, the dash github version performs as before, and
is possibly even faster:

janp:len49:~/_INST/dash$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); test $i -eq 500000 && break; done'

real    0m0.662s
user    0m0.659s
sys     0m0.002s

Even bash is faster than the current github version of dash:

janp:len49:~/_INST/dash$ time bash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real    0m1.943s
user    0m1.939s
sys     0m0.002s

Unfortunately, I do not have time to work on a patch.

Best regards,
Jan

-- 
Jan Pechanec <jan.pechanec@oracle.com>


* Re: dash performance regression with [ in latest github code
  2025-02-25 15:04 dash performance regression with [ in latest github code Jan Pechanec
@ 2025-02-25 15:13 ` Jan Pechanec
  2025-03-09  9:42 ` [PATCH] expand: Add bypass for literal "]" in expandmeta Herbert Xu
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Pechanec @ 2025-02-25 15:13 UTC
  To: dash

My apologies, by "github code" I really meant the official git repo as
specified on http://gondor.apana.org.au/~herbert/dash/:

  https://git.kernel.org/pub/scm/utils/dash/dash.git

Cheers,
Jan

-- 
Jan Pechanec <jan.pechanec@oracle.com>


* [PATCH] expand: Add bypass for literal "]" in expandmeta
  2025-02-25 15:04 dash performance regression with [ in latest github code Jan Pechanec
  2025-02-25 15:13 ` Jan Pechanec
@ 2025-03-09  9:42 ` Herbert Xu
  2025-03-10  8:35   ` [External] : " Jan Pechanec
  1 sibling, 1 reply; 4+ messages in thread
From: Herbert Xu @ 2025-03-09  9:42 UTC
  To: Jan Pechanec; +Cc: dash

Fix performance regression for idiomatic "[ ... ]" expression by
adding a bypass for a literal "]" in pathname expansion.

Reported-by: Jan Pechanec <Jan.Pechanec@oracle.com>
Fixes: 8d0eca2d9fb5 ("expand: Rewrite expmeta meta detection")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/src/expand.c b/src/expand.c
index 7a30648..5114646 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -1555,7 +1555,7 @@ expandmeta(struct strlist *str)
 
 		if (fflag)
 			goto nometa;
-		if (!strpbrk(str->text, "*?]"))
+		if (!strpbrk(str->text, "*?]") || !memcmp(str->text, "]", 2))
 			goto nometa;
 		savelastp = exparg.lastp;
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
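
The condition changed by the patch can be tried out in isolation.  The
sketch below copies only the two tests from the diff into a stand-alone
function; word_needs_expansion() is a hypothetical name used for
illustration, and everything else in expandmeta() is left out.  In
"[ $i -eq 500000 ]" the closing "]" is a word of its own and contains a
character from "*?]", so before the patch it passed the strpbrk() test;
the added memcmp() sends exactly that word to the no-expansion path.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of dash: returns true when a word
 * would still reach the pathname-expansion code after the patch.
 */
bool word_needs_expansion(const char *text)
{
	assert(text != NULL);

	/* No '*', '?' or ']' anywhere: the word cannot be a pattern. */
	if (!strpbrk(text, "*?]"))
		return false;

	/*
	 * Comparing two bytes checks both the ']' and the terminating
	 * NUL, so this matches only the exact word "]".  strpbrk()
	 * succeeded above, so text holds at least one character plus
	 * its terminator and reading two bytes is safe.
	 */
	if (!memcmp(text, "]", 2))
		return false;

	return true;
}
```

With this helper, "]" and "-eq" are skipped, while words such as "]x",
"*.c" and "[ab]" are still treated as possible patterns.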


* Re: [External] : [PATCH] expand: Add bypass for literal "]" in expandmeta
  2025-03-09  9:42 ` [PATCH] expand: Add bypass for literal "]" in expandmeta Herbert Xu
@ 2025-03-10  8:35   ` Jan Pechanec
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Pechanec @ 2025-03-10  8:35 UTC
  To: Herbert Xu; +Cc: dash

Hi Herbert, thank you, this seems to fix the reported regression.  I
just applied the patch and successfully re-tested.

Regards,
Jan

-- 
Jan Pechanec <jan.pechanec@oracle.com>
