* dash performance regression with [ in latest github code
@ 2025-02-25 15:04 Jan Pechanec
2025-02-25 15:13 ` Jan Pechanec
2025-03-09 9:42 ` [PATCH] expand: Add bypass for literal "]" in expandmeta Herbert Xu
0 siblings, 2 replies; 4+ messages in thread
From: Jan Pechanec @ 2025-02-25 15:04 UTC
To: dash
Hi,
thank you for working on dash. I was testing it recently and it worked
really well.
However, I noticed that the dash code from github performs filename
pattern matching even for code like "[ x = x ] && echo ok". I believe a
'[' followed by an unquoted space should not trigger pattern matching
but should only invoke the test/[ utility, as before. The result still
seems to be correct; dash only does some extra, unneeded work, which may
not be immediately noticeable.
dash installed on my Oracle Linux 9:
janp:len49:~/_INST/dash$ strings /usr/bin/dash | grep dash
dash-0.5.11.5-4.el9.x86_64.debug
janp:len49:~/_INST/dash$ time dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real 0m0.752s
user 0m0.748s
sys 0m0.002s
dash from github (commit b3e38adf6718801e7f06267b438c45caec9523bb) takes
far more time to do the same thing:
janp:len49:~/_INST/dash$ time ./src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real 0m4.202s
user 0m1.361s
sys 0m2.804s
For the latter, strace shows open, fstat, getdents*, and close system
calls on every iteration, and the cost depends on the number of files in
the current directory. With more files, it takes more time:
janp:len49:/etc$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real 0m15.591s
user 0m5.704s
sys 0m9.828s
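To illustrate why the time grows with the number of files, here is a
minimal C sketch of a full directory scan, which is roughly what a
pathname expansion attempt has to do once per loop iteration. The name
count_dir_entries is hypothetical and this is not dash's expansion code;
the syscall notes in the comments assume glibc on Linux.

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper, not part of dash: count the entries in a
 * directory by reading every name, as a glob scan must.
 * The count includes "." and ".." where the filesystem reports them.
 * Returns -1 if the directory cannot be opened. */
static long count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);      /* assumed: an open plus an fstat */
    long count = 0;

    if (dir == NULL)
        return -1;
    while (readdir(dir) != NULL)   /* assumed: backed by getdents* */
        count++;
    closedir(dir);                 /* close */
    return count;
}
```

Calling count_dir_entries(".") 500000 times in /etc does far more work
than calling it in a nearly empty build directory, which matches the
difference between the two timings above.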
If I change [ to test, the dash github version behaves as before, and is
possibly even faster:
janp:len49:~/_INST/dash$ time ~/_INST/dash/src/dash -c 'i=0; while :; do : $((i=i+1)); test $i -eq 500000 && break; done'
real 0m0.662s
user 0m0.659s
sys 0m0.002s
Even bash is faster than the current github version of dash:
janp:len49:~/_INST/dash$ time bash -c 'i=0; while :; do : $((i=i+1)); [ $i -eq 500000 ] && break; done'
real 0m1.943s
user 0m1.939s
sys 0m0.002s
Unfortunately, I do not have time to work on a patch.
Best regards,
Jan
--
Jan Pechanec <jan.pechanec@oracle.com>
* Re: dash performance regression with [ in latest github code
From: Jan Pechanec @ 2025-02-25 15:13 UTC
To: dash
My apologies: by "github code" I really meant the official git repo as
specified on http://gondor.apana.org.au/~herbert/dash/:
https://git.kernel.org/pub/scm/utils/dash/dash.git
Cheers,
Jan
--
Jan Pechanec <jan.pechanec@oracle.com>
* [PATCH] expand: Add bypass for literal "]" in expandmeta
From: Herbert Xu @ 2025-03-09 9:42 UTC
To: Jan Pechanec; +Cc: dash
Jan Pechanec <Jan.Pechanec@oracle.com> wrote:
>
> thank you for working on dash. I was testing it recently and it worked
> really well.
>
> However, I noticed that the dash code from github performs filename
> pattern matching even for code like "[ x = x ] && echo ok". I believe a
> '[' followed by an unquoted space should not trigger pattern matching
> but should only invoke the test/[ utility, as before. The result still
> seems to be correct; dash only does some extra, unneeded work, which may
> not be immediately noticeable.
Fix performance regression for idiomatic "[ ... ]" expression by
adding a bypass for a literal "]" in pathname expansion.
Reported-by: Jan Pechanec <Jan.Pechanec@oracle.com>
Fixes: 8d0eca2d9fb5 ("expand: Rewrite expmeta meta detection")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/src/expand.c b/src/expand.c
index 7a30648..5114646 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -1555,7 +1555,7 @@ expandmeta(struct strlist *str)
 		if (fflag)
 			goto nometa;
-		if (!strpbrk(str->text, "*?]"))
+		if (!strpbrk(str->text, "*?]") || !memcmp(str->text, "]", 2))
 			goto nometa;
 		savelastp = exparg.lastp;
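
As a rough illustration of the check in the patch, the condition can be
tried in isolation. This is only a sketch on plain C strings: the
function name needs_pathname_expansion is hypothetical, and dash's real
expandmeta works on str->text inside a larger loop rather than through
such a helper.

```c
#include <string.h>

/* Hypothetical helper, not a function in dash: returns 1 if the word
 * would still go through pathname expansion, 0 if it is skipped. */
static int needs_pathname_expansion(const char *text)
{
    /* No '*', '?' or ']' anywhere: nothing to expand. */
    if (!strpbrk(text, "*?]"))
        return 0;
    /* Compare two bytes, ']' and the terminating NUL, so only the
     * exact word "]" matches. Reading two bytes is safe here because
     * strpbrk() found a character, so text holds at least one
     * character followed by the NUL. */
    if (!memcmp(text, "]", 2))
        return 0;
    return 1;
}
```

With this sketch, the words "[" and "]" from "[ x = x ]" both return 0,
while words such as "*.c", "a]" or "[ab]" still return 1 and are
expanded as before.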
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [External] : [PATCH] expand: Add bypass for literal "]" in expandmeta
From: Jan Pechanec @ 2025-03-10 8:35 UTC
To: Herbert Xu; +Cc: dash
...
> Fix performance regression for idiomatic "[ ... ]" expression by
> adding a bypass for a literal "]" in pathname expansion.
Hi Herbert, thank you, this seems to fix the reported regression. I
just applied the patch and successfully re-tested.
Regards,
Jan
--
Jan Pechanec <jan.pechanec@oracle.com>