From: Felipe Contreras <felipe.contreras@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Jiang Xin" <worldhello.net@gmail.com>
Cc: "Jiang Xin" <zhiyou.jx@alibaba-inc.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Git List" <git@vger.kernel.org>,
"Đoàn Trần Công Danh" <congdanhqx@gmail.com>,
"Jonathan Nieder" <jrnieder@gmail.com>
Subject: Re: Runaway sed memory use in test on older sed+glibc (was "Re: [PATCH v6 1/3] test: add helper functions for git-bundle")
Date: Thu, 27 May 2021 14:19:00 -0500
Message-ID: <60aff0a416f40_265302082c@natae.notmuch>
In-Reply-To: <87tumol4tg.fsf@evledraar.gmail.com>
Ævar Arnfjörð Bjarmason wrote:
>
> On Thu, May 27 2021, Jiang Xin wrote:
>
> > On Thu, May 27, 2021 at 2:51 AM, Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >>
> >> On Mon, Jan 11 2021, Jiang Xin wrote:
> >>
> >> > From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >> >
> >> > Move git-bundle related functions from t5510 to a library, and this lib
> >> > will be shared with a new testcase t6020 which finds a known breakage of
> >> > "git-bundle".
> >> > [...]
> >> > +
> >> > +# Format the output of git commands to make a user-friendly and stable
> >> > +# text. We can easily prepare the expect text without having to worry
> >> > +# about future changes of the commit ID and spaces of the output.
> >> > +make_user_friendly_and_stable_output () {
> >> > + sed \
> >> > + -e "s/${A%${A#???????}}[0-9a-f]*/<COMMIT-A>/g" \
> >> > + -e "s/${B%${B#???????}}[0-9a-f]*/<COMMIT-B>/g" \
> >> > + -e "s/${C%${C#???????}}[0-9a-f]*/<COMMIT-C>/g" \
> >> > + -e "s/${D%${D#???????}}[0-9a-f]*/<COMMIT-D>/g" \
> >> > + -e "s/${E%${E#???????}}[0-9a-f]*/<COMMIT-E>/g" \
> >> > + -e "s/${F%${F#???????}}[0-9a-f]*/<COMMIT-F>/g" \
> >> > + -e "s/${G%${G#???????}}[0-9a-f]*/<COMMIT-G>/g" \
> >> > + -e "s/${H%${H#???????}}[0-9a-f]*/<COMMIT-H>/g" \
> >> > + -e "s/${I%${I#???????}}[0-9a-f]*/<COMMIT-I>/g" \
> >> > + -e "s/${J%${J#???????}}[0-9a-f]*/<COMMIT-J>/g" \
> >> > + -e "s/${K%${K#???????}}[0-9a-f]*/<COMMIT-K>/g" \
> >> > + -e "s/${L%${L#???????}}[0-9a-f]*/<COMMIT-L>/g" \
> >> > + -e "s/${M%${M#???????}}[0-9a-f]*/<COMMIT-M>/g" \
> >> > + -e "s/${N%${N#???????}}[0-9a-f]*/<COMMIT-N>/g" \
> >> > + -e "s/${O%${O#???????}}[0-9a-f]*/<COMMIT-O>/g" \
> >> > + -e "s/${P%${P#???????}}[0-9a-f]*/<COMMIT-P>/g" \
> >> > + -e "s/${TAG1%${TAG1#???????}}[0-9a-f]*/<TAG-1>/g" \
> >> > + -e "s/${TAG2%${TAG2#???????}}[0-9a-f]*/<TAG-2>/g" \
> >> > + -e "s/${TAG3%${TAG3#???????}}[0-9a-f]*/<TAG-3>/g" \
> >> > + -e "s/ *\$//"
> >> > +}
> >>
> >> On one of the gcc farm boxes, an i386 box (gcc45), this fails because sed
> >> gets killed after >500MB of memory use (I was just eyeballing it in
> >> htop) on the "create bundle from special rev: main^!" test. This is with
> >> GNU sed 4.2.2.
> >>
> >> I suspect this regex pattern creates some runaway behavior in sed that's
> >> since been fixed (or maybe it's the glibc regex engine?). The glibc is
> >> 2.19-18+deb8u10:
> >>
> >> + git bundle list-heads special-rev.bdl
> >> + make_user_friendly_and_stable_output
> >> + sed -e s/[0-9a-f]*/<COMMIT-A>/g -e s/[0-9a-f]*/<COMMIT-B>/g -e
> >> s/[0-9a-f]*/<COMMIT-C>/g -e s/[0-9a-f]*/<COMMIT-D>/g -e
> >> s/[0-9a-f]*/<COMMIT-E>/g -e s/[0-9a-f]*/<COMMIT-F>/g -e
> >> s/[0-9a-f]*/<COMMIT-G>/g -e s/[0-9a-f]*/<COMMIT-H>/g -e
> >> s/[0-9a-f]*/<COMMIT-I>/g -e s/[0-9a-f]*/<COMMIT-J>/g -e
> >> s/[0-9a-f]*/<COMMIT-K>/g -e s/[0-9a-f]*/<COMMIT-L>/g -e
> >> s/[0-9a-f]*/<COMMIT-M>/g -e s/[0-9a-f]*/<COMMIT-N>/g -e
> >> s/[0-9a-f]*/<COMMIT-O>/g -e s/[0-9a-f]*/<COMMIT-P>/g -e
> >> s/[0-9a-f]*/<TAG-1>/g -e s/[0-9a-f]*/<TAG-2>/g -e
> >> s/[0-9a-f]*/<TAG-3>/g -e s/ *$//
> >> sed: couldn't re-allocate memory
> >
> > I wrote a program on macOS to check memory footprint for sed and perl.
> > See:
> >
> > https://github.com/jiangxin/compare-sed-perl
>
> Interesting use of Go as a /usr/bin/time -v replacement :)
Here's a Ruby version:
https://dpaste.com/FYT2QKHJE
I'm not sure if it will be useful in this particular case, but Ruby code
always ends up simpler ;)
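For what it's worth, the trace quoted above suggests the
${A%${A#???????}} prefixes expanded to nothing on that box, leaving
s/[0-9a-f]*/<COMMIT-A>/g, a pattern that can also match the empty string.
A minimal sketch of the difference, assuming a made-up object ID and using
cut to take the seven-character prefix (both are illustrative choices, not
what the patch does):

```shell
# Hypothetical object ID standing in for $A in the test helper.
A=1234567890abcdef1234567890abcdef12345678

# Seven-character prefix taken with cut rather than nested parameter
# expansion (an assumption for this sketch, not the patch's approach).
prefix=$(printf '%s' "$A" | cut -c 1-7)

line="$A refs/heads/main"

# With a real prefix, only the object ID is rewritten.
good=$(printf '%s\n' "$line" | sed -e "s/${prefix}[0-9a-f]*/<COMMIT-A>/g")

# With an empty prefix, as in the trace, [0-9a-f]* also matches hex-looking
# letters in the ref name and the empty string between other characters.
empty=
bad=$(printf '%s\n' "$line" | sed -e "s/${empty}[0-9a-f]*/<COMMIT-A>/g")

printf 'good: %s\nbad:  %s\n' "$good" "$bad"
```

With the empty prefix every pass can lengthen the line, so chaining nineteen
such expressions plausibly explains the memory growth, though that is a guess
from the trace rather than something measured here.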
--
Felipe Contreras