From: Junio C Hamano <gitster@pobox.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH 1/4] filter-branch: stop depending on Perl
Date: Tue, 15 Apr 2025 08:50:34 -0700 [thread overview]
Message-ID: <xmqq1ptto245.fsf@gitster.g> (raw)
In-Reply-To: <20250415-b4-pks-drop-perl-v1-1-c6addf175858@pks.im> (Patrick Steinhardt's message of "Tue, 15 Apr 2025 11:57:08 +0200")
Patrick Steinhardt <ps@pks.im> writes:
> While git-filter-branch(1) is written as a shell script, the
> `--state-branch` feature depends on Perl to save and extract the object
> ID mappings. This can lead to subtle breakage though:
>
> - We execute `perl` directly without respecting the `PERL_PATH`
> configured by the distribution. As such, it may happen that we use
> the wrong version of Perl.
>
> - We install the script unchanged even if Perl isn't available at all
> on the system, so using `--state-branch` would lead to failure
> altogether in that case.
>
> Fix this by dropping Perl and instead implementing the feature with
> shell scripting exclusively.
Excellent.
I just realized that the first bullet point above is a good argument
against "#!/usr/bin/env perl" as well. If we were to go the
"#!/usr/bin/env perl" direction, we should stop pretending with
$PERL_PATH that we'd consistently use the same version of Perl
chosen at build-install time, and we just trust users' $PATH, for
consistency.
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> git-filter-branch.sh | 37 +++++++++++++++++++------------------
> 1 file changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> index 3a51d4507c7..24fa317aaaa 100755
> --- a/git-filter-branch.sh
> +++ b/git-filter-branch.sh
> @@ -295,15 +295,18 @@ then
> if test -n "$state_commit"
> then
> echo "Populating map from $state_branch ($state_commit)" 1>&2
> - perl -e'open(MAP, "-|", "git show $ARGV[0]:filter.map") or die;
> - while (<MAP>) {
> - m/(.*):(.*)/ or die;
> - open F, ">../map/$1" or die;
> - print F "$2" or die;
> - close(F) or die;
> - }
> - close(MAP) or die;' "$state_commit" \
> - || die "Unable to load state from $state_branch:filter.map"
> +
> + git show "$state_commit:filter.map" >"$tempdir"/filter-map ||
> + die "Unable to load state from $state_branch:filter.map"
> + while read line
> + do
> + case "$line" in
> + *:*)
> + echo "${line%:*}" >../map/"${line#*:}";;
> + *)
> + die "Unable to load state from $state_branch:filter.map";;
> + esac
> + done <"$tempdir"/filter-map
Quite straight-forward. Do problematic bytes like backslashes
appear in the blob that Perl m/(.*):(.*)/ would have passed intact
without problems but may cause problems to the reimplementation? I
doubt it.
Not in a scope of this change, but use of "git show" in place of
"git cat-file blob" makes my skin somewhat tingle.
> else
> echo "Branch $state_branch does not exist. Will create" 1>&2
> fi
> @@ -633,15 +636,13 @@ if test -n "$state_branch"
> then
> echo "Saving rewrite state to $state_branch" 1>&2
> state_blob=$(
> - perl -e'opendir D, "../map" or die;
> - open H, "|-", "git hash-object -w --stdin" or die;
> - foreach (sort readdir(D)) {
> - next if m/^\.\.?$/;
> - open F, "<../map/$_" or die;
> - chomp($f = <F>);
> - print H "$_:$f\n" or die;
> - }
> - close(H) or die;' || die "Unable to save state")
> + for file in ../map/*
OK. * won't match . or .. so we no longer need to filter. Like the
original, if we had any random cruft file, we may copy such garbage
to the output, but that is not a new problem at all.
Nicely done.
> + do
> + from_commit=$(basename "$file")
> + to_commit=$(cat "$file")
> + echo "$from_commit:$to_commit"
> + done | git hash-object -w --stdin || die "Unable to save state"
> + )
> state_tree=$(printf '100644 blob %s\tfilter.map\n' "$state_blob" | git mktree)
> if test -n "$state_commit"
> then
next prev parent reply other threads:[~2025-04-15 15:50 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-15 9:57 [PATCH 0/4] Drop Perl dependency in a couple of subsystems Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 1/4] filter-branch: stop depending on Perl Patrick Steinhardt
2025-04-15 15:50 ` Junio C Hamano [this message]
2025-04-15 9:57 ` [PATCH 2/4] request-pull: " Patrick Steinhardt
2025-04-15 16:16 ` Junio C Hamano
2025-04-16 15:07 ` Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 3/4] Documentation: stop depending on Perl to massage user manual Patrick Steinhardt
2025-04-15 9:57 ` [PATCH 4/4] Documentation: stop depending on Perl to generate command list Patrick Steinhardt
2025-04-15 16:32 ` Junio C Hamano
2025-04-16 12:16 ` [PATCH v2 0/4] Drop Perl dependency in a couple of subsystems Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 1/4] filter-branch: stop depending on Perl Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 2/4] request-pull: " Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 3/4] Documentation: stop depending on Perl to massage user manual Patrick Steinhardt
2025-04-16 12:16 ` [PATCH v2 4/4] Documentation: stop depending on Perl to generate command list Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq1ptto245.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=git@vger.kernel.org \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.