* [PATCH] Remove perl dependency from git-submodule.sh
@ 2012-05-31 8:48 Fredrik Gustafsson
2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason
2012-05-31 9:19 ` Johannes Sixt
0 siblings, 2 replies; 9+ messages in thread
From: Fredrik Gustafsson @ 2012-05-31 8:48 UTC (permalink / raw)
To: git; +Cc: jens.lehmann, gitster, iveqy
Rewrote a perl section in sh.
The code may be a bit slower (doing grep on strings instead of using
perl-lists).
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
git-submodule.sh | 35 ++++++++++++++++++-----------------
1 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/git-submodule.sh b/git-submodule.sh
index f46862f..d3dfb24 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -73,24 +73,25 @@ resolve_relative_url ()
#
module_list()
{
+ unmerged=
+ null_sha1=0000000000000000000000000000000000000000
git ls-files --error-unmatch --stage -- "$@" |
- perl -e '
- my %unmerged = ();
- my ($null_sha1) = ("0" x 40);
- while (<STDIN>) {
- chomp;
- my ($mode, $sha1, $stage, $path) =
- /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
- next unless $mode eq "160000";
- if ($stage ne "0") {
- if (!$unmerged{$path}++) {
- print "$mode $null_sha1 U\t$path\n";
- }
- next;
- }
- print "$_\n";
- }
- '
+ while read mode sha1 stage path
+ do
+ if test $mode -eq 160000
+ then
+ if test $stage -ne 0
+ then
+ if test -z "$(echo $unmerged | grep "|$path|")"
+ then
+ echo "$mode $null_sha1 U\t$path"
+ fi
+ unmerged="$unmerged|$path|"
+ else
+ echo "$mode $sha1 $stage\t$path"
+ fi
+ fi
+ done
}
#
--
1.7.5.1180.g4bfe7.dirty
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson
@ 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason
2012-05-31 10:37 ` Fredrik Gustafsson
2012-05-31 9:19 ` Johannes Sixt
1 sibling, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-05-31 9:07 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster
On Thu, May 31, 2012 at 10:48 AM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> Rewrote a perl section in sh.
>
> The code may be a bit slower (doing grep on strings instead of using
> perl-lists).
May be? Did you test it, if so what's the difference?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson
2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason
@ 2012-05-31 9:19 ` Johannes Sixt
2012-05-31 10:40 ` Fredrik Gustafsson
1 sibling, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2012-05-31 9:19 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster
Am 5/31/2012 10:48, schrieb Fredrik Gustafsson:
> Rewrote a perl section in sh.
> The code may be a bit slower (doing grep on strings instead of using
> perl-lists).
"A lot" would be more correct on Windows :-) But it can be avoided, I think.
> module_list()
> {
> + unmerged=
> + null_sha1=0000000000000000000000000000000000000000
> git ls-files --error-unmatch --stage -- "$@" |
> - perl -e '
> - my %unmerged = ();
> - my ($null_sha1) = ("0" x 40);
> - while (<STDIN>) {
> - chomp;
> - my ($mode, $sha1, $stage, $path) =
> - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
> - next unless $mode eq "160000";
> - if ($stage ne "0") {
> - if (!$unmerged{$path}++) {
> - print "$mode $null_sha1 U\t$path\n";
> - }
> - next;
> - }
> - print "$_\n";
> - }
> - '
> + while read mode sha1 stage path
Be prepared for backslashes in the path name:
while read -r mode sha1 stage path
> + do
> + if test $mode -eq 160000
$mode is not a number, but a string: test "$mode" = 160000
> + then
> + if test $stage -ne 0
That $stage looks like a number is of no importance, either.
> + then
> + if test -z "$(echo $unmerged | grep "|$path|")"
> + then
> + echo "$mode $null_sha1 U\t$path"
> + fi
> + unmerged="$unmerged|$path|"
IIUC, the purpose of $unmerged and this check is to avoid that an unmerged
path is dumped for each stage that is listed by ls-files. Therefore it
should be sufficient to just check that the current path is different from
the last path.
> + else
> + echo "$mode $sha1 $stage\t$path"
> + fi
> + fi
> + done
> }
>
> #
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason
@ 2012-05-31 10:37 ` Fredrik Gustafsson
0 siblings, 0 replies; 9+ messages in thread
From: Fredrik Gustafsson @ 2012-05-31 10:37 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Fredrik Gustafsson, git, jens.lehmann, gitster
On Thu, May 31, 2012 at 11:07:22AM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Thu, May 31, 2012 at 10:48 AM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> > Rewrote a perl section in sh.
> >
> > The code may be a bit slower (doing grep on strings instead of using
> > perl-lists).
>
> May be? Did you test it, if so what's the difference?
I did not test it. I think that it can be easily judged by someone that
knows how perl implements its arrays. With an array implementation with
a hashmap-solution, it should be slower, of course depending on the
number of submodules.
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 9:19 ` Johannes Sixt
@ 2012-05-31 10:40 ` Fredrik Gustafsson
2012-05-31 11:25 ` Johannes Sixt
2012-05-31 17:49 ` Junio C Hamano
0 siblings, 2 replies; 9+ messages in thread
From: Fredrik Gustafsson @ 2012-05-31 10:40 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Fredrik Gustafsson, git, jens.lehmann, gitster
On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote:
> Am 5/31/2012 10:48, schrieb Fredrik Gustafsson:
> > Rewrote a perl section in sh.
>
> > The code may be a bit slower (doing grep on strings instead of using
> > perl-lists).
>
> "A lot" would be more correct on Windows :-) But it can be avoided, I think.
>
> > module_list()
> > {
> > + unmerged=
> > + null_sha1=0000000000000000000000000000000000000000
> > git ls-files --error-unmatch --stage -- "$@" |
> > - perl -e '
> > - my %unmerged = ();
> > - my ($null_sha1) = ("0" x 40);
> > - while (<STDIN>) {
> > - chomp;
> > - my ($mode, $sha1, $stage, $path) =
> > - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
> > - next unless $mode eq "160000";
> > - if ($stage ne "0") {
> > - if (!$unmerged{$path}++) {
> > - print "$mode $null_sha1 U\t$path\n";
> > - }
> > - next;
> > - }
> > - print "$_\n";
> > - }
> > - '
> > + while read mode sha1 stage path
>
> Be prepared for backslashes in the path name:
>
> while read -r mode sha1 stage path
We are not using -r on any place in git-submodule.sh. Maybe we should? I
can provide a patch if needed.
>
> > + do
> > + if test $mode -eq 160000
>
> $mode is not a number, but a string: test "$mode" = 160000
okay, fixed in next iteration.
>
> > + then
> > + if test $stage -ne 0
>
> That $stage looks like a number is of no importance, either.
Actually I don't know what stage does and if it's important here. This
part is just to mimic the perl code. Should it be removed?
>
> > + then
> > + if test -z "$(echo $unmerged | grep "|$path|")"
> > + then
> > + echo "$mode $null_sha1 U\t$path"
> > + fi
> > + unmerged="$unmerged|$path|"
>
> IIUC, the purpose of $unmerged and this check is to avoid that an unmerged
> path is dumped for each stage that is listed by ls-files. Therefore it
> should be sufficient to just check that the current path is different from
> the last path.
That requires that submodules always is in the same order, right? That
would not work for:
sub1
sub2
sub1
case. But is that order really a possibility or is it always
sub1
sub1
sub2
?
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 10:40 ` Fredrik Gustafsson
@ 2012-05-31 11:25 ` Johannes Sixt
2012-05-31 17:49 ` Junio C Hamano
1 sibling, 0 replies; 9+ messages in thread
From: Johannes Sixt @ 2012-05-31 11:25 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster
Am 5/31/2012 12:40, schrieb Fredrik Gustafsson:
> On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote:
>> Be prepared for backslashes in the path name:
>>
>> while read -r mode sha1 stage path
>
> We are not using -r on any place in git-submodule.sh. Maybe we should? I
> can provide a patch if needed.
I can imagine that this would fix a bug or two with paths that contain
to-be-quoted characters.
>>> + do
>>> + if test $mode -eq 160000
>>
>> $mode is not a number, but a string: test "$mode" = 160000
>
> okay, fixed in next iteration.
>
>>
>>> + then
>>> + if test $stage -ne 0
>>
>> That $stage looks like a number is of no importance, either.
>
> Actually I don't know what stage does and if it's important here. This
> part is just to mimic the perl code. Should it be removed?
No; you should 'test "$stage" != 0'.
>>> + then
>>> + if test -z "$(echo $unmerged | grep "|$path|")"
>>> + then
>>> + echo "$mode $null_sha1 U\t$path"
>>> + fi
>>> + unmerged="$unmerged|$path|"
>>
>> IIUC, the purpose of $unmerged and this check is to avoid that an unmerged
>> path is dumped for each stage that is listed by ls-files. Therefore it
>> should be sufficient to just check that the current path is different from
>> the last path.
>
> That requires that submodules always is in the same order, right?
ls-files guarantees a suitable order: different stages of the same
submodule path appear on consecutive lines.
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 10:40 ` Fredrik Gustafsson
2012-05-31 11:25 ` Johannes Sixt
@ 2012-05-31 17:49 ` Junio C Hamano
2012-05-31 18:48 ` Fredrik Gustafsson
1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2012-05-31 17:49 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: Johannes Sixt, git, jens.lehmann
Fredrik Gustafsson <iveqy@iveqy.com> writes:
> On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote:
>> Am 5/31/2012 10:48, schrieb Fredrik Gustafsson:
>> > Rewrote a perl section in sh.
>>
>> > The code may be a bit slower (doing grep on strings instead of using
>> > perl-lists).
>>
>> "A lot" would be more correct on Windows :-) But it can be avoided, I think.
>>
>> > module_list()
>> > {
>> > + unmerged=
>> > + null_sha1=0000000000000000000000000000000000000000
>> > git ls-files --error-unmatch --stage -- "$@" |
>> > - perl -e '
>> > - my %unmerged = ();
>> > - my ($null_sha1) = ("0" x 40);
>> > - while (<STDIN>) {
>> > - chomp;
>> > - my ($mode, $sha1, $stage, $path) =
>> > - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
>> > - next unless $mode eq "160000";
>> > - if ($stage ne "0") {
>> > - if (!$unmerged{$path}++) {
>> > - print "$mode $null_sha1 U\t$path\n";
>> > - }
>> > - next;
>> > - }
>> > - print "$_\n";
>> > - }
>> > - '
>> > + while read mode sha1 stage path
>>
>> Be prepared for backslashes in the path name:
>>
>> while read -r mode sha1 stage path
>
> We are not using -r on any place in git-submodule.sh. Maybe we should? I
> can provide a patch if needed.
I've already written off "git-submodule.sh" script as unfriendly to
pathnames with funny characters in them. Using "read -r" may work
around backslashes in the path, but you won't be able to sensibly
handle LFs in filenames, for example. In other words, we could just
tell users "don't use funny pathnames"---and from that point of
view, rewriting the Perl scriptlet with a pure shell implementation
that does not fork other little tools is sensible, especially if it
results in better performance, more readable code, etc.
Having said that, in the longer term, I think the right direction to
go is the opposite. It would be better to make "git-submodule.sh"
work better with paths with funny characters in them, and one
obvious approach is to read "ls-files -z" output with something
capable of parsing NUL-terminated records, e.g. a Perl scriptlet.
Adding a new shell loop like this patch only adds one place that
needs to be fixed later when that happens, so I am not sure I like
this patch.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 17:49 ` Junio C Hamano
@ 2012-05-31 18:48 ` Fredrik Gustafsson
2012-05-31 19:26 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Gustafsson @ 2012-05-31 18:48 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Fredrik Gustafsson, Johannes Sixt, git, jens.lehmann
On Thu, May 31, 2012 at 10:49:41AM -0700, Junio C Hamano wrote:
> Having said that, in the longer term, I think the right direction to
> go is the opposite. It would be better to make "git-submodule.sh"
> work better with paths with funny characters in them, and one
> obvious approach is to read "ls-files -z" output with something
> capable of parsing NUL-terminated records, e.g. a Perl scriptlet.
> Adding a new shell loop like this patch only adds one place that
> needs to be fixed later when that happens, so I am not sure I like
> this patch.
Is perl really a dependency that git wants? Today only a few bit (often
non critical) are in perl. I thought the way was to get rid of those and
replace them with c? I'm very critical to dependencies when they are not needed.
I don't think forking for text-parsing when not needed is a good idea
either. Apart from the runtime issues, it makes the code harder to read.
With that said I do agree that funny path names should be supported and
maybe the correct solution is to make more use of perl and less of sh.
Mixing those, and doing it in the same file, I don't think is a good
idea.
Is the right direction to run a shellscript that invokes a
perl-scriptlet for textparsing?
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh
2012-05-31 18:48 ` Fredrik Gustafsson
@ 2012-05-31 19:26 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2012-05-31 19:26 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: Johannes Sixt, git, jens.lehmann
Fredrik Gustafsson <iveqy@iveqy.com> writes:
> On Thu, May 31, 2012 at 10:49:41AM -0700, Junio C Hamano wrote:
>> Having said that, in the longer term, I think the right direction to
>> go is the opposite. It would be better to make "git-submodule.sh"
>> work better with paths with funny characters in them, and one
>> obvious approach is to read "ls-files -z" output with something
>> capable of parsing NUL-terminated records, e.g. a Perl scriptlet.
>> Adding a new shell loop like this patch only adds one place that
>> needs to be fixed later when that happens, so I am not sure I like
>> this patch.
>
> Is perl really a dependency that git wants?
It depends on your definition of "want"; I'd say "if alternative is
to lose things like functionality, performance, etc., we would
rather live with it."
It is one of the more widely available scripting languages whose
scripts are more portable across platforms (sadly, we ought to be
able to use sed and awk which are more available but we have seen
portability issues with them); if we want to step outside of what
can be done with POSIX shell scripts (e.g. handling NUL terminated
stream) but are not ready to rewrite everything in C, I would say it
is the least evil among others. So the short answer is yes.
> Today only a few bit (often
> non critical) are in perl. I thought the way was to get rid of those and
> replace them with c?
But your patch does not help us bring ourselves any closer to
replace anything with C at all.
> I'm very critical to dependencies when they are not needed.
The key-phrase is "when they are not needed".
If the patch were to replace it with awk, sed or shell *without*
losing functionality, performance, readability, portability &
maintainability, it would be giving us one less dependency without
losing anything. It may not be replacing everything with C, but at
least it would not be going backwards.
On the other hand, if the patch were to replace Perl scriptlet with
ruby or python, that would be adding unnecessary extra dependencies,
as we do not have anything written in them in the core. To such a
patch, we can confidently say "It adds unnecessary dependency
without value" and reject it.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-05-31 19:26 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson
2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason
2012-05-31 10:37 ` Fredrik Gustafsson
2012-05-31 9:19 ` Johannes Sixt
2012-05-31 10:40 ` Fredrik Gustafsson
2012-05-31 11:25 ` Johannes Sixt
2012-05-31 17:49 ` Junio C Hamano
2012-05-31 18:48 ` Fredrik Gustafsson
2012-05-31 19:26 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).