* [PATCH] Remove perl dependency from git-submodule.sh @ 2012-05-31 8:48 Fredrik Gustafsson 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason 2012-05-31 9:19 ` Johannes Sixt 0 siblings, 2 replies; 9+ messages in thread From: Fredrik Gustafsson @ 2012-05-31 8:48 UTC (permalink / raw) To: git; +Cc: jens.lehmann, gitster, iveqy Rewrote a perl section in sh. The code may be a bit slower (doing grep on strings instead of using perl-lists). Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com> --- git-submodule.sh | 35 ++++++++++++++++++----------------- 1 files changed, 18 insertions(+), 17 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index f46862f..d3dfb24 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -73,24 +73,25 @@ resolve_relative_url () # module_list() { + unmerged= + null_sha1=0000000000000000000000000000000000000000 git ls-files --error-unmatch --stage -- "$@" | - perl -e ' - my %unmerged = (); - my ($null_sha1) = ("0" x 40); - while (<STDIN>) { - chomp; - my ($mode, $sha1, $stage, $path) = - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/; - next unless $mode eq "160000"; - if ($stage ne "0") { - if (!$unmerged{$path}++) { - print "$mode $null_sha1 U\t$path\n"; - } - next; - } - print "$_\n"; - } - ' + while read mode sha1 stage path + do + if test $mode -eq 160000 + then + if test $stage -ne 0 + then + if test -z "$(echo $unmerged | grep "|$path|")" + then + echo "$mode $null_sha1 U\t$path" + fi + unmerged="$unmerged|$path|" + else + echo "$mode $sha1 $stage\t$path" + fi + fi + done } # -- 1.7.5.1180.g4bfe7.dirty ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson @ 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason 2012-05-31 10:37 ` Fredrik Gustafsson 2012-05-31 9:19 ` Johannes Sixt 1 sibling, 1 reply; 9+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2012-05-31 9:07 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster On Thu, May 31, 2012 at 10:48 AM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > Rewrote a perl section in sh. > > The code may be a bit slower (doing grep on strings instead of using > perl-lists). May be? Did you test it, if so what's the difference? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason @ 2012-05-31 10:37 ` Fredrik Gustafsson 0 siblings, 0 replies; 9+ messages in thread From: Fredrik Gustafsson @ 2012-05-31 10:37 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Fredrik Gustafsson, git, jens.lehmann, gitster On Thu, May 31, 2012 at 11:07:22AM +0200, Ævar Arnfjörð Bjarmason wrote: > On Thu, May 31, 2012 at 10:48 AM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > > Rewrote a perl section in sh. > > > > The code may be a bit slower (doing grep on strings instead of using > > perl-lists). > > May be? Did you test it, if so what's the difference? I did not test it. I think that it can be easily judged by someone that knows how perl implements its arrays. With an array implementation with a hashmap-solution, it should be slower, of course depending on the number of submodules. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason @ 2012-05-31 9:19 ` Johannes Sixt 2012-05-31 10:40 ` Fredrik Gustafsson 1 sibling, 1 reply; 9+ messages in thread From: Johannes Sixt @ 2012-05-31 9:19 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster Am 5/31/2012 10:48, schrieb Fredrik Gustafsson: > Rewrote a perl section in sh. > The code may be a bit slower (doing grep on strings instead of using > perl-lists). "A lot" would be more correct on Windows :-) But it can be avoided, I think. > module_list() > { > + unmerged= > + null_sha1=0000000000000000000000000000000000000000 > git ls-files --error-unmatch --stage -- "$@" | > - perl -e ' > - my %unmerged = (); > - my ($null_sha1) = ("0" x 40); > - while (<STDIN>) { > - chomp; > - my ($mode, $sha1, $stage, $path) = > - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/; > - next unless $mode eq "160000"; > - if ($stage ne "0") { > - if (!$unmerged{$path}++) { > - print "$mode $null_sha1 U\t$path\n"; > - } > - next; > - } > - print "$_\n"; > - } > - ' > + while read mode sha1 stage path Be prepared for backslashes in the path name: while read -r mode sha1 stage path > + do > + if test $mode -eq 160000 $mode is not a number, but a string: test "$mode" = 160000 > + then > + if test $stage -ne 0 That $stage looks like a number is of no importance, either. > + then > + if test -z "$(echo $unmerged | grep "|$path|")" > + then > + echo "$mode $null_sha1 U\t$path" > + fi > + unmerged="$unmerged|$path|" IIUC, the purpose of $unmerged and this check is to avoid that an unmerged path is dumped for each stage that is listed by ls-files. Therefore it should be sufficient to just check that the current path is different from the last path. > + else > + echo "$mode $sha1 $stage\t$path" > + fi > + fi > + done > } > > # -- Hannes ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 9:19 ` Johannes Sixt @ 2012-05-31 10:40 ` Fredrik Gustafsson 2012-05-31 11:25 ` Johannes Sixt 2012-05-31 17:49 ` Junio C Hamano 0 siblings, 2 replies; 9+ messages in thread From: Fredrik Gustafsson @ 2012-05-31 10:40 UTC (permalink / raw) To: Johannes Sixt; +Cc: Fredrik Gustafsson, git, jens.lehmann, gitster On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote: > Am 5/31/2012 10:48, schrieb Fredrik Gustafsson: > > Rewrote a perl section in sh. > > > The code may be a bit slower (doing grep on strings instead of using > > perl-lists). > > "A lot" would be more correct on Windows :-) But it can be avoided, I think. > > > module_list() > > { > > + unmerged= > > + null_sha1=0000000000000000000000000000000000000000 > > git ls-files --error-unmatch --stage -- "$@" | > > - perl -e ' > > - my %unmerged = (); > > - my ($null_sha1) = ("0" x 40); > > - while (<STDIN>) { > > - chomp; > > - my ($mode, $sha1, $stage, $path) = > > - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/; > > - next unless $mode eq "160000"; > > - if ($stage ne "0") { > > - if (!$unmerged{$path}++) { > > - print "$mode $null_sha1 U\t$path\n"; > > - } > > - next; > > - } > > - print "$_\n"; > > - } > > - ' > > + while read mode sha1 stage path > > Be prepared for backslashes in the path name: > > while read -r mode sha1 stage path We are not using -r on any place in git-submodule.sh. Maybe we should? I can provide a patch if needed. > > > + do > > + if test $mode -eq 160000 > > $mode is not a number, but a string: test "$mode" = 160000 okay, fixed in next iteration. > > > + then > > + if test $stage -ne 0 > > That $stage looks like a number is of no importance, either. Actually I don't know what stage does and if it's important here. This part is just to mimic the perl code. Should it be removed? > > > + then > > + if test -z "$(echo $unmerged | grep "|$path|")" > > + then > > + echo "$mode $null_sha1 U\t$path" > > + fi > > + unmerged="$unmerged|$path|" > > IIUC, the purpose of $unmerged and this check is to avoid that an unmerged > path is dumped for each stage that is listed by ls-files. Therefore it > should be sufficient to just check that the current path is different from > the last path. That requires that submodules always is in the same order, right? That would not work for: sub1 sub2 sub1 case. But is that order really a possibility or is it always sub1 sub1 sub2 ? -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 10:40 ` Fredrik Gustafsson @ 2012-05-31 11:25 ` Johannes Sixt 2012-05-31 17:49 ` Junio C Hamano 1 sibling, 0 replies; 9+ messages in thread From: Johannes Sixt @ 2012-05-31 11:25 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: git, jens.lehmann, gitster Am 5/31/2012 12:40, schrieb Fredrik Gustafsson: > On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote: >> Be prepared for backslashes in the path name: >> >> while read -r mode sha1 stage path > > We are not using -r on any place in git-submodule.sh. Maybe we should? I > can provide a patch if needed. I can imagine that this would fix a bug or two with paths that contain to-be-quoted characters. >>> + do >>> + if test $mode -eq 160000 >> >> $mode is not a number, but a string: test "$mode" = 160000 > > okay, fixed in next iteration. > >> >>> + then >>> + if test $stage -ne 0 >> >> That $stage looks like a number is of no importance, either. > > Actually I don't know what stage does and if it's important here. This > part is just to mimic the perl code. Should it be removed? No; you should 'test "$stage" != 0'. >>> + then >>> + if test -z "$(echo $unmerged | grep "|$path|")" >>> + then >>> + echo "$mode $null_sha1 U\t$path" >>> + fi >>> + unmerged="$unmerged|$path|" >> >> IIUC, the purpose of $unmerged and this check is to avoid that an unmerged >> path is dumped for each stage that is listed by ls-files. Therefore it >> should be sufficient to just check that the current path is different from >> the last path. > > That requires that submodules always is in the same order, right? ls-files guarantees a suitable order: different stages of the same submodule path appear on consecutive lines. -- Hannes ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 10:40 ` Fredrik Gustafsson 2012-05-31 11:25 ` Johannes Sixt @ 2012-05-31 17:49 ` Junio C Hamano 2012-05-31 18:48 ` Fredrik Gustafsson 1 sibling, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2012-05-31 17:49 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: Johannes Sixt, git, jens.lehmann Fredrik Gustafsson <iveqy@iveqy.com> writes: > On Thu, May 31, 2012 at 11:19:04AM +0200, Johannes Sixt wrote: >> Am 5/31/2012 10:48, schrieb Fredrik Gustafsson: >> > Rewrote a perl section in sh. >> >> > The code may be a bit slower (doing grep on strings instead of using >> > perl-lists). >> >> "A lot" would be more correct on Windows :-) But it can be avoided, I think. >> >> > module_list() >> > { >> > + unmerged= >> > + null_sha1=0000000000000000000000000000000000000000 >> > git ls-files --error-unmatch --stage -- "$@" | >> > - perl -e ' >> > - my %unmerged = (); >> > - my ($null_sha1) = ("0" x 40); >> > - while (<STDIN>) { >> > - chomp; >> > - my ($mode, $sha1, $stage, $path) = >> > - /^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/; >> > - next unless $mode eq "160000"; >> > - if ($stage ne "0") { >> > - if (!$unmerged{$path}++) { >> > - print "$mode $null_sha1 U\t$path\n"; >> > - } >> > - next; >> > - } >> > - print "$_\n"; >> > - } >> > - ' >> > + while read mode sha1 stage path >> >> Be prepared for backslashes in the path name: >> >> while read -r mode sha1 stage path > > We are not using -r on any place in git-submodule.sh. Maybe we should? I > can provide a patch if needed. I've already written off "git-submodule.sh" script as unfriendly to pathnames with funny characters in them. Using "read -r" may work around backslashes in the path, but you won't be able to sensibly handle LFs in filenames, for example. In other words, we could just tell users "don't use funny pathnames"---and from that point of view, rewriting the Perl scriptlet with a pure shell implementation that does not fork other little tools is sensible, especially if it results in better performance, more readable code, etc. Having said that, in the longer term, I think the right direction to go is the opposite. It would be better to make "git-submodule.sh" work better with paths with funny characters in them, and one obvious approach is to read "ls-files -z" output with something capable of parsing NUL-terminated records, e.g. a Perl scriptlet. Adding a new shell loop like this patch only adds one place that needs to be fixed later when that happens, so I am not sure I like this patch. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 17:49 ` Junio C Hamano @ 2012-05-31 18:48 ` Fredrik Gustafsson 2012-05-31 19:26 ` Junio C Hamano 0 siblings, 1 reply; 9+ messages in thread From: Fredrik Gustafsson @ 2012-05-31 18:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: Fredrik Gustafsson, Johannes Sixt, git, jens.lehmann On Thu, May 31, 2012 at 10:49:41AM -0700, Junio C Hamano wrote: > Having said that, in the longer term, I think the right direction to > go is the opposite. It would be better to make "git-submodule.sh" > work better with paths with funny characters in them, and one > obvious approach is to read "ls-files -z" output with something > capable of parsing NUL-terminated records, e.g. a Perl scriptlet. > Adding a new shell loop like this patch only adds one place that > needs to be fixed later when that happens, so I am not sure I like > this patch. Is perl really a dependency that git wants? Today only a few bit (often non critical) are in perl. I thought the way was to get rid of those and replace them with c? I'm very critical to dependencies when they are not needed. I don't think forking for text-parsing when not needed is a good idea either. Apart from the runtime issues, it makes the code harder to read. With that said I do agree that funny path names should be supported and maybe the correct solution is to make more use of perl and less of sh. Mixing those, and doing it in the same file, I don't think is a good idea. Is the right direction to run a shellscript that invokes a perl-scriptlet for textparsing? -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Remove perl dependency from git-submodule.sh 2012-05-31 18:48 ` Fredrik Gustafsson @ 2012-05-31 19:26 ` Junio C Hamano 0 siblings, 0 replies; 9+ messages in thread From: Junio C Hamano @ 2012-05-31 19:26 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: Johannes Sixt, git, jens.lehmann Fredrik Gustafsson <iveqy@iveqy.com> writes: > On Thu, May 31, 2012 at 10:49:41AM -0700, Junio C Hamano wrote: >> Having said that, in the longer term, I think the right direction to >> go is the opposite. It would be better to make "git-submodule.sh" >> work better with paths with funny characters in them, and one >> obvious approach is to read "ls-files -z" output with something >> capable of parsing NUL-terminated records, e.g. a Perl scriptlet. >> Adding a new shell loop like this patch only adds one place that >> needs to be fixed later when that happens, so I am not sure I like >> this patch. > > Is perl really a dependency that git wants? It depends on your definition of "want"; I'd say "if alternative is to lose things like functionality, performance, etc., we would rather live with it." It is one of the more widely available scripting languages whose scripts are more portable across platforms (sadly, we ought to be able to use sed and awk which are more available but we have seen portability issues with them); if we want to step outside of what can be done with POSIX shell scripts (e.g. handling NUL terminated stream) but are not ready to rewrite everything in C, I would say it is the least evil among others. So the short answer is yes. > Today only a few bit (often > non critical) are in perl. I thought the way was to get rid of those and > replace them with c? But your patch does not help us bring ourselves any closer to replace anything with C at all. > I'm very critical to dependencies when they are not needed. The key-phrase is "when they are not needed". If the patch were to replace it with awk, sed or shell *without* losing functionality, performance, readability, portability & maintainability, it would be giving us one less dependency without losing anything. It may not be replacing everything with C, but at least it would not be going backwards. On the other hand, if the patch were to replace Perl scriptlet with ruby or python, that would be adding unnecessary extra dependencies, as we do not have anything written in them in the core. To such a patch, we can confidently say "It adds unnecessary dependency without value" and reject it. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-05-31 19:26 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-05-31 8:48 [PATCH] Remove perl dependency from git-submodule.sh Fredrik Gustafsson 2012-05-31 9:07 ` Ævar Arnfjörð Bjarmason 2012-05-31 10:37 ` Fredrik Gustafsson 2012-05-31 9:19 ` Johannes Sixt 2012-05-31 10:40 ` Fredrik Gustafsson 2012-05-31 11:25 ` Johannes Sixt 2012-05-31 17:49 ` Junio C Hamano 2012-05-31 18:48 ` Fredrik Gustafsson 2012-05-31 19:26 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).