git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Improve sed portability
@ 2008-06-11 13:09 Chris Ridd
  2008-06-11 14:04 ` Johannes Sixt
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Ridd @ 2008-06-11 13:09 UTC (permalink / raw)
  To: git; +Cc: Chris Ridd

On Solaris /usr/bin/sed apparently fails to process input that doesn't
end in a \n. Consequently constructs like

  re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)

cause re to be set to the empty string. Such a construct is used in
git-submodule.sh.

Changing the printf to add a \n seems the safest change. The
POSIX-compliant seds shipped with Solaris do not have this problem.

Signed-off-by: Chris Ridd <chris.ridd@isode.com>
---
 git-submodule.sh |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 1007372..e515bcc 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -73,7 +73,7 @@ resolve_relative_url ()
 module_name()
 {
 	# Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
-	re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
+	re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
 	name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
 		sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
        test -z "$name" &&
-- 
1.5.3.6

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 13:09 [PATCH] Improve sed portability Chris Ridd
@ 2008-06-11 14:04 ` Johannes Sixt
  2008-06-11 15:29   ` Chris Ridd
  2008-06-12  8:33   ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Sixt @ 2008-06-11 14:04 UTC (permalink / raw)
  To: Chris Ridd; +Cc: git

Chris Ridd schrieb:
> On Solaris /usr/bin/sed apparently fails to process input that doesn't
> end in a \n. Consequently constructs like
> 
>   re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
> 
> cause re to be set to the empty string.

So does /usr/bin/sed of AIX 4.3!

> @@ -73,7 +73,7 @@ resolve_relative_url ()
>  module_name()
>  {
>  	# Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
> -	re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
> +	re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')

You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
hidden from the shell so that printf can interpret it?

>  	name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
>  		sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )

I trust you have tested this. But I wonder whether this leaves a stray
newline in $re that gets in the way inside the sed expression...

>         test -z "$name" &&


-- Hannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 14:04 ` Johannes Sixt
@ 2008-06-11 15:29   ` Chris Ridd
  2008-06-11 16:39     ` Jeff King
                       ` (2 more replies)
  2008-06-12  8:33   ` Junio C Hamano
  1 sibling, 3 replies; 9+ messages in thread
From: Chris Ridd @ 2008-06-11 15:29 UTC (permalink / raw)
  Cc: git

Johannes Sixt wrote:
> Chris Ridd schrieb:
>> On Solaris /usr/bin/sed apparently fails to process input that doesn't
>> end in a \n. Consequently constructs like
>>
>>   re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
>>
>> cause re to be set to the empty string.
> 
> So does /usr/bin/sed of AIX 4.3!

I ought to have mentioned this occurs on Solaris 8, 10, build 90 of 
OpenSolaris, and on HP-UX 11iv1. I stared at that regex for quite a 
while before realising the problem was with the input :-)

>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>  module_name()
>>  {
>>  	# Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
>> -	re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>> +	re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
> 
> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
> hidden from the shell so that printf can interpret it?

It is necessary to use double quotes. This:

     printf '%s\n' foobar

prints a literal \, a literal n, and no newline:

     foobar\n

Not desirable :-(

Of course, using a plain old:

     echo "$1"

should work well too. Why is printf being used here and not echo, anyway?

>>  	name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
>>  		sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
> 
> I trust you have tested this. But I wonder whether this leaves a stray
> newline in $re that gets in the way inside the sed expression...

Yes, I've tested this as we use submodules heavily. I think the $( .. ) 
notation will remove the trailing \n printed by sed, but to be sure I 
inserted a 'set -x' at the top of the module_name() function and 
double-checked that the re variable didn't get any stray \n 
character(s). Bash versions 2 and 3 were used.

So without the change, on Solaris I get:

     No submodule mapping found in .gitmodules for path 'foobar'

for the first submodule that we use, and the repository clone fails.

With the change, all our repositories clone OK.

Cheers,

Chris

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 15:29   ` Chris Ridd
@ 2008-06-11 16:39     ` Jeff King
  2008-06-12  7:46     ` Johannes Sixt
  2008-07-13 20:00     ` Jakub Narebski
  2 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2008-06-11 16:39 UTC (permalink / raw)
  To: Chris Ridd; +Cc: git

On Wed, Jun 11, 2008 at 04:29:53PM +0100, Chris Ridd wrote:

> It is necessary to use double quotes. This:
>
>     printf '%s\n' foobar
>
> prints a literal \, a literal n, and no newline:
>
>     foobar\n
>
> Not desirable :-(

On what platform?

> Of course, using a plain old:
>
>     echo "$1"
>
> should work well too. Why is printf being used here and not echo, anyway?

Because the original didn't have a newline, and "echo -n" isn't
portable?

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 15:29   ` Chris Ridd
  2008-06-11 16:39     ` Jeff King
@ 2008-06-12  7:46     ` Johannes Sixt
  2008-06-12  8:29       ` Chris Ridd
  2008-07-13 20:00     ` Jakub Narebski
  2 siblings, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2008-06-12  7:46 UTC (permalink / raw)
  To: Chris Ridd; +Cc: no To-header on input, git

Chris Ridd schrieb:
> Johannes Sixt wrote:
>> Chris Ridd schrieb:
>>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>>  module_name()
>>>  {
>>>      # Do we have "submodule.<something>.path = $1" defined in
>>> .gitmodules file?
>>> -    re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>> +    re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>
>> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
>> hidden from the shell so that printf can interpret it?
> 
> It is necessary to use double quotes. This:
> 
>     printf '%s\n' foobar
> 
> prints a literal \, a literal n, and no newline:
> 
>     foobar\n
> 
> Not desirable :-(

On both Linux and AIX 4.3 I see:

$  printf 'x\ny'; echo z
x
yz

The printf turns the \n into LF.

I mentioned this in the first place because I don't know what various
shells do with \n when they see "%s\n". But one way or the other, the \n
will be turned into LF, either by the shell or by printf. So it's not a
big deal.

> Of course, using a plain old:
> 
>     echo "$1"
> 
> should work well too. Why is printf being used here and not echo, anyway?

Because the "$1" could contain character sequences that some 'echo'
implementations mangle.

-- Hannes

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-12  7:46     ` Johannes Sixt
@ 2008-06-12  8:29       ` Chris Ridd
  2008-06-12  9:07         ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Ridd @ 2008-06-12  8:29 UTC (permalink / raw)
  To: git

Johannes Sixt wrote:
> Chris Ridd schrieb:
>> Johannes Sixt wrote:
>>> Chris Ridd schrieb:
>>>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>>>  module_name()
>>>>  {
>>>>      # Do we have "submodule.<something>.path = $1" defined in
>>>> .gitmodules file?
>>>> -    re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>>> +    re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
>>> hidden from the shell so that printf can interpret it?
>> It is necessary to use double quotes. This:
>>
>>     printf '%s\n' foobar
>>
>> prints a literal \, a literal n, and no newline:
>>
>>     foobar\n
>>
>> Not desirable :-(
> 
> On both Linux and AIX 4.3 I see:
> 
> $  printf 'x\ny'; echo z
> x
> yz
> 
> The printf turns the \n into LF.

Yes, and I don't know *what* I did yesterday, but Solaris 8, 10, (every 
OS I mentioned before) behave the same as your test.

I did actually have my eyes tested later on yesterday :-)

> I mentioned this in the first place because I don't know what various
> shells do with \n when they see "%s\n". But one way or the other, the \n
> will be turned into LF, either by the shell or by printf. So it's not a
> big deal.

I agree.

>> Of course, using a plain old:
>>
>>     echo "$1"
>>
>> should work well too. Why is printf being used here and not echo, anyway?
> 
> Because the "$1" could contain character sequences that some 'echo'
> implementations mangle.

Indeed. If $1 started with -n that might cause problems on some platforms.

Should I revise my commit to use single quotes again?

Cheers,

Chris

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 14:04 ` Johannes Sixt
  2008-06-11 15:29   ` Chris Ridd
@ 2008-06-12  8:33   ` Junio C Hamano
  1 sibling, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-06-12  8:33 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Chris Ridd, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Chris Ridd schrieb:
>> On Solaris /usr/bin/sed apparently fails to process input that doesn't
>> end in a \n. Consequently constructs like
>> 
>>   re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
>> 
>> cause re to be set to the empty string.
>
> So does /usr/bin/sed of AIX 4.3!
>
>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>  module_name()
>>  {
>>  	# Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
>> -	re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>> +	re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>
> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
> hidden from the shell so that printf can interpret it?

"\n" inside dq is _not_ interpreted by the shell (printf interprets it),
but I tend to agree that using sq is worry-free and better.

>>  	name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
>>  		sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
>
> I trust you have tested this. But I wonder whether this leaves a stray
> newline in $re that gets in the way inside the sed expression...

I suspect the very original was written (or copied from something that
wrote) like this:

	re=$(echo -n "$1" | sed -e '...')

and mechanically replaced to

	re=$(printf '%s' "$1" | sed -e '...')

because "echo" is not quite portable.

But the original misunderstands the command substitution.  The trailing LF
is removed by it, so as long as "$1" is a single line, $re will get a line
without the trailing LF _anyway_.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-12  8:29       ` Chris Ridd
@ 2008-06-12  9:07         ` Jeff King
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2008-06-12  9:07 UTC (permalink / raw)
  To: Chris Ridd; +Cc: git

On Thu, Jun 12, 2008 at 09:29:27AM +0100, Chris Ridd wrote:

>> Because the "$1" could contain character sequences that some 'echo'
>> implementations mangle.
>
> Indeed. If $1 started with -n that might cause problems on some platforms.

It's much worse than that. Any backslash sequence can be interpolated.
4b7cc26 (git-am: use printf instead of echo on user-supplied strings).

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Improve sed portability
  2008-06-11 15:29   ` Chris Ridd
  2008-06-11 16:39     ` Jeff King
  2008-06-12  7:46     ` Johannes Sixt
@ 2008-07-13 20:00     ` Jakub Narebski
  2 siblings, 0 replies; 9+ messages in thread
From: Jakub Narebski @ 2008-07-13 20:00 UTC (permalink / raw)
  To: Chris Ridd

Chris Ridd <chris.ridd@isode.com> writes:

> Of course, using a plain old:
> 
>      echo "$1"
> 
> should work well too. Why is printf being used here and not echo,
> anyway?

Uh, because 'echo -n' is not portable enough, and for some reason it
was though that there shouldn't be final newline?

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-07-13 20:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-11 13:09 [PATCH] Improve sed portability Chris Ridd
2008-06-11 14:04 ` Johannes Sixt
2008-06-11 15:29   ` Chris Ridd
2008-06-11 16:39     ` Jeff King
2008-06-12  7:46     ` Johannes Sixt
2008-06-12  8:29       ` Chris Ridd
2008-06-12  9:07         ` Jeff King
2008-07-13 20:00     ` Jakub Narebski
2008-06-12  8:33   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).