* [PATCH] Improve sed portability
@ 2008-06-11 13:09 Chris Ridd
2008-06-11 14:04 ` Johannes Sixt
0 siblings, 1 reply; 9+ messages in thread
From: Chris Ridd @ 2008-06-11 13:09 UTC (permalink / raw)
To: git; +Cc: Chris Ridd
On Solaris /usr/bin/sed apparently fails to process input that doesn't
end in a \n. Consequently constructs like
re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
cause re to be set to the empty string. Such a construct is used in
git-submodule.sh.
Changing the printf to add a \n seems the safest change. The
POSIX-compliant seds shipped with Solaris do not have this problem.
Signed-off-by: Chris Ridd <chris.ridd@isode.com>
---
git-submodule.sh | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/git-submodule.sh b/git-submodule.sh
index 1007372..e515bcc 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -73,7 +73,7 @@ resolve_relative_url ()
module_name()
{
# Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
- re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
+ re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
test -z "$name" &&
--
1.5.3.6
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 13:09 [PATCH] Improve sed portability Chris Ridd
@ 2008-06-11 14:04 ` Johannes Sixt
2008-06-11 15:29 ` Chris Ridd
2008-06-12 8:33 ` Junio C Hamano
0 siblings, 2 replies; 9+ messages in thread
From: Johannes Sixt @ 2008-06-11 14:04 UTC (permalink / raw)
To: Chris Ridd; +Cc: git
Chris Ridd schrieb:
> On Solaris /usr/bin/sed apparently fails to process input that doesn't
> end in a \n. Consequently constructs like
>
> re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
>
> cause re to be set to the empty string.
So does /usr/bin/sed of AIX 4.3!
> @@ -73,7 +73,7 @@ resolve_relative_url ()
> module_name()
> {
> # Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
> - re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
> + re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
hidden from the shell so that printf can interpret it?
> name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
> sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
I trust you have tested this. But I wonder whether this leaves a stray
newline in $re that gets in the way inside the sed expression...
> test -z "$name" &&
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 14:04 ` Johannes Sixt
@ 2008-06-11 15:29 ` Chris Ridd
2008-06-11 16:39 ` Jeff King
` (2 more replies)
2008-06-12 8:33 ` Junio C Hamano
1 sibling, 3 replies; 9+ messages in thread
From: Chris Ridd @ 2008-06-11 15:29 UTC (permalink / raw)
Cc: git
Johannes Sixt wrote:
> Chris Ridd schrieb:
>> On Solaris /usr/bin/sed apparently fails to process input that doesn't
>> end in a \n. Consequently constructs like
>>
>> re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
>>
>> cause re to be set to the empty string.
>
> So does /usr/bin/sed of AIX 4.3!
I ought to have mentioned this occurs on Solaris 8, 10, build 90 of
OpenSolaris, and on HP-UX 11iv1. I stared at that regex for quite a
while before realising the problem was with the input :-)
>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>> module_name()
>> {
>> # Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
>> - re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>> + re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>
> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
> hidden from the shell so that printf can interpret it?
It is necessary to use double quotes. This:
printf '%s\n' foobar
prints a literal \, a literal n, and no newline:
foobar\n
Not desirable :-(
Of course, using a plain old:
echo "$1"
should work well too. Why is printf being used here and not echo, anyway?
>> name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
>> sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
>
> I trust you have tested this. But I wonder whether this leaves a stray
> newline in $re that gets in the way inside the sed expression...
Yes, I've tested this as we use submodules heavily. I think the $( .. )
notation will remove the trailing \n printed by sed, but to be sure I
inserted a 'set -x' at the top of the module_name() function and
double-checked that the re variable didn't get any stray \n
character(s). Bash versions 2 and 3 were used.
So without the change, on Solaris I get:
No submodule mapping found in .gitmodules for path 'foobar'
for the first submodule that we use, and the repository clone fails.
With the change, all our repositories clone OK.
Cheers,
Chris
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 15:29 ` Chris Ridd
@ 2008-06-11 16:39 ` Jeff King
2008-06-12 7:46 ` Johannes Sixt
2008-07-13 20:00 ` Jakub Narebski
2 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2008-06-11 16:39 UTC (permalink / raw)
To: Chris Ridd; +Cc: git
On Wed, Jun 11, 2008 at 04:29:53PM +0100, Chris Ridd wrote:
> It is necessary to use double quotes. This:
>
> printf '%s\n' foobar
>
> prints a literal \, a literal n, and no newline:
>
> foobar\n
>
> Not desirable :-(
On what platform?
> Of course, using a plain old:
>
> echo "$1"
>
> should work well too. Why is printf being used here and not echo, anyway?
Because the original didn't have a newline, and "echo -n" isn't
portable?
-Peff
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 15:29 ` Chris Ridd
2008-06-11 16:39 ` Jeff King
@ 2008-06-12 7:46 ` Johannes Sixt
2008-06-12 8:29 ` Chris Ridd
2008-07-13 20:00 ` Jakub Narebski
2 siblings, 1 reply; 9+ messages in thread
From: Johannes Sixt @ 2008-06-12 7:46 UTC (permalink / raw)
To: Chris Ridd; +Cc: no To-header on input, git
Chris Ridd schrieb:
> Johannes Sixt wrote:
>> Chris Ridd schrieb:
>>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>> module_name()
>>> {
>>> # Do we have "submodule.<something>.path = $1" defined in
>>> .gitmodules file?
>>> - re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>> + re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>
>> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
>> hidden from the shell so that printf can interpret it?
>
> It is necessary to use double quotes. This:
>
> printf '%s\n' foobar
>
> prints a literal \, a literal n, and no newline:
>
> foobar\n
>
> Not desirable :-(
On both Linux and AIX 4.3 I see:
$ printf 'x\ny'; echo z
x
yz
The printf turns the \n into LF.
I mentioned this in the first place because I don't know what various
shells do with \n when they see "%s\n". But one way or the other, the \n
will be turned into LF, either by the shell or by printf. So it's not a
big deal.
> Of course, using a plain old:
>
> echo "$1"
>
> should work well too. Why is printf being used here and not echo, anyway?
Because the "$1" could contain character sequences that some 'echo'
implementations mangle.
-- Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-12 7:46 ` Johannes Sixt
@ 2008-06-12 8:29 ` Chris Ridd
2008-06-12 9:07 ` Jeff King
0 siblings, 1 reply; 9+ messages in thread
From: Chris Ridd @ 2008-06-12 8:29 UTC (permalink / raw)
To: git
Johannes Sixt wrote:
> Chris Ridd schrieb:
>> Johannes Sixt wrote:
>>> Chris Ridd schrieb:
>>>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>>>> module_name()
>>>> {
>>>> # Do we have "submodule.<something>.path = $1" defined in
>>>> .gitmodules file?
>>>> - re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>>> + re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>>> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
>>> hidden from the shell so that printf can interpret it?
>> It is necessary to use double quotes. This:
>>
>> printf '%s\n' foobar
>>
>> prints a literal \, a literal n, and no newline:
>>
>> foobar\n
>>
>> Not desirable :-(
>
> On both Linux and AIX 4.3 I see:
>
> $ printf 'x\ny'; echo z
> x
> yz
>
> The printf turns the \n into LF.
Yes, and I don't know *what* I did yesterday, but Solaris 8, 10, (every
OS I mentioned before) behave the same as your test.
I did actually have my eyes tested later on yesterday :-)
> I mentioned this in the first place because I don't know what various
> shells do with \n when they see "%s\n". But one way or the other, the \n
> will be turned into LF, either by the shell or by printf. So it's not a
> big deal.
I agree.
>> Of course, using a plain old:
>>
>> echo "$1"
>>
>> should work well too. Why is printf being used here and not echo, anyway?
>
> Because the "$1" could contain character sequences that some 'echo'
> implementations mangle.
Indeed. If $1 started with -n that might cause problems on some platforms.
Should I revise my commit to use single quotes again?
Cheers,
Chris
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-12 8:29 ` Chris Ridd
@ 2008-06-12 9:07 ` Jeff King
0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2008-06-12 9:07 UTC (permalink / raw)
To: Chris Ridd; +Cc: git
On Thu, Jun 12, 2008 at 09:29:27AM +0100, Chris Ridd wrote:
>> Because the "$1" could contain character sequences that some 'echo'
>> implementations mangle.
>
> Indeed. If $1 started with -n that might cause problems on some platforms.
It's much worse than that. Any backslash sequence can be interpolated.
4b7cc26 (git-am: use printf instead of echo on user-supplied strings).
-Peff
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 15:29 ` Chris Ridd
2008-06-11 16:39 ` Jeff King
2008-06-12 7:46 ` Johannes Sixt
@ 2008-07-13 20:00 ` Jakub Narebski
2 siblings, 0 replies; 9+ messages in thread
From: Jakub Narebski @ 2008-07-13 20:00 UTC (permalink / raw)
To: Chris Ridd
Chris Ridd <chris.ridd@isode.com> writes:
> Of course, using a plain old:
>
> echo "$1"
>
> should work well too. Why is printf being used here and not echo,
> anyway?
Uh, because 'echo -n' is not portable enough, and for some reason it
was though that there shouldn't be final newline?
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Improve sed portability
2008-06-11 14:04 ` Johannes Sixt
2008-06-11 15:29 ` Chris Ridd
@ 2008-06-12 8:33 ` Junio C Hamano
1 sibling, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-06-12 8:33 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Chris Ridd, git
Johannes Sixt <j.sixt@viscovery.net> writes:
> Chris Ridd schrieb:
>> On Solaris /usr/bin/sed apparently fails to process input that doesn't
>> end in a \n. Consequently constructs like
>>
>> re=$(printf '%s' foo | sed -e 's/bar/BAR/g' $)
>>
>> cause re to be set to the empty string.
>
> So does /usr/bin/sed of AIX 4.3!
>
>> @@ -73,7 +73,7 @@ resolve_relative_url ()
>> module_name()
>> {
>> # Do we have "submodule.<something>.path = $1" defined in .gitmodules file?
>> - re=$(printf '%s' "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>> + re=$(printf "%s\n" "$1" | sed -e 's/[].[^$\\*]/\\&/g')
>
> You change sq into dq. Is this not dangerous? Shouldn't backslash-en be
> hidden from the shell so that printf can interpret it?
"\n" inside dq is _not_ interpreted by the shell (printf interprets it),
but I tend to agree that using sq is worry-free and better.
>> name=$( git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
>> sed -n -e 's|^submodule\.\(.*\)\.path '"$re"'$|\1|p' )
>
> I trust you have tested this. But I wonder whether this leaves a stray
> newline in $re that gets in the way inside the sed expression...
I suspect the very original was written (or copied from something that
wrote) like this:
re=$(echo -n "$1" | sed -e '...')
and mechanically replaced to
re=$(printf '%s' "$1" | sed -e '...')
because "echo" is not quite portable.
But the original misunderstands the command substitution. The trailing LF
is removed by it, so as long as "$1" is a single line, $re will get a line
without the trailing LF _anyway_.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-07-13 20:01 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-11 13:09 [PATCH] Improve sed portability Chris Ridd
2008-06-11 14:04 ` Johannes Sixt
2008-06-11 15:29 ` Chris Ridd
2008-06-11 16:39 ` Jeff King
2008-06-12 7:46 ` Johannes Sixt
2008-06-12 8:29 ` Chris Ridd
2008-06-12 9:07 ` Jeff King
2008-07-13 20:00 ` Jakub Narebski
2008-06-12 8:33 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).