* git-svn url-encodes ~ when it should not
@ 2008-10-18 21:39 Björn Steinbrink
2008-10-18 22:47 ` Björn Steinbrink
0 siblings, 1 reply; 9+ messages in thread
From: Björn Steinbrink @ 2008-10-18 21:39 UTC (permalink / raw)
To: Eric Wong; +Cc: git
Hi,
Jose Carlos Garcia Sogo reported on #git that a git-svn clone of this
svn repo fails for him:
https://sucs.org/~welshbyte/svn/backuptool/trunk
I can reproduce that here with:
git-svn version 1.6.0.2.541.g46dc1.dirty (svn 1.5.1)
The error message I get is:
Apache got a malformed URI: Unusable URI: it does not refer to this
repository at /usr/local/libexec/git-core/git-svn line 4057
strace revealed that git-svn url-encodes ~ while svn does not do that.
For svn we have:
write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
<S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
While git-svn shows:
write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
<S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
Björn
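
A minimal Python sketch of the difference the strace output shows. It is an illustration only, not code from git-svn or svn, and the variable names are made up for this example. The two <S:src-path> values differ only in how '~' is spelled, and percent-decoding the git-svn form gives back the svn form:

```python
from urllib.parse import unquote

# Values copied from the two strace excerpts in the report.
svn_src_path = "https://sucs.org/~welshbyte/svn/backuptool/trunk"
git_svn_src_path = "https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk"

# Byte for byte the two strings are different...
same_bytes = svn_src_path == git_svn_src_path

# ...but percent-decoding the git-svn form recovers the svn form.
decoded = unquote(git_svn_src_path)

print(same_bytes)
print(decoded == svn_src_path)
```

A server that compares the path as an opaque string, without decoding it first, would treat the two as different resources, which is consistent with the "does not refer to this repository" error.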
* Re: git-svn url-encodes ~ when it should not
2008-10-18 21:39 git-svn url-encodes ~ when it should not Björn Steinbrink
@ 2008-10-18 22:47 ` Björn Steinbrink
2008-10-21 21:12 ` [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs Eric Wong
0 siblings, 1 reply; 9+ messages in thread
From: Björn Steinbrink @ 2008-10-18 22:47 UTC (permalink / raw)
To: Eric Wong; +Cc: git, jsogo
[Adding Jose to Cc:, didn't have his address earlier]
On 2008.10.18 23:39:19 +0200, Björn Steinbrink wrote:
> Hi,
>
> Jose Carlos Garcia Sogo reported on #git that a git-svn clone of this
> svn repo fails for him:
> https://sucs.org/~welshbyte/svn/backuptool/trunk
>
> I can reproduce that here with:
> git-svn version 1.6.0.2.541.g46dc1.dirty (svn 1.5.1)
>
> The error message I get is:
> Apache got a malformed URI: Unusable URI: it does not refer to this
> repository at /usr/local/libexec/git-core/git-svn line 4057
>
> strace revealed that git-svn url-encodes ~ while svn does not do that.
>
> For svn we have:
> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
>
> While git-svn shows:
> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
>
> Björn
* [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-18 22:47 ` Björn Steinbrink
@ 2008-10-21 21:12 ` Eric Wong
2008-10-21 21:53 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2008-10-21 21:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Björn Steinbrink, git, jsogo
Thanks to Jose Carlos Garcia Sogo and Björn Steinbrink for the
bug report.
On 2008.10.18 23:39:19 +0200, Björn Steinbrink wrote:
> Hi,
>
> Jose Carlos Garcia Sogo reported on #git that a git-svn clone of this
> svn repo fails for him:
> https://sucs.org/~welshbyte/svn/backuptool/trunk
>
> I can reproduce that here with:
> git-svn version 1.6.0.2.541.g46dc1.dirty (svn 1.5.1)
>
> The error message I get is:
> Apache got a malformed URI: Unusable URI: it does not refer to this
> repository at /usr/local/libexec/git-core/git-svn line 4057
>
> strace revealed that git-svn url-encodes ~ while svn does not do that.
>
> For svn we have:
> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
>
> While git-svn shows:
> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
git-svn.perl | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/git-svn.perl b/git-svn.perl
index ef6d773..a97049a 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -852,7 +852,7 @@ sub escape_uri_only {
my ($uri) = @_;
my @tmp;
foreach (split m{/}, $uri) {
- s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
push @tmp, $_;
}
join('/', @tmp);
@@ -3537,7 +3537,7 @@ sub repo_path {
sub url_path {
my ($self, $path) = @_;
if ($self->{url} =~ m#^https?://#) {
- $path =~ s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
+ $path =~ s/([^~a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
}
$self->{url} . '/' . $self->repo_path($path);
}
@@ -3890,7 +3890,7 @@ sub escape_uri_only {
my ($uri) = @_;
my @tmp;
foreach (split m{/}, $uri) {
- s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
push @tmp, $_;
}
join('/', @tmp);
--
Eric Wong
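
For readers who want to see the effect of the one-character change, here is a rough Python rendering of the escape_uri_only substitution before and after the patch. It is a sketch under stated assumptions: input is ASCII, the function is applied to the path part of the URL only, and the OLD/NEW names are invented for this example.

```python
import re

# Python renderings of the Perl character classes in the diff.
# re.ASCII keeps \w limited to [A-Za-z0-9_] (assumption: ASCII input).
OLD = re.compile(r"([^\w.%+-]|%(?![a-fA-F0-9]{2}))", re.ASCII)
NEW = re.compile(r"([^~\w.%+-]|%(?![a-fA-F0-9]{2}))", re.ASCII)


def escape_uri_only(uri, pattern):
    """Percent-escape each '/'-separated segment using the given pattern.

    Note: Perl's split drops trailing empty fields, Python's does not,
    so a trailing '/' survives here but not in the Perl original.
    """
    return "/".join(
        pattern.sub(lambda m: "%%%02X" % ord(m.group(1)), segment)
        for segment in uri.split("/")
    )


path = "/~welshbyte/svn/backuptool/trunk"
print(escape_uri_only(path, OLD))  # tilde is escaped
print(escape_uri_only(path, NEW))  # tilde is left alone
```

Existing %XX sequences are left untouched by both patterns; a bare '%' that is not followed by two hex digits is escaped.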
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-21 21:12 ` [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs Eric Wong
@ 2008-10-21 21:53 ` Junio C Hamano
2008-10-22 6:13 ` Mike Hommey
2008-10-22 8:16 ` Eric Wong
0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-10-21 21:53 UTC (permalink / raw)
To: Eric Wong; +Cc: Björn Steinbrink, git, jsogo
Eric Wong <normalperson@yhbt.net> writes:
>> strace revealed that git-svn url-encodes ~ while svn does not do that.
>>
>> For svn we have:
>> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
>> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
>>
>> While git-svn shows:
>> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
>> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
This looks like an XML based request sequence to me (and svn is talking
WebDAV here, right?); it makes me wonder what exact quoting rules are used
there. I would expect $path in <S:src-path>$path</S:src-path> to quote
letters in it, e.g. '<' as "&lt;" --- which is quite different from what
the s/// substitutions in the patch seem to be doing.
> diff --git a/git-svn.perl b/git-svn.perl
> index ef6d773..a97049a 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -852,7 +852,7 @@ sub escape_uri_only {
> - s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
> + s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
Admittedly I do not know git-svn (nor Perl svn bindings it uses), and I
suspect that some of the XML-level escaping is done in the libsvn side,
but it would be nice if somebody can at least verify that the code after
the patch works with repositories with funny characters in pathnames
(perhaps list all the printables including "<&>?*!@.+-%^"). Even nicer
would be a log message that says "the resulting code covers all cases
because it follows _that_ spec to escape _all_ problematic letters",
pointing at some svn (or libsvn-perl) resource.
The patch may make a path with '~' work, but neither the patch text nor
the commit log message gives readers enough confidence that the code
after the patch is the _final_ one, as opposed to
being just a band-aid for a single symptom that happened to have been
discovered this time.
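
As a starting point for that kind of check, the sketch below feeds the list of printables from this message through a Python rendering of the patched pattern and reports what gets percent-escaped. It only probes the regex; it does not talk to a server and says nothing about what libsvn expects. The names PATCHED, percent_escape and FUNNY are made up for this example.

```python
import re

# Python rendering of the patched escape_uri_only pattern
# (assumption: ASCII input, so re.ASCII matches Perl's byte-level \w).
PATCHED = re.compile(r"([^~\w.%+-]|%(?![a-fA-F0-9]{2}))", re.ASCII)


def percent_escape(segment):
    """Apply the patched substitution to one path segment."""
    return PATCHED.sub(lambda m: "%%%02X" % ord(m.group(1)), segment)


FUNNY = "<&>?*!@.+-%^"  # the printables listed in the message

escaped = percent_escape(FUNNY)
untouched = [c for c in FUNNY if percent_escape(c) == c]

print(escaped)
print("left as-is:", untouched)
```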
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-21 21:53 ` Junio C Hamano
@ 2008-10-22 6:13 ` Mike Hommey
2008-10-22 6:24 ` Junio C Hamano
2008-10-22 8:16 ` Eric Wong
1 sibling, 1 reply; 9+ messages in thread
From: Mike Hommey @ 2008-10-22 6:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Eric Wong, Björn Steinbrink, git, jsogo
On Tue, Oct 21, 2008 at 02:53:28PM -0700, Junio C Hamano wrote:
> Eric Wong <normalperson@yhbt.net> writes:
>
> >> strace revealed that git-svn url-encodes ~ while svn does not do that.
> >>
> >> For svn we have:
> >> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> >> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
> >>
> >> While git-svn shows:
> >> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> >> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
>
> This looks like an XML based request sequence to me (and svn is talking
> WebDAV here, right?);
XML based would be &#126;, not %7E.
Anyways, aren't there ready-to-use url quoting functions in perl ?
Mike
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-22 6:13 ` Mike Hommey
@ 2008-10-22 6:24 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2008-10-22 6:24 UTC (permalink / raw)
To: Mike Hommey; +Cc: Eric Wong, Björn Steinbrink, git, jsogo
Mike Hommey <mh@glandium.org> writes:
> On Tue, Oct 21, 2008 at 02:53:28PM -0700, Junio C Hamano wrote:
>
>> >> For svn we have:
>> >> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
>> >> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
>> >>
>> >> While git-svn shows:
>> >> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
>> >> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
>>
>> This looks like an XML based request sequence to me (and svn is talking
>> WebDAV here, right?);
>
> XML based would be &#126;, not %7E.
Read what you quoted again and realize you are agreeing with me ;-).
The former (with "~") is what svn expects, the latter (with %7E) is what
git-svn incorrectly threw at the server causing problems. I am wondering
about the whole %xx escaping thing, which does not seem to match what svn seems
to expect.
> Anyways, aren't there ready-to-use url quoting functions in perl ?
My question is not about correct "url quoting", but if using "url quoting"
is correct in this context.
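
The two layers being discussed can be told apart with a short Python sketch using only the standard library. It illustrates XML escaping versus URL percent-encoding in general; the element name "src-path" (without the S: namespace prefix) is a simplification for this example, and nothing here reflects what libsvn does internally.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

url = "https://sucs.org/~welshbyte/svn/backuptool/trunk"

# XML escaping only rewrites &, < and >; '~' passes through unchanged.
xml_escaped_url = escape(url)
lt_amp = escape("a<b&c")

# A numeric character reference is the XML-level spelling of '~';
# an XML parser turns it back into the literal character.
doc = ET.fromstring("<src-path>https://sucs.org/&#126;welshbyte</src-path>")
xml_decoded = doc.text

# %7E is ordinary text as far as XML is concerned, so it reaches the
# receiving application still percent-encoded.
doc2 = ET.fromstring("<src-path>https://sucs.org/%7Ewelshbyte</src-path>")
still_encoded = doc2.text

print(xml_escaped_url == url)
print(lt_amp)
print(xml_decoded)
print(still_encoded)
```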
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-21 21:53 ` Junio C Hamano
2008-10-22 6:13 ` Mike Hommey
@ 2008-10-22 8:16 ` Eric Wong
2008-10-22 18:53 ` Junio C Hamano
1 sibling, 1 reply; 9+ messages in thread
From: Eric Wong @ 2008-10-22 8:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Björn Steinbrink, git, jsogo
Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
>
> >> strace revealed that git-svn url-encodes ~ while svn does not do that.
> >>
> >> For svn we have:
> >> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> >> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
> >>
> >> While git-svn shows:
> >> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> >> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
>
> This looks like an XML based request sequence to me (and svn is talking
> WebDAV here, right?); it makes me wonder what exact quoting rules are used
> there. I would expect $path in <S:src-path>$path</S:src-path> to quote
> letters in it, e.g. '<' as "&lt;" --- which is quite different from what
> the s/// substitutions in the patch seem to be doing.
I agree. I haven't checked if the SVN libraries do proper XML escaping
for us (but the problem hasn't shown up yet :). I was already
completely disappointed that git-svn had to do its own escaping when
transmitting data using the SVN libraries (and dependent on the protocol
being used, too!).
> > diff --git a/git-svn.perl b/git-svn.perl
> > index ef6d773..a97049a 100755
> > --- a/git-svn.perl
> > +++ b/git-svn.perl
> > @@ -852,7 +852,7 @@ sub escape_uri_only {
> > - s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
> > + s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
>
> Admittedly I do not know git-svn (nor Perl svn bindings it uses), and I
> suspect that some of the XML-level escaping is done in the libsvn side,
> but it would be nice if somebody can at least verify that the code after
> the patch works with repositories with funny characters in pathnames
> (perhaps list all the printables including "<&>?*!@.+-%^"). Even nicer
> would be a log message that says "the resulting code covers all cases
> because it follows _that_ spec to escape _all_ problematic letters",
> pointing at some svn (or libsvn-perl) resource.
Help with looking at what SVN does and writing testcases would
definitely be appreciated on this matter. Or perhaps this can be done
at GitTogether :)
> The patch may make a path with '~' work, but neither the patch text nor
> the commit log message gives readers enough confidence that the code
> after the patch is the _final_ one, as opposed to
> being just a band-aid for a single symptom that happened to have been
> discovered this time.
This is definitely a band-aid fix until I or somebody else takes the
time to figure out:
1. exactly which characters need to be escaped
2. for which protocols those characters need to be escaped
3. which part(s) of the URI they need to be escaped for
(repository root vs SVN path)
4. which versions of SVN need more (or fewer) escaping rules
(I vote for somebody else, especially for #4 :)
--
Eric Wong
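
One way to make the first three questions concrete is to enumerate them as a test matrix. The Python sketch below only generates the cases; it does not run svn or git-svn. The scheme list and the choice of string.punctuation as the set of "funny" characters are assumptions made for this example, and question 4 (SVN versions) is left out.

```python
from itertools import product
import string

# Assumption: URL schemes commonly accepted by svn.
SCHEMES = ["file", "http", "https", "svn", "svn+ssh"]

# Question 3: which part of the URI the character appears in.
PARTS = ["repository root", "SVN path"]

# Assumption: ASCII punctuation as candidate characters, minus the
# path separator itself.
FUNNY = [c for c in string.punctuation if c != "/"]

# One case per (scheme, URI part, character) combination.
cases = [
    {"scheme": scheme, "part": part, "char": char}
    for scheme, part, char in product(SCHEMES, PARTS, FUNNY)
]

print(len(cases), "combinations to check")
print(cases[0])
```

Each case would still need an expected wire form, taken from observing what svn itself sends for the same path.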
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-22 8:16 ` Eric Wong
@ 2008-10-22 18:53 ` Junio C Hamano
2008-10-24 9:13 ` Eric Wong
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2008-10-22 18:53 UTC (permalink / raw)
To: Eric Wong; +Cc: Björn Steinbrink, git, jsogo
Eric Wong <normalperson@yhbt.net> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
>
> Help with looking at what SVN does and writing testcases would
> definitely be appreciated on this matter. Or perhaps this can be done
> at GitTogether :)
I'm not sure it would be a good use of time at GitTogether to do something
whose spec is pretty much self-evident (essentially for this one it boils
down to "define what are the 'funny' bytes, and list the protocols
supported by svn, and come up with paths with funny bytes in it and
see what libsvn-perl gives to the underlying svn library, and what the svn
library does over the wire"). Ongoing "fix start-up sequence around
worktree area" might be a better fit; I dunno.
>> The patch may make a path with '~' work, but neither the patch text nor
>> the commit log message gives readers enough confidence that the code
>> after the patch is the _final_ one, as opposed to
>> being just a band-aid for a single symptom that happened to have been
>> discovered this time.
>
> This is definitely a band-aid fix until I or somebody else takes the
> time to figure out:
>
> 1. exactly which characters need to be escaped
> 2. for which protocols those characters need to be escaped
> 3. which part(s) of the URI they need to be escaped for
> (repository root vs SVN path)
> 4. which versions of SVN need more (or fewer) escaping rules
>
> (I vote for somebody else, especially for #4 :)
Item 3. above disturbs me. Do you mean that in:
https://sucs.org/~welshbyte/svn/backuptool/trunk/foo~bar.txt
the two tildes might have to be sent to libsvn-perl differently?
Even if that is the case, I am inclined to suggest taking the patch in the
meantime as an interim workaround, with the understanding that we know the
patch improves the situation for the tilde before welshbyte and even
though we do not know if the patch regresses for the latter one between
foo and bar, it would be much rarer to have tilde in such places.
Care to come up with an updated log message?
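
To illustrate why the two tildes can end up on different code paths, here is a Python sketch modelled on the url_path hunk of the patch. It is a simplification: repo_path is treated as returning the path unchanged, which is an assumption, and the repository URL is passed through as-is because url_path itself never touches it. The keep_tilde parameter is invented for this example to switch between the pre-patch and post-patch character class.

```python
import re


def url_path(url, path, keep_tilde=True):
    """Sketch of git-svn's url_path: escape only `path`, only for http(s)."""
    if re.match(r"https?://", url):
        # Post-patch class keeps '~'; pre-patch class escapes it.
        klass = r"[^~a-zA-Z0-9_.-]" if keep_tilde else r"[^a-zA-Z0-9_.-]"
        path = re.sub(
            klass, lambda m: ("%%%02x" % ord(m.group(0))).upper(), path
        )
    # Assumption: repo_path(path) is the identity for a non-empty path.
    return url + "/" + path


repo = "https://sucs.org/~welshbyte/svn/backuptool/trunk"
print(url_path(repo, "foo~bar.txt", keep_tilde=False))
print(url_path(repo, "foo~bar.txt", keep_tilde=True))
```

The first tilde (in the repository URL) is never rewritten by this function; only the second one (in the file name) depends on the character class.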
* Re: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
2008-10-22 18:53 ` Junio C Hamano
@ 2008-10-24 9:13 ` Eric Wong
0 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2008-10-24 9:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Björn Steinbrink, git, jsogo
Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> > Junio C Hamano <gitster@pobox.com> wrote:
> >> The patch may make a path with '~' work, but neither the patch text nor
> >> the commit log message gives readers enough confidence that the code
> >> after the patch is the _final_ one, as opposed to
> >> being just a band-aid for a single symptom that happened to have been
> >> discovered this time.
> >
> > This is definitely a band-aid fix until I or somebody else takes the
> > time to figure out:
> >
> > 1. exactly which characters need to be escaped
> > 2. for which protocols those characters need to be escaped
> > 3. which part(s) of the URI they need to be escaped for
> > (repository root vs SVN path)
> > 4. which versions of SVN need more (or fewer) escaping rules
> >
> > (I vote for somebody else, especially for #4 :)
>
> Item 3. above disturbs me. Do you mean that in:
>
> https://sucs.org/~welshbyte/svn/backuptool/trunk/foo~bar.txt
>
> the two tildes might have to be sent to libsvn-perl differently?
Yes, something like this is unfortunately a possibility (as is
having to worry about this at all in git-svn).
> Even if that is the case, I am inclined to suggest taking the patch in the
> meantime as an interim workaround, with the understanding that we know the
> patch improves the situation for the tilde before welshbyte and even
> though we do not know if the patch regresses for the latter one between
> foo and bar, it would be much rarer to have tilde in such places.
>
> Care to come up with an updated log message?
From aa4f2cdcf64934e13886fabb3b5e986a5cda79f6 Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson@yhbt.net>
Date: Tue, 21 Oct 2008 14:12:15 -0700
Subject: [PATCH] git-svn: don't escape tilde ('~') for http(s) URLs
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
This change only fixes the tilde case in the repository URL. A
more comprehensive set of escaping rules and tests will be
needed in the future for complete compatibility when using
uncommon characters.
Thanks to Jose Carlos Garcia Sogo and Björn Steinbrink for the
bug report.
On 2008.10.18 23:39:19 +0200, Björn Steinbrink wrote:
> Hi,
>
> Jose Carlos Garcia Sogo reported on #git that a git-svn clone of this
> svn repo fails for him:
> https://sucs.org/~welshbyte/svn/backuptool/trunk
>
> I can reproduce that here with:
> git-svn version 1.6.0.2.541.g46dc1.dirty (svn 1.5.1)
>
> The error message I get is:
> Apache got a malformed URI: Unusable URI: it does not refer to this
> repository at /usr/local/libexec/git-core/git-svn line 4057
>
> strace revealed that git-svn url-encodes ~ while svn does not do that.
>
> For svn we have:
> write(5, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/~welshbyte/svn/backuptool/trunk</S:src-path>...
>
> While git-svn shows:
> write(7, "<S:update-report send-all=\"true\" xmlns:S=\"svn:\">
> <S:src-path>https://sucs.org/%7Ewelshbyte/svn/backuptool/trunk</S:src-path>...
Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
git-svn.perl | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/git-svn.perl b/git-svn.perl
index ef6d773..a97049a 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -852,7 +852,7 @@ sub escape_uri_only {
my ($uri) = @_;
my @tmp;
foreach (split m{/}, $uri) {
- s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
push @tmp, $_;
}
join('/', @tmp);
@@ -3537,7 +3537,7 @@ sub repo_path {
sub url_path {
my ($self, $path) = @_;
if ($self->{url} =~ m#^https?://#) {
- $path =~ s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
+ $path =~ s/([^~a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
}
$self->{url} . '/' . $self->repo_path($path);
}
@@ -3890,7 +3890,7 @@ sub escape_uri_only {
my ($uri) = @_;
my @tmp;
foreach (split m{/}, $uri) {
- s/([^\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
push @tmp, $_;
}
join('/', @tmp);
--
Eric Wong