From: Thomas Rast <trast@student.ethz.ch>
To: <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>, <avarab@gmail.com>,
<jstpierre@mecheye.net>, "Shawn O. Pearce" <spearce@spearce.org>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2] Do not unquote + into ' ' in URLs
Date: Sat, 24 Jul 2010 16:49:04 +0200
Message-ID: <ed2d311355fca478f97b82f8d955494509d6b9de.1279982471.git.trast@student.ethz.ch>
In-Reply-To: <201007240104.25341.trast@student.ethz.ch>

Since 9d2e942 (decode file:// and ssh:// URLs, 2010-05-23) the URL
logic unquotes escaped URLs. For the %2B type of escape, this is
conformant with RFC 2396. However, it also unquotes + into a space
character, which is only appropriate for the query strings in HTTP.
This notably broke fetching from the gtk+ repository.

We cannot just remove the corresponding code since the same
url_decode_internal() is also used by the HTTP backend to decode query
parameters. Introduce a new argument that controls whether the +
decoding happens, and use it only in the (client-side) url_decode().

Reported-by: Jasper St. Pierre <jstpierre@mecheye.net>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
I wrote:
> Junio C Hamano wrote:
> >
> > http-backend.c::get_info_refs()
> > -> http-backend.c::get_parameter()
> > -> http-backend.c::get_parameters()
> > -> url.c::url_decode_parameter_value()
> > -> url.c::url_decode_internal()
>
> You're right, I forgot about those. I imagine it would be one of two
> cases:
[...]
> Shawn, can you help with this?

The third case, of course, is:

 * It only uses these functions for parameter decoding, which of course
   was correct to begin with.

So after hopefully drinking enough coffee, I made this one. The catch
is that I'm not entirely clear whether *not* decoding the +
client-side anywhere in the URL is correct for http:// URLs? If the
client decodes and re-encodes the URL, then the + would be turned into
a %2B on the re-encoding. Then again maybe UI-facing URLs should
never have a query part at all?
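
To make the intended behaviour concrete, here is a minimal standalone
sketch of a decoder with a decode_plus switch. It is not the code in
url.c: sketch_decode() and hexval() are made-up names, it uses plain
malloc() rather than strbuf, and it ignores the stop_at handling
entirely. It only illustrates the distinction the patch is after.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical stand-in for url_decode_internal(): decode %XX escapes,
 * and turn '+' into ' ' only when decode_plus is nonzero.  A malformed
 * escape is copied through unchanged.  Returns a malloc'd string that
 * the caller must free, or NULL if allocation fails.
 */
static char *sketch_decode(const char *in, int decode_plus)
{
	char *out, *p;

	assert(in != NULL);
	/* Decoding never grows the string, so strlen(in) + 1 suffices. */
	out = malloc(strlen(in) + 1);
	if (!out)
		return NULL;
	p = out;

	while (*in) {
		int hi, lo;

		if (in[0] == '%' &&
		    (hi = hexval(in[1])) >= 0 &&
		    (lo = hexval(in[2])) >= 0) {
			*p++ = (char)(hi * 16 + lo);
			in += 3;
		} else if (decode_plus && *in == '+') {
			*p++ = ' ';
			in++;
		} else {
			*p++ = *in++;
		}
	}
	*p = '\0';
	return out;
}
```

With decode_plus set to 0 (the url_decode() case), "x+y" comes back
unchanged and "x%2By" becomes "x+y". With decode_plus set to 1 (the
query parameter case), "x+y" becomes "x y".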
t/t5601-clone.sh | 10 ++++++++--
url.c | 11 ++++++-----
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 8abb71a..4431dfd 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -178,8 +178,14 @@ test_expect_success 'clone respects global branch.autosetuprebase' '
test_expect_success 'respect url-encoding of file://' '
git init x+y &&
- test_must_fail git clone "file://$PWD/x+y" xy-url &&
- git clone "file://$PWD/x%2By" xy-url
+ git clone "file://$PWD/x+y" xy-url-1 &&
+ git clone "file://$PWD/x%2By" xy-url-2
+'
+
+test_expect_success 'do not query-string-decode + in URLs' '
+ rm -rf x+y &&
+ git init "x y" &&
+ test_must_fail git clone "file://$PWD/x+y" xy-no-plus
'
test_expect_success 'do not respect url-encoding of non-url path' '
diff --git a/url.c b/url.c
index 2306236..cd8f74f 100644
--- a/url.c
+++ b/url.c
@@ -67,7 +67,8 @@ static int url_decode_char(const char *q)
return val;
}
-static char *url_decode_internal(const char **query, const char *stop_at, struct strbuf *out)
+static char *url_decode_internal(const char **query, const char *stop_at,
+ struct strbuf *out, int decode_plus)
{
const char *q = *query;
@@ -90,7 +91,7 @@ static char *url_decode_internal(const char **query, const char *stop_at, struct
}
}
- if (c == '+')
+ if (decode_plus && c == '+')
strbuf_addch(out, ' ');
else
strbuf_addch(out, c);
@@ -110,17 +111,17 @@ char *url_decode(const char *url)
strbuf_add(&out, url, colon - url);
url = colon;
}
- return url_decode_internal(&url, NULL, &out);
+ return url_decode_internal(&url, NULL, &out, 0);
}
char *url_decode_parameter_name(const char **query)
{
struct strbuf out = STRBUF_INIT;
- return url_decode_internal(query, "&=", &out);
+ return url_decode_internal(query, "&=", &out, 1);
}
char *url_decode_parameter_value(const char **query)
{
struct strbuf out = STRBUF_INIT;
- return url_decode_internal(query, "&", &out);
+ return url_decode_internal(query, "&", &out, 1);
}
--
1.7.2.278.g76edd.dirty