git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Ericsson <ae@op5.se>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2] fetch: Strip usernames from url's before storing them
Date: Wed, 15 Apr 2009 22:45:14 +0200	[thread overview]
Message-ID: <49E6475A.3090801@op5.se> (raw)
In-Reply-To: <7vbpqxvnpl.fsf@gitster.siamese.dyndns.org>

Thanks for the feedback. Many appreciated.

Junio C Hamano wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
>> +/*
>> + * Strip username information from the url and return it in a
>> + * newly allocated string which the caller has to free.
>> + *
>> + * The url's we want to catch are the following:
>> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
>> + *   [user@]host.xz:/path/to/repo.git/
>> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git
> 
> If this is a valid URL:
> 
> 	scheme://host.xz/path@with@at@sign.git/
> 
> we do not want to mistakenly trigger this logic.
> 

I'm guessing, and I'm slightly inebriated so don't yell too hard at me if
you think your mail will hit me in the morning ;-)

It *is* valid for "bare ssh" url's, but that would only trigger the fourth
(and last) case of the if() else if() thing which would return a strdup().
It won't have a scheme, and it won't have a colon after the @-sign.

An url with a scheme can't contain a colon *before* the at-sign if it does
contain a username, and it won't contain a colon *after* the @-sign if it
*does* contain a username. This is a guess, but I wrote that part of the
transport code initially, so I've got a decent inkling of what git accepts
in the first place (although it might not be strong enough in the face of
the various RFC's, I know).

> I do not know if rsync://me@there/path is supported, but we should
> generalize to support any scheme://me@there/path to keep the code simpler.
> You do not do anything special based on the URL scheme other than learning
> how long the scheme:// part is to copy it anyway.

I've never seen rsync://user@ style url's before. It's trivial to add support
for it if it's valid though, but for the reasons above, I'm not too fussed
about trying to support URL's we're extremely unlikely to see in real life.
The last else if() really does catch a lot.

>  Perhaps like...
> 
> char *transport_anonymize_url(const char *url)
> {
> 	char *anon_url, *scheme_prefix, *anon_part;
> 	size_t len, prefix_len = 0;
> 
> 	anon_part = strchr(url, '@');
> 	if (is_local(url) || !anon_part)
> 		goto literal_copy;
> 
> 	anon_part++;
> 	scheme_prefix = strstr(url, "://");
> 	if (scheme_prefix) {
> 		const char *cp;
> 		/* make sure scheme is reasonable */
> 		for (cp = url; cp < scheme_prefix; cp++) {
> 			switch (*cp) { /* RFC 1738 2.1 */
> 			case '+':
> 			case '.':
> 			case '-':
> 				break; /* ok */

Isn't %xx also a valid escape sequence for unknown chars in url's, according
to the aforementioned RFC?


> 			default:
> 				if (isalnum(*cp))
> 					break;
> 				/* it isn't */
> 				goto literal_copy;
> 			}
> 		}
> 		/* @ past the first slash does not count */
> 		cp = strchr(scheme_prefix + 3, '/');
> 		if (cp < anon_part)
> 			goto literal_copy;
> 		prefix_len = scheme_prefix - url + 3;
> 	}
> 	else if (!strchr(anon_part, ':'))
> 		/* cannot be "me@there:/path/name" */

Nice, but the comment is misleading to the simple (or drunk) mind.
It can't contain an @-sign at all due to the check above, which I'd
be happier if it mentions (me being both atm, I'm inclined to think
in rather straight lines).

Otherwise I really like it.

> 		goto literal_copy;
> 	len = prefix_len + strlen(anon_part);
> 	anon_url = xmalloc(len + 1);
> 	memcpy(anon_url, url, prefix_len);
> 	memcpy(anon_url + prefix_len, anon_part, strlen(anon_part));
> 	return anon_url;
>  literal_copy:
> 	return xstrdup(url);
> }


This looks sensible and fairly generic, and I'll happily defer to a
more capable and less drunk programmer any time of the day. I'll copy
it into anon-url.c tomorrow and see how it pans out with some of the
weirder URL's you gave today.

If anyone's got some tricky url's they want me to try, email me
before 08:00 GMT tomorrow and I'll give 'em a whirl with multiple
algo's.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

  reply	other threads:[~2009-04-15 20:46 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-15 12:16 [PATCH] fetch: Strip usernames from url's before storing them Andreas Ericsson
2009-04-15 12:30 ` Michael J Gruber
2009-04-15 14:01   ` Andreas Ericsson
2009-04-15 17:19     ` Junio C Hamano
2009-04-15 18:08       ` Andreas Ericsson
2009-04-15 13:18 ` Johannes Sixt
2009-04-15 14:14   ` Andreas Ericsson
2009-04-15 14:30     ` [PATCH v2] " Andreas Ericsson
2009-04-15 17:19       ` Junio C Hamano
2009-04-15 20:45         ` Andreas Ericsson [this message]
2009-04-17  8:20         ` [PATCH v3] " Andreas Ericsson
2009-04-20  7:39           ` Andreas Ericsson
2009-04-20  8:36             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49E6475A.3090801@op5.se \
    --to=ae@op5.se \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).