From: Eric Wong <normalperson@yhbt.net>
To: git discussion list <git@vger.kernel.org>
Subject: Re: [PATCH] translate bad characters in refnames during git-svn fetch
Date: Tue, 17 Jul 2007 05:28:52 -0700
Message-ID: <20070717122852.GA21372@mayonaise>
In-Reply-To: <20070716174731.GA4792@lapse.madduck.net>

martin f krafft <madduck@madduck.net> wrote:
> also sprach Eric Wong <normalperson@yhbt.net> [2007.07.16.0530 +0200]:
> > The major issue with this is that it doesn't handle odd cases
> > where a refname is sanitized into something (say "1234~2"
> > sanitizes to "1234=2"), and then another branch is created named
> > "1234=2".
> 
> Well, we can't please everyone, can we? :)
> 
> I like Jan's proposal about using the % escape, even though it
> doesn't make pretty branch names.

I like it, too.  How about something like the two functions below?  This
will break things a bit for people currently using % in refnames,
however.

I think this will work rather nicely once I've figured out how the path
globbing code works[1] and where to sanitize/desanitize the refnames
properly.

It would be far easier to take your approach and sanitize them only
for the command-line, but storing unsanitized git refnames into the
.git/config is something I want to avoid:

  Somebody naming directories on the SVN side with the path component
  ":refs/remotes" in them could screw things up for us.

# transform the refname as per rules in git-check-ref-format(1):
sub sanitize_ref_name {
	my ($refname) = @_;

	# It cannot end with a slash /, we'll throw up on this because
	# SVN can't have directories with a slash in their name, either:
	if ($refname =~ m{/$}) {
		die "ref: '$refname' ends with a trailing slash, this is ",
		    "not permitted by git nor Subversion\n";
	}

	# It cannot have ASCII control characters, space, tilde ~, caret ^,
	# colon :, question-mark ?, asterisk *, or open bracket [ anywhere
	#
	# Additionally, % must be escaped because it is used for escaping
	# and we want our escaped refname to be reversible
	# (note the character class: each bad character is escaped on its own)
	$refname =~ s{([ \%~\^:\?\*\[\t])}{uc sprintf('%%%02x',ord($1))}eg;

	# no slash-separated component can begin with a dot .
	# /.* becomes /%2E*
	$refname =~ s{/\.}{/%2E}g;
	# It cannot have two consecutive dots .. anywhere
	# .. becomes %2E%2E
	$refname =~ s{\.\.}{%2E%2E}g;

	$refname;
}

sub desanitize_ref_name {
	my ($refname) = @_;
	# turn each %XX escape back into its character (needs /e to run chr)
	$refname =~ s{%([0-9A-F]{2})}{chr hex($1)}eg;

	$refname;
}
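
A quick way to check that the escaping is reversible is a round trip over a
few names.  This is only a sketch: the two functions are repeated so the
snippet runs on its own, with the escape pattern written as a character class
and the unescape substitution using /e, and the sample names are made up for
illustration rather than taken from a real repository.

```perl
use strict;
use warnings;

# Self-contained copies of the two functions, for illustration only.
sub sanitize_ref_name {
	my ($refname) = @_;
	die "ref: '$refname' ends with a trailing slash\n"
		if $refname =~ m{/$};
	# escape % itself plus space, tab and the characters git forbids
	$refname =~ s{([ \%~\^:\?\*\[\t])}{sprintf('%%%02X', ord($1))}eg;
	# a component may not begin with a dot, and ".." may not appear
	$refname =~ s{/\.}{/%2E}g;
	$refname =~ s{\.\.}{%2E%2E}g;
	$refname;
}

sub desanitize_ref_name {
	my ($refname) = @_;
	$refname =~ s{%([0-9A-F]{2})}{chr hex($1)}eg;
	$refname;
}

# hypothetical SVN branch/tag names
for my $name ('1234~2', '1234=2', 'tags/.hidden', 'a..b', '100%', 'my branch') {
	my $safe = sanitize_ref_name($name);
	die "round trip failed for '$name'\n"
		unless desanitize_ref_name($safe) eq $name;
	print "$name => $safe\n";
}
```

With this scheme "1234~2" becomes "1234%7E2" while "1234=2" is left alone, so
the two names no longer collide, and a literal % is stored as %25 so that
unescaping never misreads it.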

> On the other hand, we could make the translation regexps
> configurable...

Hopefully not needed.  I fear it would just add to confusion.


[1] I don't remember writing the globbing code myself, maybe it was my
psychotic alter ego, but I'm having trouble following it at this time of
the night/morning.

-- 
Eric Wong