* git-svn sucks when it should not
From: Johannes Schindelin @ 2008-07-07  0:00 UTC
  To: Eric Wong; +Cc: git

Hi Eric,

I have the pleasure of needing to work with a subversion project where 
parts of the webserver are password restricted.

In particular, I cannot access the parent directory, and one of 
the branches is protected, too.

Maybe you remember me describing that problem on IRC a few weeks ago: yes, 
it is still persistent.

Now, I thought that I knew my way around Perl, at least a little bit, but 
while git-svn barfed on the repository, I... uhm, well, you probably get 
the idea.

The funny part is this: when I say "git svn clone $URL/trunk", or the same 
with the absolute paths to the single tags, instead of "git svn clone -s 
$URL", git-svn does the correct thing.  It works, importing the stuff as 
"git-svn".

So I tried to just edit out by hand the branches section, so that the 
password-protected branch would not be a problem.

The result was surprising: git svn fetch exited with success, but it 
did... absolutely nothing.

After a lot of frustrating hours, which were not at all helped by 
brilliant variable names such as "r" and "gsv", I now know this: the log 
contains paths that do not have a prefix "trunk", but "<dir>/trunk", 
where "<dir>" is the last directory of the URL.

Changing git-svn's URL to the parent of <dir> is a no-go, since that is -- 
as I mentioned above -- password protected.

Yes, in a perfect world I could just force the admin to change that, but 
no, this is not a perfect world, so do not even try to suggest that if 
you want to help.

Changing the fetch line to "<dir>/trunk:refs/remotes/trunk" does not work 
either, since git-svn cleverly checks $URL/<dir>/<dir>/trunk/.
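
To illustrate why the path ends up doubled, here is a minimal shell sketch. It assumes the remote location is formed by joining the configured URL with the left-hand side of the fetch line, which matches the behaviour reported above but is not taken from git-svn's Perl source. The helper name remote_url_for and the example.org URL are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not git-svn code: join the configured URL with the
# left-hand (remote) side of a fetch line "remote/path:refs/...".
remote_url_for () {
	url=${1%/}              # drop one trailing slash, if any
	remote_path=${2%%:*}    # keep everything before the first colon
	printf '%s/%s\n' "$url" "$remote_path"
}

# The URL already ends in the last directory ("proj" here), so repeating
# it on the fetch line doubles it.
remote_url_for "https://svn.example.org/repo/proj" "proj/trunk:refs/remotes/trunk"
remote_url_for "https://svn.example.org/repo/proj" "trunk:refs/remotes/trunk"
```

Under that assumption the first call prints a URL ending in proj/proj/trunk, while the second prints the one ending in proj/trunk.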

I then tried to hack match_globs() and match_paths() to add that extra 
prefix to the patterns, so that that extra prefix + trunk would be 
matched and edited out.  This happened to work out alright.
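
The idea of matching the extra prefix and editing it out can be sketched in shell as follows. This is only an illustration of the path rewriting, not the actual change to match_globs() or match_paths(); the function name strip_log_prefix is hypothetical, and paths are assumed to carry no leading slash.

```shell
#!/bin/sh
# Hypothetical helper: remove "<dir>/" from the front of a log path when
# it is present, and leave every other path untouched.
strip_log_prefix () {
	path=$1
	prefix=$2
	case "$path" in
	"$prefix"/*)
		rest=${path#"$prefix"/}
		printf '%s\n' "$rest"
		;;
	*)
		printf '%s\n' "$path"
		;;
	esac
}

strip_log_prefix "proj/trunk/Makefile" "proj"
strip_log_prefix "trunk/Makefile" "proj"
```

Both calls print trunk/Makefile, so a pattern written against "trunk" would match the rewritten path in either case.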

But I tried for several hours to get in a proper solution which does not 
throw up on the tags, and I have to conclude that this piece of code is 
not hackable by anybody else but you.

So I stand defeated by your program.  Thank you.

My ugly, ugly workaround that is however easy, easy, is a shell script 
that uses curl to find out what refs are new, and clones each ref 
individually, then pushes all the results together into one repository.

Should not have been _that_ hard,
Dscho


* Re: git-svn sucks when it should not
From: Eric Wong @ 2008-07-07  9:44 UTC
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi Eric,

Hi Johannes,

> I have the pleasure of needing to work with a subversion project where 
> parts of the webserver are password restricted.
> 
> In particular, I cannot access the parent directory, and one of 
> the branches is protected, too.
> 
> Maybe you remember me describing that problem on IRC a few weeks ago: yes, 
> it is still persistent.
> 
> Now, I thought that I knew my way around Perl, at least a little bit, but 
> while git-svn barfed on the repository, I... uhm, well, you probably get 
> the idea.
> 
> The funny part is this: when I say "git svn clone $URL/trunk", or the same 
> with the absolute paths to the single tags, instead of "git svn clone -s 
> $URL", git-svn does the correct thing.  It works, importing the stuff as 
> "git-svn".

Noted, -s/--stdlayout doesn't work well with access-restricted
repositories.  I remember leaving the minimize_url() stuff out of the
non--stdlayout code path so git-svn could at least have a way of working
with restricted repositories.

> So I tried to just edit out by hand the branches section, so that the 
> password-protected branch would not be a problem.
> 
> The result was surprising: git svn fetch exited with success, but it 
> did... absolutely nothing.
> 
> After a lot of frustrating hours, which were not at all helped by 
> brilliant variable names such as "r" and "gsv", I now know this: the log 
> contains paths that do not have a prefix "trunk", but "<dir>/trunk", 
> where "<dir>" is the last directory of the URL.

$r should always be a revision number ('r' is pretty common in the svn
command-line client in referring to revision numbers).  $gsv is just
a $gs vector (like 'argv') with multiple Git::SVN objects.

On the other hand, your comments about some of the other code being
gross are completely valid...

> Changing git-svn's URL to the parent of <dir> is a no-go, since that is -- 
> as I mentioned above -- password protected.
> 
> Yes, in a perfect world I could just force the admin to change that, but 
> no, this is not a perfect world, so do not even try to suggest that if 
> you want to help.

In a perfect world, everybody would be using git and we wouldn't have to
deal with SVN :)

I highly doubt I'd ask you to get an admin to give you access, so please
don't suggest that's something I would ask of you.  I always try to stay
as under-the-radar as possible as far as dealing with admins goes and
avoid ever mentioning git-svn to them.

Closed repositories suck, yes, and I've had the fortune (and misfortune
to some users such as yourself) to have never had to work with anything
less than full read access.

> Changing the fetch line to "<dir>/trunk:refs/remotes/trunk" does not work 
> either, since git-svn cleverly checks $URL/<dir>/<dir>/trunk/.
> 
> I then tried to hack match_globs() and match_paths() to add that extra 
> prefix to the patterns, so that that extra prefix + trunk would be 
> matched and edited out.  This happened to work out alright.
> 
> But I tried for several hours to get in a proper solution which does not 
> throw up on the tags, and I have to conclude that this piece of code is 
> not hackable by anybody else but you.

I've actually been afraid to touch the globbing/paths stuff myself.  I
got it working for common repository types ~18 months ago and have been
afraid to look at it ever since, so I definitely feel your pain.

git-svn works alright for most repositories people come across, but
yes, it has trouble with less-common, restricted ones.

Does it do some brain-damaged things?  Yes, and those things should get
fixed.  But nowadays I'm much busier with other projects and working
with SVN/SVN bindings was a traumatic experience for me, and (possibly
as a result) the git-svn code scares me, too.

A first step to making the git-svn code more accessible would be
to split out the individual packages into smaller, more manageable
files.  The second would be documenting more of the internal data
structures that it uses (including the ones SVN returns).

Rewriting the paths/globbing code from scratch would probably be
the next step...

> So I stand defeated by your program.  Thank you.
> 
> My ugly, ugly workaround that is however easy, easy, is a shell script 
> that uses curl to find out what refs are new, and clones each ref 
> individually, then pushes all the results together into one repository.
> 
> Should not have been _that_ hard,
> Dscho

It might be helpful to publish your script so other people can see/use it.

dcommit, multi-init/fetch (precursor to clone --stdlayout) were all
prototyped as shell scripts using git-svn commands before they were
integrated into the main program.

-- 
Eric Wong


* Re: git-svn sucks when it should not
From: Johannes Schindelin @ 2008-07-07 11:49 UTC
  To: Eric Wong; +Cc: git

Hi,

On Mon, 7 Jul 2008, Eric Wong wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>
> > [...] a shell script that uses curl to find out what refs are new, and 
> > clones each ref individually, then pushes all the results together 
> > into one repository.
> 
> It might be helpful to publish your script so other people can see/use 
> it.

-- snipsnap --
#!/bin/sh

# This script looks for the config variable svn-manual.url, and if it
# is set, will traverse the url and its subdirectories with curl, and
# install different svn-remotes for all found refs.
#
# It heavily relies on curl being able to screen-scrape the directories,
# in other words, it wants an HTTP server on the other side that has
# directory listings enabled.
#
# The quick and dirty heuristics to find out what makes a ref is that
# a ref's subdirectory contains files, while a subdirectory containing
# only subdirectories is supposed to contain refs (or subdirectories
# of refs).

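# Print every argument that does not end in a slash, i.e. every entry
# that looks like a file rather than a subdirectory.
list_contains_files () {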
	while test $# -gt 0
	do
		case "$1" in
		*/) ;;
		*) echo "$1";;
		esac
		shift
	done
}

svn_manually_fetch_one_dir () {
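	# Scrape the href targets from the directory listing, dropping the
	# parent link, absolute http(s) URLs, and absolute paths.
	contents="$(curl --silent -k "$1"/ |
		sed -n "s/.*a href=\"\([^\"]*\).*/\1/p" |
		grep -ve '^\.\./$' -e '^https\{0,1\}:' -e '^/')"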
	test -z "$contents" && return
	test -z "$(list_contains_files $contents)" || {
		test -z "$(git config svn-remote."$2".url)" && {
			git config svn-remote."$2".url "$1" &&
			git config svn-remote."$2".fetch :"$2" ||
			return
		}
		git svn fetch -R "$2"
		return
	}
	for dir in $contents
	do
		dir=${dir%%/}
		svn_manually_fetch_one_dir "$1/$dir" "$2/$dir" || break
	done
}

svn_fetch_semi_manually () {
	url="$(git config svn-manual.url)"
	test -z "$url" && return 1
	svn_manually_fetch_one_dir "$url" refs/remotes
}

svn_fetch_semi_manually || git svn fetch
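
The scraping step can be tried offline on a saved listing. The sketch below wraps a pipeline equivalent to the one in svn_manually_fetch_one_dir in a function that reads HTML from standard input. The name extract_links and the sample listing are made up for illustration, and the filter here drops both http: and https: absolute links.

```shell
#!/bin/sh
# Hypothetical wrapper around the scraping pipeline: read a directory
# listing (HTML) on stdin and print the relative link targets.
# Note: the leading ".*" in the sed expression is greedy, so on a line
# with several links only the last href is kept.
extract_links () {
	sed -n 's/.*a href="\([^"]*\).*/\1/p' |
	grep -v -e '^\.\./$' -e '^https\{0,1\}:' -e '^/'
}

# A made-up listing with one link per line.
sample='<li><a href="../">..</a></li>
<li><a href="trunk/">trunk/</a></li>
<li><a href="README">README</a></li>
<li><a href="https://example.org/">elsewhere</a></li>
<li><a href="/icons/">icons</a></li>'

printf '%s\n' "$sample" | extract_links
```

For the sample above only trunk/ and README survive the filter. Because README does not end in a slash, the script's heuristic would treat this directory as a ref.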


* Re: git-svn sucks when it should not
From: Avery Pennarun @ 2008-07-07 16:29 UTC
  To: Johannes Schindelin; +Cc: Eric Wong, git

On 7/7/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>  # It heavily relies on curl being able to screen-scrape the directories,
>  # in other words, it wants an HTTP on the other side that has directory
>  # listings enabled.

I wrote a similar script myself, although it makes assumptions about
the meaning of /branches and /tags rather than using the dirs vs.
files trick.

Rather than relying on screen scraping with curl, you might prefer to
use "svn ls" instead, since it'll work with any svn-compatible
repository type and (I presume) doesn't require special web server
settings.
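
A rough sketch of how the dirs vs. files heuristic might look when fed from "svn ls", which prints one entry per line and marks directories with a trailing slash. The function name classify_listing and its three output words are invented for this example. The function only reads standard input, so it can be tried without a Subversion client. With one installed, the intended use would be something like: svn ls "$url" | classify_listing

```shell
#!/bin/sh
# Hypothetical helper: read a listing on stdin, one entry per line.
# Prints "ref" if any entry is a file, "container" if every entry is a
# directory (trailing slash), and "empty" if there are no entries.
classify_listing () {
	saw_entry=no
	while IFS= read -r entry
	do
		test -n "$entry" || continue
		saw_entry=yes
		case "$entry" in
		*/) ;;                      # directory: keep looking
		*) echo ref; return 0 ;;    # first file found decides it
		esac
	done
	if test "$saw_entry" = yes
	then
		echo container
	else
		echo empty
	fi
}

printf 'branches/\ntags/\ntrunk/\n' | classify_listing
printf 'Makefile\nsrc/\n' | classify_listing
```

The first call prints container and the second prints ref.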

Have fun,

Avery


* Re: git-svn sucks when it should not
From: Johannes Schindelin @ 2008-07-07 17:18 UTC
  To: Avery Pennarun; +Cc: Eric Wong, git

Hi,

On Mon, 7 Jul 2008, Avery Pennarun wrote:

> On 7/7/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> >  # It heavily relies on curl being able to screen-scrape the 
> >  # directories, in other words, it wants an HTTP on the other side 
> >  # that has directory listings enabled.
> 
> I wrote a similar script myself, although it makes assumptions about the 
> meaning of /branches and /tags rather than using the dirs vs. files 
> trick.
> 
> Rather than relying on screen scraping with curl, you might prefer to 
> use "svn ls" instead, since it'll work with any svn-compatible 
> repository type and (I presume) doesn't require special web server 
> settings.

Good to know!  At least if you want to use svn more often ;-)

Thanks,
Dscho
