From: Yaroslav Halchenko <yoh@onerussian.com>
To: Git Gurus hangout <git@vger.kernel.org>
Cc: Benjamin Poldrack <benjaminpoldrack@gmail.com>,
Joey Hess <id@joeyh.name>, Jens Lehmann <Jens.Lehmann@web.de>
Subject: Re: problems serving non-bare repos with submodules over http
Date: Wed, 20 Apr 2016 15:45:33 -0400
Message-ID: <20160420194533.GO23764@onerussian.com>
In-Reply-To: <CAGZ79kYS-F1yKpNP7jmhTiZT1R_pucUBBTCbmHKZz6Xd6dy8EA@mail.gmail.com>
On Wed, 20 Apr 2016, Stefan Beller wrote:
> > I do realize that the situation is quite uncommon, partially I guess due
> > to git submodules mechanism flexibility and power on one hand and
> > under-use (imho) on the other, which leads to discovery of regressions
> > [e.g. 1] and corner cases as mine.
> Thanks for fixing the under-use and reporting bugs. :)
I am thrilled to help ;)
> > [1] http://thread.gmane.org/gmane.comp.version-control.git/288064
> > [2] http://www.onerussian.com/tmp/git-web-submodules.sh
> > My use case: We are trying to serve a git repository with submodules
> > specified with relative paths over http from a simple web server. With a demo
> > case and submodule specification [complete script to reproduce including the
> > webserver using python is at 2] such as
> > (git)hopa:/tmp/gitxxmsxYFO[master]git
> > $> tree
> > .
> > ├── f1
> > └── sub1
> >     └── f2
> > $> cat .gitmodules
> > [submodule "sub1"]
> >     path = sub1
> >     url = ./sub1
> > 1. After cloning
> > git clone http://localhost:8080/.git
> > I cannot 'submodule update' the sub1 in the clone since its url after
> > 'submodule init' would be http://localhost:8080/.git/sub1 . If I manually fix
> > it up -- it seems to proceed normally since in original repository I have
> > sub1/.git/ directory and not the "gitlink" for that submodule.
> So the expected URL would be http://localhost:8080/sub1/.git ?
ATM, yes
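To illustrate why 'submodule init' ends up with
http://localhost:8080/.git/sub1, here is a minimal Python sketch of resolving
a relative submodule url against the URL the parent was cloned from. It is a
simplified model of the behavior observed above, not git's actual code, and
the function name resolve_relative_url is made up for illustration.

```python
def resolve_relative_url(remote_url, sub_url):
    """Resolve a './' or '../' submodule url against remote_url.

    Hypothetical helper, simplified model only (not git's implementation).
    """
    base = remote_url.rstrip("/")
    rest = sub_url
    while True:
        if rest.startswith("./"):
            # './' keeps the current base and just drops the prefix
            rest = rest[2:]
        elif rest.startswith("../"):
            # '../' drops the last path component of the base
            rest = rest[3:]
            base = base.rsplit("/", 1)[0]
        else:
            break
    return base + "/" + rest


# The case from this thread: the parent was cloned from .../.git
print(resolve_relative_url("http://localhost:8080/.git", "./sub1"))
# In this model, a '../' url climbs out of '/.git' first
print(resolve_relative_url("http://localhost:8080/.git", "../sub1/.git"))
```

In this model "./sub1" is appended to the full parent URL, including the
trailing "/.git" component, which is how the unusable URL above comes about.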
> I thought you could leave out the .git suffix, i.e. you can type
> git clone http://localhost:8080
> and Git will recognize the missing .git and try that as well. The relative URL
> would then be constructed as http://localhost:8080/sub1, which will use the
> same mechanism to find the missing .git ending.
[note1] Unfortunately that is not the case ATM (git version
2.8.1.369.geae769a; the output is interspersed with the log from Python's
simple http server):
$> git clone http://localhost:8080 xxx
Cloning into 'xxx'...
127.0.0.1 - - [20/Apr/2016 15:01:25] code 404, message File not found
127.0.0.1 - - [20/Apr/2016 15:01:25] "GET /info/refs?service=git-upload-pack HTTP/1.1" 404 -
fatal: repository 'http://localhost:8080/' not found
> > 2. If I serve the clone [2 demos that too] itself, there is no easy remedy at
> > all since sub1/.git is not a directory but a gitlink.
> Not sure I understand the second question.
If I serve via http a repository where sub1/.git is a "gitlink":
(git)hopa:/tmp/gitxxmsxYFO_[master]
$> cat sub1/.git
gitdir: ../.git/modules/sub1
Such a repository cannot be cloned:
(git)hopa:/tmp/gitxxmsxYFO_[master]git
$> git clone http://localhost:8080/sub1 /tmp/xxx
Cloning into '/tmp/xxx'...
127.0.0.1 - - [20/Apr/2016 15:04:01] code 404, message File not found
127.0.0.1 - - [20/Apr/2016 15:04:01] "GET /sub1/info/refs?service=git-upload-pack HTTP/1.1" 404 -
fatal: repository 'http://localhost:8080/sub1/' not found
$> git clone http://localhost:8080/sub1/.git /tmp/xxx
Cloning into '/tmp/xxx'...
127.0.0.1 - - [20/Apr/2016 15:04:06] code 404, message File not found
127.0.0.1 - - [20/Apr/2016 15:04:06] "GET /sub1/.git/info/refs?service=git-upload-pack HTTP/1.1" 404 -
fatal: repository 'http://localhost:8080/sub1/.git/' not found
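For reference, the "gitlink" file shown above is a one-line pointer of the
form "gitdir: <path>". A minimal Python sketch of how such a pointer could be
resolved relative to the served root follows; resolve_gitfile is a
hypothetical helper for illustration, and nothing here claims that git does
this for remote URLs.

```python
import posixpath


def resolve_gitfile(gitfile_text, containing_dir):
    """Return the path that a 'gitdir: <path>' file points to.

    containing_dir is the directory holding the .git file, given relative
    to the served root. Hypothetical helper, for illustration only.
    """
    line = gitfile_text.strip()
    prefix = "gitdir: "
    if not line.startswith(prefix):
        raise ValueError("not a gitdir pointer file")
    target = line[len(prefix):]
    # Relative targets are interpreted relative to the containing directory
    return posixpath.normpath(posixpath.join(containing_dir, target))


print(resolve_gitfile("gitdir: ../.git/modules/sub1\n", "sub1"))
# prints .git/modules/sub1
```

Under that assumption, the actual repository data for sub1 in the served
clone lives at /.git/modules/sub1 rather than at /sub1/.git, which is
consistent with both requests above returning 404.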
> > N.B. I haven't approached nested submodules case yet in [2]
> > I wondered
> > a. could 'git clone' (probably actually some relevant helper used by fetch
> > etc) acquire ability to sense for URL/.git if URL itself doesn't point to a
> > usable git repository?
> So you mean in case of relative submodules, we need to take the parent
> url, and remove the ".git" at the end and try again if we cannot find
> the submodule?
That would be a.2, which I forgot to outline ;)
In a. I was suggesting what you assumed [note1 above] already happens (but
ATM doesn't): that /.git would be automagically sensed.
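A minimal Python sketch of the "sensing" idea from a.: given a URL, list the
info/refs requests a client could try in order, first against the URL itself
and then against URL/.git. This only illustrates the proposal; it is not
existing git behavior, and candidate_info_refs_urls is a made-up name. The
request path is the one visible in the server log above.

```python
def candidate_info_refs_urls(url):
    """List info/refs URLs to probe for url, then for url + '/.git'.

    Hypothetical helper sketching the proposal; not something git does.
    """
    base = url.rstrip("/")
    candidates = [base]
    if not base.endswith("/.git"):
        # Fall back to the non-bare layout: <url>/.git
        candidates.append(base + "/.git")
    return [c + "/info/refs?service=git-upload-pack" for c in candidates]


for probe in candidate_info_refs_urls("http://localhost:8080/sub1"):
    print(probe)
```

With such a fallback, a relative url resolved to http://localhost:8080/sub1
would get a second attempt at http://localhost:8080/sub1/.git before the
clone is reported as failed.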
> > I think this could provide complete remedy for 1 since then relative urls
> > would be properly assembled, with similar 'sensing' for /.git for the final urls
> > I guess we could do it with rewrites/forwards on the "server side",
> > but it wouldn't be generally acceptable solution.
> > b. is there a better or already existing way to remedy my situation?
> > c. shouldn't "git clone" (or the relevant helper) be aware of remote
> > /.git possibly being a gitlink file within submodule?
> Oh. I think that non-bare repositories including submodules are not designed
> to be cloned, because they are for use in the file system.
Well -- that is the beauty of git being a distributed VCS: non-bare repos
seem to be as nicely cloneable as bare ones. And in general it seems to work
with submodules as well, since they should be philosophically
"consistent"...
> Even a local clone fails:
> # gerrit is a project I know which also has submodules:
> git clone --recurse-submodules https://gerrit.googlesource.com/gerrit g1
> git clone --recurse-submodules g1 g2
> ...
> fatal: clone of '...' into submodule path '...' failed
I guess that is just yet another bug with relative paths in the
submodules.
> So I think for cloning repositories you want to have each repository
> as its own thing (bare or non bare).
The first line of your example above is somewhat of a counter-argument to
that statement. Indeed, each repository should be its own thing, just
possibly registered as a submodule of another one.
> The submodule mechanism is just a way to express a relation between
> the repositories, it's like composing them together, but by that composition
> it breaks the properties of each repository to be easily clonable.
It doesn't really (except in the cases we both pointed out). E.g. I can just
as easily clone the original sub1 repository, which was registered as a
submodule of another one. Whether git's treatment of submodules during cloning
(placing them under the root repo's .git/modules, etc.) undermines that
feature is a question we could also discuss here, I guess ;)
> I think we should fix that.
would be awesome! Thanks in advance ;)
> I guess the local clone case is 'easy' as you only need
> to handle the link instead of directory thing correctly.
> For the case you describe (cloning from a remote, whether it is http or ssh),
> we would need to discuss security implications I would assume? It sounds
> scary at first to follow a random git link to the outer space of the repository.
More like "into the inner space". git already (as the examples above show)
descends right away into "/info/refs?", so how would sensing "/.git/" be any
different?
> (A similar thing is that you cannot have symlinks in a git repository pointing
> outside of it, IIRC? At least that was fishy.)
That might indeed be dangerous. But once again, per the argument above, I
guess it is similarly up to the "provider" to guarantee protection, e.g. by
forbidding the webserver to follow symlinks in the served directory if the
content is not under their control.
Cheers and thanks for your quick reply Stefan!
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik