Git development
* A darcs that can pull from git
@ 2005-04-24 22:32 Juliusz Chroboczek
  2005-04-25 13:31 ` David Roundy
  0 siblings, 1 reply; 7+ messages in thread
From: Juliusz Chroboczek @ 2005-04-24 22:32 UTC
  To: darcs-devel; +Cc: Git Mailing List

I've just finished putting together a hack for darcs to allow it to
pull from Git repositories.  You'll find the patch (Darcs patch, not
diff patch) on

  http://www.pps.jussieu.fr/~jch/software/files/darcs-git-20050424.darcs

You should get yourself a copy of darcs-unstable, then apply this
patch:

  $ darcs get http://www.abridgegame.org/repos/darcs-unstable darcs-git
  $ cd darcs-git
  $ darcs apply darcs-git-20050424.darcs
  $ make darcs

If you get merge conflicts, try using a version of the darcs-unstable
tree from 18.04.2005, which is what I started with.

A minor problem: there's something broken with the build procedure;
you'll probably need to manually do a ``make Context.hs'' followed
by ``make darcs'' when the build breaks.

After you build darcs-git, you should be able to do something like

  $ cd ..
  $ mkdir a
  $ cd a
  $ darcs initialize
  $ ../darcs-git/darcs pull /usr/local/src/git-pasky-0.4
  $ darcs changes

This version can *pull* from git, but it cannot push; in other words,
the only way to export your data from Darcs back to git is to use diff
and patch.

Please be aware that this is just a proof-of-concept prototype.  David
and the rest of the Central Committee haven't looked at this code yet;
it is quite likely that future versions of Darcs will generate
completely different patches from git repositories.  It is also likely
that THIS CODE WILL EAT YOUR DATA.

The major issue is that we generate no patch dependencies.  If you try
to cherry-pick from repositories generated with this version, you'd
better know what you're doing.

David, could you please have a look at the patches

  Sun Apr 24 16:50:02 CEST 2005  Juliusz Chroboczek <jch@pps.jussieu.fr>
    * First cut at remodularising repo access.

  Sun Apr 24 16:01:32 CEST 2005  Juliusz Chroboczek <jch@pps.jussieu.fr>
    * Change Repository to DarcsRepo.

and tell me whether this sort of restructuring is okay with you.

(David, I'm not claiming that this scheme is better than the ``tagging
like crazy'' scheme that you outlined; I'm only trying to prove that
my scheme is workable.)

Right now, I'm taking a Git commit and manually generating a Darcs
patch id from that, which is a bad idea.  A better way would be to get
Darcs to deal with arbitrarily shaped patch ids; a patch that
originates with git would get the git patch id, while a patch that
comes from Darcs would retain its patch id even when pushed to git.
David, you had some objections to that; any chance we could discuss
the issue?
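
To make the "arbitrarily shaped patch ids" idea concrete, here is a
minimal Python sketch (not darcs code).  The type names DarcsPatchId and
GitCommitId, and their fields, are hypothetical; the only point is that
an id records the shape it was born with and is never rewritten when
the patch crosses from one system to the other.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class DarcsPatchId:
    """Hypothetical stand-in for a native Darcs patch id (opaque string)."""
    value: str


@dataclass(frozen=True)
class GitCommitId:
    """A git commit name: 40 lowercase hexadecimal digits (SHA-1)."""
    sha1: str

    def __post_init__(self):
        ok = len(self.sha1) == 40 and all(c in "0123456789abcdef" for c in self.sha1)
        if not ok:
            raise ValueError("expected 40 lowercase hex digits")


# A patch id is whichever shape the patch was created with.
PatchId = Union[DarcsPatchId, GitCommitId]


def origin(pid: PatchId) -> str:
    """Report which system first created the patch."""
    return "git" if isinstance(pid, GitCommitId) else "darcs"
```

Because both shapes are immutable and hashable, they can sit side by
side in one set or dictionary of applied patches.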

This is slow.  There are a few obvious improvements to make to the
performance, but I'd rather first implement whatsnew, diff and apply,
and fix the problem with patch dependencies.  (Whatsnew is where git's
performance is actually likely to be better than Darcs, but it will
require some abstracting of ``Slurpy'' in order to make that
effective.)  Unfortunately, I don't expect to have hacking time before
next week-end.


Enjoy,

                                        Juliusz Chroboczek

* Re: A darcs that can pull from git
  2005-04-24 22:32 A darcs that can pull from git Juliusz Chroboczek
@ 2005-04-25 13:31 ` David Roundy
  2005-04-25 15:12   ` Juliusz Chroboczek
  0 siblings, 1 reply; 7+ messages in thread
From: David Roundy @ 2005-04-25 13:31 UTC
  To: Juliusz Chroboczek; +Cc: darcs-devel, Git Mailing List

On Mon, Apr 25, 2005 at 12:32:18AM +0200, Juliusz Chroboczek wrote:
> I've just finished putting together a hack for darcs to allow it to
> pull from Git repositories.  You'll find the patch (Darcs patch, not
> diff patch) on

Very cool! :)

>   http://www.pps.jussieu.fr/~jch/software/files/darcs-git-20050424.darcs

First off, you need to include a license header in the git files indicating
that unlike the rest of darcs, they may only be distributed under GPL v2.
Something like the following would probably be fine (but it's Linus'
copyright that's involved, not mine)

/*
 * GIT - The information manager from hell
 *
 * Copyright (C) Linus Torvalds, 2005

  This program is free software; you can redistribute it and/or modify
  it under the terms of version 2 of the GNU General Public License as
  published by the Free Software Foundation.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software Foundation,
  Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 */

Without this header, it's either illegal to distribute these files, or
they're assumed to be under GPLv2 or later along with the rest of darcs,
which also isn't legal...

> I've just finished putting together a hack for darcs to allow it to
> pull from Git repositories.  You'll find the patch (Darcs patch, not
> diff patch) on
>   http://www.pps.jussieu.fr/~jch/software/files/darcs-git-20050424.darcs

Any chance you can host a gettable repository? If not, I'd be happy to give
you an account on darcs.net on which you could host darcs-git.

> If you get merge conflicts, try using a version of the darcs-unstable
> tree from 18.04.2005, which is what I started with.

There is a conflict in GNUMakefile, which is moderately easy to resolve.
It would be nice to keep files in alphabetical order (as is mostly the
case currently), which has some small chance of reducing the likelihood
of conflicts.

> A minor problem: there's something broken with the build procedure;
> you'll probably need to manually do a ``make Context.hs'' followed
> by ``make darcs'' when the build breaks.

Or alternatively run "autoconf; ./configure; make"

> After you build darcs-git, you should be able to do something like
> 
>   $ cd ..
>   $ mkdir a
>   $ cd a
>   $ darcs initialize
>   $ ../darcs-git/darcs pull /usr/local/src/git-pasky-0.4
>   $ darcs changes

Do you have any plans/ideas for allowing pulls directly from a remote git
repository? Obviously it'll be less efficient, since you'll have to
download at least one file in its entirety.  Perhaps we could store a git
mirror in _darcs and use rsync?  :(

> David, could you please have a look at the patches
> 
>   Sun Apr 24 16:50:02 CEST 2005  Juliusz Chroboczek <jch@pps.jussieu.fr>
>     * First cut at remodularising repo access.
> 
>   Sun Apr 24 16:01:32 CEST 2005  Juliusz Chroboczek <jch@pps.jussieu.fr>
>     * Change Repository to DarcsRepo.
> 
> and tell me whether this sort of restructuring is okay with you.

Those two look fine to me.  I'm increasingly liking (as I get to understand
it better) your ideas regarding modularizing repository access.

> (David, I'm not claiming that this scheme is better than the ``tagging
> like crazy'' scheme that you outlined; I'm only trying to prove that
> my scheme is workable.)

Okay, it does look like most of your code will be equally useful for the
"tagging like crazy" scheme.

> Right now, I'm taking a Git commit and manually generating a Darcs
> patch id from that, which is a bad idea.  A better way would be to get
> Darcs to deal with arbitrarily shaped patch ids; a patch that
> originates with git would get the git patch id, while a patch that
> comes from Darcs would retain its patch id even when pushed to git.
> David, you had some objections to that; any chance we could discuss
> the issue?

We certainly can discuss it, and I still object.  I think it'd be much
better to map from git commits to darcs patch ids.  Your scheme and mine
both have uglinesses.

My "tag like crazy" scheme gives a unique mapping of a git commit to one
darcs patch and one darcs tag, but has the ugliness that a darcs patch
can't be mapped to a git commit without adding an additional darcs tag.  I
tend to see this as a plus.  It reflects the semantic difference
between a darcs patch and a git commit--the git commit is actually
equivalent not to a darcs patch but rather a darcs tag.

Your idea has the niceness that it could provide a one-to-one (as opposed
to two-to-one) mapping between darcs patches and git commits, but the catch
is that we don't know the git commit ID until after the patch has been
moved into git-land.  This is directly analogous to the wart in my scheme
that darcs patches would acquire a tag when moved into git-land.  I prefer
my scheme, since extra tags are relatively harmless, and reflect the
dependencies in the git repository.
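
A rough Python sketch of the "tag like crazy" mapping described above,
under loud assumptions: the history is linear (merges are ignored), and
the Patch type and the "change-..." / "TAG git-..." names are
hypothetical.  Each git commit yields one ordinary patch plus one tag
that depends on everything imported so far, since a git commit names a
whole tree state rather than a single change.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patch:
    name: str                     # hypothetical naming scheme
    depends_on: Tuple[str, ...]   # names of patches this one explicitly depends on


def import_commits(commit_ids: List[str]) -> List[Patch]:
    """Map git commit ids (oldest first, linear history) to patches and tags."""
    out: List[Patch] = []
    for cid in commit_ids:
        # The change introduced by the commit; no dependencies are computed here.
        out.append(Patch(name=f"change-{cid}", depends_on=()))
        # The tag pins the full repository state that the commit names.
        everything_so_far = tuple(p.name for p in out)
        out.append(Patch(name=f"TAG git-{cid}", depends_on=everything_so_far))
    return out
```

For two commits c1 and c2 this produces four entries, and the tag for c2
depends on change-c1, TAG git-c1 and change-c2.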

> This is slow.  There are a few obvious improvements to make to the
> performance, but I'd rather first implement whatsnew, diff and apply,
> and fix the problem with patch dependencies.  (Whatsnew is where git's
> performance is actually likely to be better than Darcs, but it will
> require some abstracting of ``Slurpy'' in order to make that
> effective.)  Unfortunately, I don't expect to have hacking time before
> next week-end.

All right.  I'll look forward to another installment after next weekend
then!  :)

I had a few minor comments on your code, which I've forgotten.  One was
that either you're compiling with ghc 6.2, or you've disabled
-Werror... It'd be nice to be sure that your code is -Werror-clean with ghc
6.4 as well.
-- 
David Roundy
http://www.darcs.net

* Re: A darcs that can pull from git
  2005-04-25 13:31 ` David Roundy
@ 2005-04-25 15:12   ` Juliusz Chroboczek
  2005-04-26  0:55     ` Linus Torvalds
  2005-04-26 11:06     ` David Roundy
  0 siblings, 2 replies; 7+ messages in thread
From: Juliusz Chroboczek @ 2005-04-25 15:12 UTC
  To: darcs-devel, Git Mailing List

>>   http://www.pps.jussieu.fr/~jch/software/files/darcs-git-20050424.darcs

> First off, you need to include a license header in the git files indicating
> that unlike the rest of darcs, they may only be distributed under GPL v2.

Linus, could you please suggest a suitable license statement to
include in whichever files of yours we choose to include in Darcs?  Is
David's suggestion (stock GPL boilerplate with ``or any later
version'' removed) okay with you?

> Any chance you can host a gettable repository?

The last tag in darcs-unstable is 1.0.0rc2, which prevents me from
publishing a partial repository in my web space.  Perhaps you could
pull a recent tag into darcs-unstable?

> If not, I'd be happy to give you an account on darcs.net on which
> you could host darcs-git.

That would be great (let me know if you need an ssh public key).

> Do you have any plans/ideas for allowing pulls directly from a
> remote git repository?

I haven't thought about it yet.  Does anyone have any ideas about how
to efficiently pull from git without a complete local copy?

I'll reply to the more technical points below on darcs-devel -- no
point in spamming the kind folks on git@ any further, especially as
they probably know about

  http://www.abridgegame.org/pipermail/darcs-devel/

                                        Juliusz


* Re: A darcs that can pull from git
  2005-04-25 15:12   ` Juliusz Chroboczek
@ 2005-04-26  0:55     ` Linus Torvalds
  2005-04-26 11:06     ` David Roundy
  1 sibling, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2005-04-26  0:55 UTC
  To: Juliusz Chroboczek; +Cc: Git Mailing List, darcs-devel


[ Side note: I tend to read the mailing lists much less often, and am more
  likely to skip stuff, so if you have a question that is literally for me
  personally, it's probably best to Cc my private address rather than 
  depending on me reading every single mailing list email ]

On Mon, 25 Apr 2005, Juliusz Chroboczek wrote:
> 
> Linus, could you please suggest a suitable license statement to
> include in whichever files of yours we choose to include in Darcs?  Is
> David's suggestion (stock GPL boilerplate with ``or any later
> version'' removed) okay with you?

Stock GNU boilerplate without the "or any later version" works fine. 

As does a simple one-liner "Licensed under GPLv2", for that matter. It's 
not like there can be any real confusion.

		Linus

* Re: Re: A darcs that can pull from git
  2005-04-25 15:12   ` Juliusz Chroboczek
  2005-04-26  0:55     ` Linus Torvalds
@ 2005-04-26 11:06     ` David Roundy
  2005-04-26 12:34       ` Petr Baudis
  1 sibling, 1 reply; 7+ messages in thread
From: David Roundy @ 2005-04-26 11:06 UTC
  To: Juliusz Chroboczek; +Cc: Git Mailing List, darcs-devel

On Mon, Apr 25, 2005 at 05:12:59PM +0200, Juliusz Chroboczek wrote:
> > Do you have any plans/ideas for allowing pulls directly from a
> > remote git repository?
> 
> I haven't thought about it yet.  Does anyone have any ideas about how
> to efficiently pull from git without a complete local copy?

I don't think so.  My best thought so far would be to have something like a
~/.gitcache/, which would store the sha1 objects themselves, so at least
we'd only end up with *one* local copy.  I'm actually curious what the true
git people do about this--it would be nice to share a cache.  For darcs'
purposes, we could prune the cache from time to time.  If we're running
with a darcs backend, we really only need the recent versions of files and
trees.

Do the git people have any suggestions about how to avoid excess downloads or
excess copies of a git repository? It seems to me like it would make sense
to always download sha1s to ~/.gitcache/, and then hardlink them to the
current git repository, so you wouldn't end up ever downloading the same
sha1 twice.  Or we should use $GITCACHE/, to give the user some
flexibility.  But perhaps this is an already-solved problem, and I've just
not noticed...
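
A minimal Python sketch of the shared-cache idea, to show the moving
parts.  Everything here is an assumption rather than existing git or
cogito behaviour: fetch_object, cache_dir and the download callback are
hypothetical names, and the cache simply reuses the two-character
fan-out of .git/objects.

```python
import os
from pathlib import Path
from typing import Callable


def object_path(root: Path, sha1: str) -> Path:
    """Fan objects out as root/ab/cdef..., as .git/objects does."""
    return root / sha1[:2] / sha1[2:]


def cache_dir() -> Path:
    """Honour $GITCACHE if set, else fall back to ~/.gitcache."""
    return Path(os.environ.get("GITCACHE", Path.home() / ".gitcache"))


def fetch_object(sha1: str, repo_objects: Path, cache: Path,
                 download: Callable[[str], bytes]) -> Path:
    """Make `sha1` available in `repo_objects`, downloading at most once per cache."""
    cached = object_path(cache, sha1)
    if not cached.exists():
        cached.parent.mkdir(parents=True, exist_ok=True)
        tmp = cached.parent / (cached.name + ".tmp")
        tmp.write_bytes(download(sha1))   # hypothetical network fetch
        os.replace(tmp, cached)           # publish the finished file in one step
    target = object_path(repo_objects, sha1)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(cached, target)           # hard link: no second copy on disk
    return target
```

Hard links only work within one filesystem, so the cache and the
repositories would have to live on the same one; pruning the cache then
just means unlinking cache entries, which leaves the repositories' own
links intact.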

As far as other details, currently we can just walk the tree to find out
what files are needed, right?
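
Walking the tree could look roughly like the Python sketch below.  The
object model (a commit pointing at a root tree, trees pointing at blobs
or nested trees) is deliberately simplified, and Blob, Tree, Commit and
the load callback are hypothetical names; load stands for whatever
fetches an object by id, locally or remotely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Union


@dataclass
class Blob:
    data: bytes


@dataclass
class Tree:
    entries: Dict[str, str]   # file or directory name -> object id


@dataclass
class Commit:
    tree: str                 # id of the root tree
    parents: List[str]        # not followed here: only one snapshot is wanted


Obj = Union[Blob, Tree, Commit]


def needed_objects(commit_id: str, load: Callable[[str], Obj]) -> Set[str]:
    """Return the ids needed to reconstruct the tree of a single commit."""
    commit = load(commit_id)
    if not isinstance(commit, Commit):
        raise TypeError("not a commit: " + commit_id)
    seen: Set[str] = {commit_id}
    stack = [commit.tree]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue              # shared blobs and subtrees are visited once
        seen.add(oid)
        obj = load(oid)
        if isinstance(obj, Tree):
            stack.extend(obj.entries.values())
    return seen
```

Subtracting the ids already present locally from the result gives the
set that still has to be downloaded.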
-- 
David Roundy
http://www.darcs.net

* Re: Re: A darcs that can pull from git
  2005-04-26 11:06     ` David Roundy
@ 2005-04-26 12:34       ` Petr Baudis
  2005-04-26 12:47         ` [darcs-devel] " David Roundy
  0 siblings, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2005-04-26 12:34 UTC
  To: Juliusz Chroboczek, darcs-devel, Git Mailing List

Dear diary, on Tue, Apr 26, 2005 at 01:06:17PM CEST, I got a letter
where David Roundy <droundy@abridgegame.org> told me that...
> On Mon, Apr 25, 2005 at 05:12:59PM +0200, Juliusz Chroboczek wrote:
> > > Do you have any plans/ideas for allowing pulls directly from a
> > > remote git repository?
> > 
> > I haven't thought about it yet.  Does anyone have any ideas about how
> > to efficiently pull from git without a complete local copy?
> 
> I don't think so.  My best thought so far would be to have something like a
> ~/.gitcache/, which would store the sha1 objects themselves, so at least
> we'd only end up with *one* local copy.  I'm actually curious what the true
> git people do about this--it would be nice to share a cache.  For darcs'
> purposes, we could prune the cache from time to time.  If we're running
> with a darcs backend, we really only need the recent versions of files and
> trees.
> 
> Do the git people have any suggestions about how to avoid excess downloads or
> excess copies of a git repository? It seems to me like it would make sense
> to always download sha1s to ~/.gitcache/, and then hardlink them to the
> current git repository, so you wouldn't end up ever downloading the same
> sha1 twice.  Or we should use $GITCACHE/, to give the user some
> flexibility.  But perhaps this is an already-solved problem, and I've just
> not noticed...

I'm not sure about the problem you are actually trying to solve, and I
didn't manage to guess it quickly just from the mails themselves;
cg-init /local/path now hardlinks the sha1 objects to the local
.git/objects directory, so you get no space waste. If you are talking
about downloading stuff from remote repositories, http-pull might help.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: [darcs-devel] Re: A darcs that can pull from git
  2005-04-26 12:34       ` Petr Baudis
@ 2005-04-26 12:47         ` David Roundy
  0 siblings, 0 replies; 7+ messages in thread
From: David Roundy @ 2005-04-26 12:47 UTC
  To: Petr Baudis; +Cc: Juliusz Chroboczek, darcs-devel, Git Mailing List

On Tue, Apr 26, 2005 at 02:34:45PM +0200, Petr Baudis wrote:
> Dear diary, on Tue, Apr 26, 2005 at 01:06:17PM CEST, I got a letter
> where David Roundy <droundy@abridgegame.org> told me that...
> > Do the git people have any suggestions about how to avoid excess downloads or
> > excess copies of a git repository? It seems to me like it would make sense
> > to always download sha1s to ~/.gitcache/, and then hardlink them to the
> > current git repository, so you wouldn't end up ever downloading the same
> > sha1 twice.  Or we should use $GITCACHE/, to give the user some
> > flexibility.  But perhaps this is an already-solved problem, and I've just
> > not noticed...
> 
> I'm not sure about the problem you are actually trying to solve, and I
> didn't manage to guess it quickly just from the mails themselves;
> cg-init /local/path now hardlinks the sha1 objects to the local
> .git/objects directory, so you get no space waste. If you are talking
> about downloading stuff from remote repositories, http-pull might help.

Yeah, what I was wondering about was the scenario where a user does (and
pardon any errors, I haven't actually used cogito) something like

cd foo
cg-init http://remote_repository
cd ../bar
cg-init ../foo

(so far we've only got hard links and everything is great)

http-pull http://remote_repository (downloads a few more commits to bar)
cd ../foo
http-pull http://remote_repository

Does this last pull download the same commits as the previous one? Ideally
it wouldn't.  The whole point of the sha1-named files is that you don't
have to worry about where you got it from.  Ideally the second pull would
get the actual files from ../bar, where they've already been downloaded.

Or perhaps (and this was what I was *really* hoping) all the cogito remote
operations would store a hardlink of their results in a common cache
directory, so that one could actually do

cd foo
cg-init http://remote_repository
cd ../bar
cg-init http://remote_repository

without either downloading anything twice, or wasting any disk space.  In
practice, what's more likely is that you'll want to

cd foo
cg-init http://linus_remote_repository
cd ../bar
cg-init http://gregkh_remote_repository

and would like to avoid downloading redundant info.
-- 
David Roundy
http://www.darcs.net
