Git development


* Re: [RFC] Another way to provide help details. (was Re: [PATCH] Add help details to git help command.)
From: Steven Cole @ 2005-04-19 16:03 UTC (permalink / raw)
  To: David Greaves; +Cc: Petr Baudis, git
In-Reply-To: <4265189E.6090801@dgreaves.com>

David Greaves wrote:
> Petr Baudis wrote:
> 
>> Dear diary, on Tue, Apr 19, 2005 at 03:40:54AM CEST, I got a letter
>> where Steven Cole <elenstev@mesatop.com> told me that...
>>
>>> Here is perhaps a better way to provide detailed help for each
>>> git command.  A command.help file for each command can be
>>> written in the style of a man page.
>>
>>
>>
>> I don't like it. I think the 'help' command should serve primarily as a
>> quick reference, which does not blend so well with a manual page - it's
>> too long and too convoluted by repeated output.
>>
>> I'd just print the top comment from each file. :-)
>>
> 
> On the other hand, having more complete docs seems like an excellent 
> idea (and other threads support that)
> I'd certainly like to see more specification oriented documentation...
> (even if it turns out to be disposable)
> 
> Steven, if you carry on sending more verbose docs I'll certainly read 
> and work with you on editing them...

I only did those first two as a straw man.  Doing the others is a couple
hours (or less) work, but I don't want to do it if folks don't want it.

Having the help files separate has advantages/disadvantages.

> 
> Nb kernel-doc doesn't seem appropriate for user level docs.
> maybe, whilst there's so much flux, have:
>   git man command
> that just outputs text
> 
> If Petr wants the top comment to be extracted by help then maybe a 
> bottom comment block could contain the more complete text?
> I *really* think that the user docs should live in the source for now 
> (hence I think that git man is better than going straight to man/docbook).
> 
> I wasn't sure whether to perlise the code or do a shell-lib - but 
> looking at the algorithms needed in things like git status I reckon the 
> shell will end up becoming a hackish mess of awk/sed/tr/sort/uniq/pipe 
> (ie perl) anyway.
> 
> So I'm going to have a go at that - Petr, if you have a minute could you 
> send me, off list, a bit of perl code that epitomises the style you like?
> 
> David
> 

Funny you should mention Perl.  Here is a small bit of code:

[steven@spc0 git-pasky-testing]$ cat print_help_header.pl
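#!/usr/bin/perl
# Reads from stdin, writes to stdout; no error checking.
use strict;
use warnings;

# Skip the first two lines of the script being read.
<STDIN>; <STDIN>;
# Print the leading run of comment lines; stop at the first other line (or EOF).
while (defined(my $line = <STDIN>)) {
    last unless substr($line, 0, 1) eq "#";
    print $line;
}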

[steven@spc0 git-pasky-testing]$ ./print_help_header.pl <gitdiff.sh | grep ^# | grep -v "(c)" | cut -c 3-
Make a diff between two GIT trees.

By default compares the current working tree to the state at the
last commit. You can specify -r rev1:rev2 or -r rev1 -r rev2 to
tell it to make a diff between the specified revisions. If you
do not specify a revision, the current working tree is implied
(note that no revision is different from empty revision - -r rev:
compares between rev and HEAD, while -r rev compares between rev
and working tree).

-p instead of one ID denotes a parent commit to the specified ID
(which must not be a tree, obviously).

Outputs a diff converting the first tree to the second one.
-------end of output from perl plus grep and cut.

Without the perl, extra comments came out (plus the dreaded first blank line).

[steven@spc0 git-pasky-testing]$ cat gitdiff.sh | grep -v "/bin" | grep ^# | grep -v "(c)" | cut -c 3-

Make a diff between two GIT trees.

By default compares the current working tree to the state at the
last commit. You can specify -r rev1:rev2 or -r rev1 -r rev2 to
tell it to make a diff between the specified revisions. If you
do not specify a revision, the current working tree is implied
(note that no revision is different from empty revision - -r rev:
compares between rev and HEAD, while -r rev compares between rev
and working tree).

-p instead of one ID denotes a parent commit to the specified ID
(which must not be a tree, obviously).

Outputs a diff converting the first tree to the second one.
FIXME: The commandline parsing is awful.
-------end of output from grep and cut.
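
For comparison, the perl + grep + cut pipeline above can be sketched as a single function. This is only an illustration in Python, not part of any patch; the name help_header and the sample script text are made up, and it assumes the help text is the first unbroken run of '#' lines after the first two lines of the script.

```python
def help_header(script_text):
    """Return the leading '#' comment block of a script as plain help text."""
    out = []
    # Skip the first two lines, as print_help_header.pl does.
    for line in script_text.splitlines()[2:]:
        if not line.startswith("#"):
            break                  # stop at the first non-comment line
        if "(c)" in line:
            continue               # drop copyright lines, like grep -v "(c)"
        out.append(line[2:])       # drop the leading "# ", like cut -c 3-
    return "\n".join(out)


# Hypothetical script text, shaped like the top of gitdiff.sh.
sample = """#!/bin/sh
#
# Make a diff between two GIT trees.
# Copyright (c) Some Author, 2005
#
# Outputs a diff converting the first tree to the second one.

# FIXME: The commandline parsing is awful.
echo not-help
"""

print(help_header(sample))
```

Because the loop stops at the first non-comment line, the later FIXME comment is not printed, which is the same effect the perl filter has.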

David,

I'm a bit pressed for time, so if you or anyone else would like to
use this code to fix up my earlier patch, you're welcome to it.
Otherwise, it will be later this evening or tomorrow before I can
do any more with this.

Steven

^ permalink raw reply

* Re: GIT Web Interface
From: Kay Sievers @ 2005-04-19 15:59 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050419005244.GR5554@pasky.ji.cz>

On Tue, 2005-04-19 at 02:52 +0200, Petr Baudis wrote:
> Dear diary, on Tue, Apr 19, 2005 at 02:44:15AM CEST, I got a letter
> where Kay Sievers <kay.sievers@vrfy.org> told me that...
> > I'm hacking on a simple web interface, cause I missed the bkweb too much.
> > It can't do much more than browse through the source tree and show the
> > log now, but that should change... :)
> >   http://ehlo.org/~kay/gitweb.pl?project=linux-2.6
> 
> Hmm, looks nice for a start. (But you have obsolete git-pasky tree there! ;-)

Yeah, it's fresh now. :)

> > How can I get the files touched with a changeset and the corresponding
> > diffs belonging to it?
> 
> diff-tree to get the list of files, you can do the corresponding diffs
> e.g. by doing git diff -r tree1:tree2. Preferably make a patch for it
> first to make it possible to diff individual files this way.

Ah, nice! Got it working.

Thanks,
Kay


^ permalink raw reply

* Re: Darcs and git: plan of action
From: Linus Torvalds @ 2005-04-19 14:55 UTC (permalink / raw)
  To: David Roundy; +Cc: Git Mailing List, darcs-devel
In-Reply-To: <20050419104252.GA28269@abridgegame.org>

On Tue, 19 Apr 2005, David Roundy wrote:
> 
> Would a small amount of human-readable change information be acceptable in
> the free-form comment area? In the rename thread I got the impression this
> would be okay for renames.  For example,
> 
> rename foo bar

Sure. That's human-readable and meaningful, as in "it actually makes sense 
as a commit comment regardless of any darcs issues". As does:

> replace [_a-zA-Z0-9] old_variable new_variable file/path

which is almost so (a human would have written "rename old to new", but
the above isn't _that_ different).

HOWEVER, then the requirement would be that we'd never have complex
combinations of the above. Ie having 2-5 lines of something like that is
"human-readable". Having 10+ lines of the above is not. See?

I have this suspicion that the "replace" thing often ends up being done on
dozens of files, and I don't want to have dozens of lines of stuff that
ends up really being machine-readable. But if it's ok to depend on the
content changes (you _do_ see which files changed) together with a single
line of "replace [token-def] xxx yyy", then hell yes - I consider that to
be useful information even outside of git.

(In other words: if it looks like something a careful human _could_ have
written, it's certainly ok. But if it looks like something a careful human
would have used a script to generate 40 entries of, it's bad).

		Linus

^ permalink raw reply

* Re: [RFC] Another way to provide help details. (was Re: [PATCH] Add help details to git help command.)
From: David Greaves @ 2005-04-19 14:41 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Steven Cole, git
In-Reply-To: <20050419015124.GW5554@pasky.ji.cz>

Petr Baudis wrote:
> Dear diary, on Tue, Apr 19, 2005 at 03:40:54AM CEST, I got a letter
> where Steven Cole <elenstev@mesatop.com> told me that...
> 
>>Here is perhaps a better way to provide detailed help for each
>>git command.  A command.help file for each command can be
>>written in the style of a man page.
> 
> 
> I don't like it. I think the 'help' command should serve primarily as a
> quick reference, which does not blend so well with a manual page - it's
> too long and too convoluted by repeated output.
> 
> I'd just print the top comment from each file. :-)
> 

On the other hand, having more complete docs seems like an excellent 
idea (and other threads support that)
I'd certainly like to see more specification oriented documentation...
(even if it turns out to be disposable)

Steven, if you carry on sending more verbose docs I'll certainly read 
and work with you on editing them...

Nb kernel-doc doesn't seem appropriate for user level docs.
maybe, whilst there's so much flux, have:
   git man command
that just outputs text

If Petr wants the top comment to be extracted by help then maybe a 
bottom comment block could contain the more complete text?
I *really* think that the user docs should live in the source for now 
(hence I think that git man is better than going straight to man/docbook).

I wasn't sure whether to perlise the code or do a shell-lib - but 
looking at the algorithms needed in things like git status I reckon the 
shell will end up becoming a hackish mess of awk/sed/tr/sort/uniq/pipe 
(ie perl) anyway.

So I'm going to have a go at that - Petr, if you have a minute could you 
send me, off list, a bit of perl code that epitomises the style you like?

David

^ permalink raw reply

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Martin Schlemmer @ 2005-04-19 14:40 UTC (permalink / raw)
  To: Petr Baudis; +Cc: David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050419105008.GB12757@pasky.ji.cz>

On Tue, 2005-04-19 at 12:50 +0200, Petr Baudis wrote:
> Dear diary, on Tue, Apr 19, 2005 at 12:05:10PM CEST, I got a letter
> where Martin Schlemmer <azarah@nosferatu.za.org> told me that...
> > On Tue, 2005-04-19 at 11:28 +0200, Petr Baudis wrote:
> > > Dear diary, on Tue, Apr 19, 2005 at 11:18:55AM CEST, I got a letter
> > > where David Greaves <david@dgreaves.com> told me that...
> > >
> > > Dunno. I do it personally all the time, with git at least.
> > > 
> > > What do others think? :-)
> > > 
> > 
> > I think pull is pull.  If you are doing lots of local stuff and do not
> > want it overwritten, it should have been in a forked branch.
> 
> I disagree. This already forces you to have two branches (one to pull
> from to get the data, mirroring the remote branch, one for your real
> work) uselessly and needlessly.
> 
> I think there is just no good name for what pull is doing now, and
> update seems like a great name for what pull-and-merge really is. Pull
> really is pull - it _pulls_ the data, while update also updates the
> given tree. No surprises.
> 
> (We should obviously have also update-without-pull but that is probably
> not going to be so common so a parameter for update (like -n) should be
> fine for that.)
> 
> These naming issues may appear silly but I think they matter big time
> for usability, intuitiveness, and learning curve (I don't want git-pasky
> become another GNU arch).
> 

Ok, so 'pull' does the bk thing, and 'update' does the cvs thing.  I think,
however, you should do either one or the other.  Maybe drop the
'update', and rather add 'checkout' (or 'co' for short) which will
update the tree (or merge with local changes if needed).  Then you have
two distinct separate things (ok, so pretty much how bk does things).

This will also enable you to make 'fork', 'export', etc just do the
right thing with the database, but leave 'checkout' up to the user if he
wants to do so.

-- 
Martin Schlemmer

^ permalink raw reply

* Re: missing: git api, reference, user manual and mission statement
From: Kevin Smith @ 2005-04-19 14:10 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20050419135810.GA19393@elte.hu>

Ingo Molnar wrote:
> * Kevin Smith <yarcs@qualitycode.com> wrote:
> 
>>Git is very immature, and currently should only be used by brave 
>>pioneers. About the only way for a mortal to even try git is to stick 
>>to git-pasky releases, and not try to track all the patches flying 
>>around.
> 
> hey, it's a 2 weeks old project, but it's certainly one of the 
> fastest-growing projects i've ever seen: it has so much steam that it's 
> scary :) It seems that a true emergency focused a massive, spontaneous 
> concentration of OSS development power.

Absolutely! I'm totally impressed with the progress so far.

Kevin

^ permalink raw reply

* wit - a git web interface
From: Christian Meder @ 2005-04-19 14:04 UTC (permalink / raw)
  To: git

Hi,

I uploaded a first draft of wit to
http://www.absolutegiganten.org/wit

Right now it's a minimal web interface on top of git. Unpack it, make
sure you've got at least Python 2.3, optionally install c2html, adjust
config.py and start from the root with
$ PYTHONPATH=. python git/web/wit.py
Point your browser to http://localhost:8090

I append my random notes about this thing:

* wit was built with git 075930771b68528ae13630375df2fe634e9ac610 which
is 2-3 days old, it's untested with newer gits, it's functional for me

* philosophy: provide a git repository browser and show the basic types
of git under appropriate URIs like
/commit/head, /commit/<sha1>, /tree/head, /tree/<sha1>, /blob/<sha1>
and encode operations in URIs like /tree/<sha1>/tarball

* the html is probably invalid and looks like crap

* it's just sitting on top of git because that's the least moving part
of git right now (hah), no usage of git-pasky yet

* it's done with quixote 2.0 which is included

* don't use it on a kernel tree or you will experience slowness beyond
your wildest dreams

* it doesn't use revision.h or other lightning fast C goodies until the
dust settles down

* it'll probably eat your dog and fall apart when you blow on it

* I happen to think that it's a nice start
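
The URI scheme from the notes above can be sketched as a small parser. This is a standalone Python illustration, not wit's actual code and not the Quixote API; the names parse_uri and ROUTE are hypothetical, and allowing the tarball operation on every object type is an assumption (the notes only show it on /tree/<sha1>).

```python
import re

SHA1 = r"[0-9a-f]{40}"
# /<type>/<head or sha1>, optionally followed by an operation such as /tarball
ROUTE = re.compile(r"^/(commit|tree|blob)/(head|%s)(?:/(tarball))?$" % SHA1)


def parse_uri(path):
    """Split a wit-style path into (object type, object id, operation)."""
    m = ROUTE.match(path)
    if m is None:
        return None
    kind, ident, operation = m.groups()
    if kind == "blob" and ident == "head":
        return None                # the notes list no /blob/head URI
    return kind, ident, operation


print(parse_uri("/commit/head"))
print(parse_uri("/tree/" + "a" * 40 + "/tarball"))
print(parse_uri("/blob/head"))
```

Keeping the object type and the operation in the path means every object and every view of it gets a stable, linkable address.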

Any and all feedback is greatly appreciated.

Greetings,



				Christian

-- 
Christian Meder, email: chris@absolutegiganten.org

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

                (Eihei Dogen Zenji)


^ permalink raw reply

* Re: missing: git api, reference, user manual and mission statement
From: Ingo Molnar @ 2005-04-19 13:58 UTC (permalink / raw)
  To: Kevin Smith; +Cc: Klaus Robert Suetterlin, git
In-Reply-To: <42650CFC.1010400@qualitycode.com>


* Kevin Smith <yarcs@qualitycode.com> wrote:

> Klaus Robert Suetterlin wrote:
> > 1) There is no clear (e.g. by name) distinction between ``git as done
> > by Linus'', which is a kind of content addressable database with added
> > semantics, and ``git as done by the rest of You'', which is a kind of
> > SCM on top of Linus's stuff.
> 
> I also see this as one of the biggest obstacles right now. It would be 
> very helpful if we could achieve the clear separation between git and 
> non-git that has been part of the design since the beginning.
> 
> Git is very immature, and currently should only be used by brave 
> pioneers. About the only way for a mortal to even try git is to stick 
> to git-pasky releases, and not try to track all the patches flying 
> around.

hey, it's a 2 weeks old project, but it's certainly one of the 
fastest-growing projects i've ever seen: it has so much steam that it's 
scary :) It seems that a true emergency focused a massive, spontaneous 
concentration of OSS development power.

	Ingo

^ permalink raw reply

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Jon Seymour @ 2005-04-19 13:54 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Martin Schlemmer, David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050419105008.GB12757@pasky.ji.cz>

> I disagree. This already forces you to have two branches (one to pull
> from to get the data, mirroring the remote branch, one for your real
> work) uselessly and needlessly.
> 
> ...
> These naming issues may appear silly but I think they matter big time
> for usability, intuitiveness, and learning curve (I don't want git-pasky
> become another GNU arch).
> 

Not that it is worth that much, but my $0.02 is that Petr is right on
this one. I want something that allows me to get the objects into my
local repository without funking with my working directory.

As a long time CVS user, "git update" would do what I expect it to. I
don't have any pre-conceptions about what "pull" does, so it doesn't
faze me if pull is used for this purpose. However, perhaps pull means
something in some other SCM that would cause confusion for others?

Some alternatives to "pull" are offered: hoard, gather, make-local, download.

Regards,

jon.
-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/

^ permalink raw reply

* Re: missing: git api, reference, user manual and mission statement
From: Kevin Smith @ 2005-04-19 13:51 UTC (permalink / raw)
  To: Klaus Robert Suetterlin; +Cc: git
In-Reply-To: <20050419123631.GD3739@xdt04.mpe-garching.mpg.de>

Klaus Robert Suetterlin wrote:
> 1) There is no clear (e.g. by name) distinction between ``git as done
> by Linus'', which is a kind of content addressable database with added
> semantics, and ``git as done by the rest of You'', which is a kind of
> SCM on top of Linus's stuff.

I also see this as one of the biggest obstacles right now. It would be
very helpful if we could achieve the clear separation between git and
non-git that has been part of the design since the beginning.

Git is very immature, and currently should only be used by brave
pioneers. About the only way for a mortal to even try git is to stick to
git-pasky releases, and not try to track all the patches flying around.

> Linus must have had an idea of the final product, and how to use
> that.  The real day to day workflow.  

The best description I have seen so far is the README for git-pasky:
  http://pasky.or.cz/~pasky/dev/git/

It's not a bad mini-tutorial, really.

> I really believe a lot of questions on the git mailing list could
> be answered if there was a user manual and a reference for git.
> Even before all of it will be implemented.

As you know, documentation is a great way for non-coders to contribute
to a project. Until someone steps up to write it, it won't happen.

In a highly iterative development process like this one, it actually
doesn't make sense to write the docs first. You really don't know how it
should work until you code it, play with it, and then realize it should
be doing something different.

> The list of dependencies is long and growing.  So if the intent of
> doing gitSCM with shell scripts was to make it portable: that goal was missed.

I think the main goal was rapid implementation. I totally expect there
to be one or several wrappers, written in various languages, that will
eventually replace git-pasky.

> Still gitLinus lacks a clear definition of its interface, so I
> guess no one will be able to tell if it works correctly.  How could You
> do a test case without knowing
> a) what the software should do and
> b) how You should tell it?

I agree that it would be nice to have automated unit tests.

> And of course there are still memory leaks.  

The code is still young, and these will be fixed. I'm not bothered that
there are leaks at this moment. I am a bit bothered by Linus's attitude
that some small leaks might not ever need to be fixed.

Kevin

^ permalink raw reply

* Re: naive question
From: Ingo Molnar @ 2005-04-19 13:51 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Paul Mackerras, git
In-Reply-To: <1113916741.4166.0.camel@localhost.localdomain>


* David Woodhouse <dwmw2@infradead.org> wrote:

> On Tue, 2005-04-19 at 23:00 +1000, Paul Mackerras wrote:
> > Is there a way to check out a tree without changing the mtime of any
> > files that you have already checked out and which are the same as the
> > version you are checking out?  It seems that checkout-cache -a doesn't
> > overwrite any existing files, and checkout-cache -f -a overwrites all
> > files and gives them the current mtime.  This is a pain if you are
> > using make and your tree is large (like, for instance, the linux
> > kernel :), because it means that after a checkout-cache -f -a you get
> > to recompile everything.
> 
> Corollary: why aren't we storing mtime in the tree objects?

Check the "[bug] git: check-files mtime problem?" thread - i noticed 
this problem before and gave a few suggestions but the discussion got 
nowhere. But the problem is still very much present.

	Ingo

^ permalink raw reply

* [script] ge: export commits as patches
From: Ingo Molnar @ 2005-04-19 13:48 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

is there any 'export commit as patch' support in git-pasky? I didn't find 
any such command (maybe it got added meanwhile), so i'm using the 'ge' 
hack below.

e.g. i typically look at commits via 'git log', and then when i see 
something interesting, i look at the commit via the 'ge' script. E.g.  
"ge 834f6209b22af2941a8640f1e32b0f123c833061" done in the kernel tree 
will output a particular commit's header and the patch.

	Ingo

#!/bin/bash
# ge: print a commit's header, then the diff against its (first) parent.

usage() { echo 'ge <commit-ID>' >&2; exit 1; }

# Print one header field (tree or parent) of a commit object.
field() {
 cat-file commit "$1" 2>/dev/null | head -4 | grep "^$2 " | cut -d' ' -f2 | head -1
}

[ $# -eq 1 ] || usage
TREE1=$(field "$1" tree);       [ -n "$TREE1" ]  || usage
PARENT=$(field "$1" parent);    [ -n "$PARENT" ] || usage
TREE2=$(field "$PARENT" tree);  [ -n "$TREE2" ]  || usage

cat-file commit "$1"
echo
git diff -r "$TREE2:$TREE1"
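
The header parsing that the script does with head, grep and cut can also be sketched in Python. This is an illustration only; parse_commit and the sample text are made up, and it assumes the layout the script relies on: 'tree' and 'parent' lines at the top of the commit object, with a blank line before the free-form message.

```python
def parse_commit(text):
    """Parse the header of a commit object into (tree, parents)."""
    tree, parents = None, []
    for line in text.splitlines():
        if line == "":
            break                  # a blank line ends the header
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value
        elif key == "parent":
            parents.append(value)  # a merge has more than one parent line
    return tree, parents


# Hypothetical commit text; the ids and the author are placeholders.
sample = (
    "tree 1111111111111111111111111111111111111111\n"
    "parent 2222222222222222222222222222222222222222\n"
    "author A U Thor <author@example.com> 1113918480 +0200\n"
    "committer A U Thor <author@example.com> 1113918480 +0200\n"
    "\n"
    "parent of nothing: this line belongs to the message\n"
)

print(parse_commit(sample))
```

Stopping at the blank line keeps a message line that happens to start with "parent" from being mistaken for a header field, which the head -4 in the script guards against in a cruder way.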

^ permalink raw reply

* Re: naive question
From: David Woodhouse @ 2005-04-19 13:19 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <16997.222.917219.386956@cargo.ozlabs.ibm.com>

On Tue, 2005-04-19 at 23:00 +1000, Paul Mackerras wrote:
> Is there a way to check out a tree without changing the mtime of any
> files that you have already checked out and which are the same as the
> version you are checking out?  It seems that checkout-cache -a doesn't
> overwrite any existing files, and checkout-cache -f -a overwrites all
> files and gives them the current mtime.  This is a pain if you are
> using make and your tree is large (like, for instance, the linux
> kernel :), because it means that after a checkout-cache -f -a you get
> to recompile everything.

Corollary: why aren't we storing mtime in the tree objects?

-- 
dwmw2


^ permalink raw reply

* naive question
From: Paul Mackerras @ 2005-04-19 13:00 UTC (permalink / raw)
  To: git

Is there a way to check out a tree without changing the mtime of any
files that you have already checked out and which are the same as the
version you are checking out?  It seems that checkout-cache -a doesn't
overwrite any existing files, and checkout-cache -f -a overwrites all
files and gives them the current mtime.  This is a pain if you are
using make and your tree is large (like, for instance, the linux
kernel :), because it means that after a checkout-cache -f -a you get
to recompile everything.
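
One way to get the behaviour asked for here is to leave a file untouched whenever its content already matches. The Python sketch below shows the idea only; it is not how checkout-cache works, write_if_changed is a made-up name, and comparing full file contents is a simplifying assumption (a real tool would more likely compare hashes or cached stat data).

```python
import os
import tempfile


def write_if_changed(path, data):
    """Write data to path only if the content differs; return True if written."""
    try:
        with open(path, "rb") as f:
            if f.read() == data:
                return False       # identical: leave the file and its mtime alone
    except FileNotFoundError:
        pass                       # no file yet, so it has to be written
    with open(path, "wb") as f:
        f.write(data)
    return True


with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "Makefile")
    write_if_changed(p, b"all:\n")
    os.utime(p, (1000000000, 1000000000))      # pretend the file is old
    unchanged = write_if_changed(p, b"all:\n")
    mtime_kept = os.stat(p).st_mtime == 1000000000
    changed = write_if_changed(p, b"all: build\n")

print(unchanged, mtime_kept, changed)
```

With this rule make only sees new timestamps on files whose content really changed, so a checkout over an existing tree does not force a full rebuild.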

Paul.

^ permalink raw reply

* Re: space compression (again)
From: Martin Uecker @ 2005-04-19 12:39 UTC (permalink / raw)
  To: git; +Cc: Martin Uecker
In-Reply-To: <20050416173702.GA12605@macavity>

On Sat, Apr 16, 2005 at 07:37:02PM +0200, Martin Uecker wrote:
> On Sat, Apr 16, 2005 at 11:11:00AM -0400, C. Scott Ananian wrote:
 
> > The rsync approach does not use fixed chunk boundaries; this is necessary 
> > to ensure good storage reuse for the expected case (ie; inserting a single 
> > line at the start or in the middle of the file, which changes all the 
> > chunk boundaries).
> 
> Yes. The chunk boundaries should be determined deterministically
> from local properties of the data. Use a rolling checksum over
> some small window and split the file if it hits a special value (0).
> This is what the rsyncable patch to zlib does.

This is certainly uninteresting for source code repositories
but for people who manage repositories of rsyncable binary
packages this would save a lot of space, bandwidth and
cpu time (compared to rsync because the scanning phase is
not necessary anymore). 
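
The chunking rule quoted above (a rolling checksum over a small window, split when it hits a special value) can be sketched as follows. This is a toy Python illustration, not the rsyncable zlib patch: the plain byte sum, the window of 4 and the modulus of 16 are assumptions chosen to keep the example small.

```python
import random


def chunk(data, window=4, modulus=16):
    """Split bytes wherever the sum of the last `window` bytes is 0 mod `modulus`."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]        # slide the window forward
        # Only decide once a full window is available, so a boundary
        # depends on local data and not on the distance from the start.
        if i + 1 >= window and rolling % modulus == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])            # trailing partial chunk
    return chunks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(2000))
edited = b"X" + original                       # insert one byte at the very start

a = chunk(original)
b = chunk(edited)
print(len(a), len(b), len(set(a) & set(b)))
```

Because each boundary depends only on the bytes inside the window, inserting a byte at the start disturbs only the first chunk or two; every later chunk is byte-for-byte identical and could be stored once.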

Martin

-- 
One night, when little Giana from Milano was fast asleep,
she had a strange dream.


^ permalink raw reply

* missing: git api, reference, user manual and mission statement
From: Klaus Robert Suetterlin @ 2005-04-19 12:36 UTC (permalink / raw)
  To: git

Dear all,

please don't bother me with ``read the source dude'' or similar answers to this post.  If its tone or contents just piss You off, ignore it.

I read a little about git lately, and tried to get it running the
last two days.  I found the following things lacking:

1) There is no clear (e.g. by name) distinction between ``git as done
by Linus'', which is a kind of content addressable database with added
semantics, and ``git as done by the rest of You'', which is a kind of
SCM on top of Linus's stuff.

2) For Linus's stuff I dare to say that it is an evil hack from
hell.  A prototype come alive.  This is not meant as an insult;  I
guess Linus agrees.

What it misses the most is a written reasoning about the WHYs, HOWs and
WHATFORs of git.  There is the README which tells us a little about
the WHAT on a level just above source code.

Linus must have had an idea of the final product, and how to use
that.  The real day to day workflow.  From that, and from his
experience with BK and the glimpse he took at monotone, he deduced
what limits the backend of his new distributed SCM system should have.
That is what he implemented:  A storage backend for an SCM.
(Unfortunately he didn't tell us for which.  This is like having
the answer without the question.  That is gitLinus is just like
``42''.)

Unfortunately what this storage backend does not have is an API or
UI definition.  I.e. there is no definition of git interaction
except for the git source code and the application on BLOB, TREE,
CHANGESET as described in the README.

I do think there should be a well defined API or UI so that the
backend could be replaced / changed / improved as need dictates.
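
To make the request concrete, a backend interface for a content-addressable store could be as small as the Python sketch below. It is purely hypothetical: ObjectStore, put and get are invented names, the store lives in memory, and keying on the plain SHA-1 of the bytes is an assumption that ignores whatever headers or compression the real backend applies.

```python
import hashlib


class ObjectStore:
    """A toy in-memory content-addressable store: the key is the SHA-1 of the content."""

    def __init__(self):
        self._objects = {}

    def put(self, data):
        """Store data and return the id derived from its content."""
        key = hashlib.sha1(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key):
        """Return the data stored under key; raises KeyError if absent."""
        return self._objects[key]


store = ObjectStore()
first = store.put(b"hello\n")
second = store.put(b"hello\n")     # same content, so the same id
print(first == second, store.get(first))
```

With an interface of this size written down, a different storage implementation could be swapped in without touching the callers.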

3) As for the gitSCM stuff, I really miss any kind of description
how it works.  That is, it completely lacks any concept, except for
``we will use gitLinus as backend''.

Take a look at some other distributed SCM (e.g. monotone -- which
might be too slow for a project like the kernel) and see how much these
people think about usage.  Do not reinvent the wheel, there is prior
art for use cases, too!

Some examples are:
1) What does the typical usage look like?
2) What is a version?
3) What is fork? (Especially in the context of a distributed SCM.)
4) What is a branch?
5) Which questions do we want the SCM to answer?
6) What is our security model?
7) How do we synchronise?  (Not what command do we use, i.e.
   ``rsync'', but what is the operation, e.g. ``full replication of
   state''.)

I really believe a lot of questions on the git mailing list could
be answered if there was a user manual and a reference for git.
Even before all of it will be implemented.

4) Concerning usability on systems other than Linux...  I guess
this one can be ignored by most.

The source still uses st->st_mtim.tv_nsec which should be ->st_mtimensec, I guess.

git is implemented as mostly sh shell scripts.
gitdiff-do and gitlog.sh rely on bash, more precisely on /bin/bash.
git pull uses rsync
...

The list of dependencies is long and growing.  So if the intent of
doing gitSCM with shell scripts was to make it portable: that goal was missed.

5) gitLinus as library.

First I have to say that between what I saw in git-0.04 and the
current stuff from git-pasky there has been quite a lot of work to
get further away from the evil prototype.

Still gitLinus lacks a clear definition of its interface, so I
guess no one will be able to tell if it works correctly.  How could You
do a test case without knowing
a) what the software should do and
b) how You should tell it?

And of course there are still memory leaks.  The obvious
--- i.e. malloc and (missing) free in the same function --- I found
while reading the git-0.04 source yesterday are gone.  Still I found
one of the ``malloc in called function no free in caller'' leaks
in git-pasky as pulled NOW.  And all I did was `grep malloc *'.
Someone should sit down and read all the source top to bottom.  And
the software should either check its resource usage or someone
should use a good tool on it.

Thanks for Your time and patience,

--Robert Suetterlin (robert@mpe.mpg.de)
phone: (+49)89 / 30000-3546   fax: (+49)89 / 30000-3950

^ permalink raw reply

* Re: [darcs-devel] Darcs and git: plan of action
From: Petr Baudis @ 2005-04-19 12:25 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: darcs-devel, Git Mailing List
In-Reply-To: <7i4qe3x8ig.fsf@lanthane.pps.jussieu.fr>

Dear diary, on Tue, Apr 19, 2005 at 02:20:55PM CEST, I got a letter
where Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> told me that...
> > The problem is that there is no sequence of alien versions that one can
> > differentiate.  Git has a branched history, with each version that follows
> > a merge having multiple parents.
> 
> Yep.  I've just realised that this morning.  Is there some notion of
> ``primary parent'' as in Arch?  Can a changeset have 0 parents?

Yes, the root commit. Usually, there is only one, but there may be
multiple of them theoretically.
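
A small model of such a history, as a hedged Python sketch: the mapping from commit to parent list and the name roots are made up for illustration, and the single-letter ids stand in for real commit ids.

```python
def roots(parents_of):
    """Return the commits that have no parents."""
    return sorted(c for c, ps in parents_of.items() if not ps)


history = {
    "a": [],            # root commit
    "b": ["a"],         # ordinary commit with one parent
    "x": [],            # a second, unrelated root
    "m": ["b", "x"],    # merge commit with two parents
}

print(roots(history))
```

Here the merge "m" joins two lines of development that each start from their own root, which is how a repository can end up with more than one commit that has 0 parents.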

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* Re: Darcs and git: plan of action
From: Juliusz Chroboczek @ 2005-04-19 12:20 UTC (permalink / raw)
  To: darcs-devel, Git Mailing List
In-Reply-To: <20050419110407.GB28269@abridgegame.org>

[Removing Linus from CC, keeping the Git list -- or should we remove it?]

> I'm not clear why it would be necessary, and it takes the only immutable
> piece of information regarding a patch, and makes it variable.

Er... I'm not suggesting to make it variable, just to make it an
opaque blob of bytes (still immutable).  I see from the examples you
give below that you agree that the format needs extending, so I
suspect we're actually agreeing here, just failing to communicate.

about having multiple ids per patch:

> Or alternatively, we could have a one-to-one mapping between git IDs and
> darcs IDs, which is what I'd do.

Okay, you've convinced me.  It's much simpler that way, we'll see how
well it works.

> The problem is that there is no sequence of alien versions that one can
> differentiate.  Git has a branched history, with each version that follows
> a merge having multiple parents.

Yep.  I've just realised that this morning.  Is there some notion of
``primary parent'' as in Arch?  Can a changeset have 0 parents?

> If we do it right (automatically tagging like crazy people), darcs
> users between themselves can cherry-pick all they like, without
> introducing inconsistencies or losing interoperability with git.

You've lost me here.  How can you cherry-pick if every tag depends on
the preceding patches?  Or are you thinking of pulling just the patch
and not the tag -- in that case, what happens when you push to git a
Darcs patch that depends on a patch that originated with git?

I've started interfacing Haskell with git this week-end, that's
something we'll need whichever model we choose.  We should be able to
start playing with actually modifying Darcs after next week-end.

                                        Juliusz

^ permalink raw reply

* Re: [darcs-devel] Darcs and git: plan of action
From: David Roundy @ 2005-04-19 11:05 UTC (permalink / raw)
  To: Ray Lee; +Cc: Kevin Smith, git, darcs-devel
In-Reply-To: <1113874931.23938.111.camel@orca.madrabbit.org>

On Mon, Apr 18, 2005 at 06:42:11PM -0700, Ray Lee wrote:
> On Mon, 2005-04-18 at 21:05 -0400, Kevin Smith wrote:
> > You could guess, but that's not good enough for darcs to be able to
> > reliably commute the patches later.
>
> Who said anything about guessing? If a user replaces all instances of
> foo with bar, that's as close to proof as you can ever get, without
> recording intent of the user at the time it's done. Now, I realize that
> darcs *does* record intent, but I claim that's immaterial.

The problem is, how do you know how to define a token? That's also included
in a darcs patch.  And a darcs user may choose not to use a replace patch,
if (for example) he's renaming a local variable, since he might not want to
mess with other functions in the same file.

Guessing the author's intent cannot reliably reproduce the author's stated
intent.  Either we need to include that information in one form or another
(and in one location or another), or we've got to simply disallow replaces
(and moves?) when interacting with git.
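
As an illustration of the token point, here is a minimal Python sketch. It is
not darcs code: the helper `token_replace` and the sample line are
hypothetical, and the only thing taken from the discussion is that a replace
patch carries its own definition of what a token is.

```python
import re

def token_replace(text, token_chars, old, new):
    """Replace every whole token equal to `old` with `new`.

    A token is a maximal run of characters from `token_chars`, which is
    written the way it would appear inside a regex character class.
    """
    pattern = re.compile("[" + token_chars + "]+")
    return pattern.sub(lambda m: new if m.group(0) == old else m.group(0), text)

source = "foo = foo-bar + foo_baz"

# With identifier characters only, "foo-bar" splits into "foo" and "bar".
print(token_replace(source, "_a-zA-Z0-9", "foo", "bar"))

# If "-" also counts as a token character, "foo-bar" is one token and is kept.
print(token_replace(source, "_a-zA-Z0-9-", "foo", "bar"))
```

The same "replace foo with bar" request gives two different results, so the
before and after trees alone do not say which token definition was intended.
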
-- 
David Roundy
http://www.darcs.net

* Re: [darcs-devel] Darcs and git: plan of action
From: David Roundy @ 2005-04-19 11:04 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: darcs-devel, Linus Torvalds, Git Mailing List
In-Reply-To: <7iy8bf7fh2.fsf@lanthane.pps.jussieu.fr>

On Tue, Apr 19, 2005 at 02:55:05AM +0200, Juliusz Chroboczek wrote:
> [Using git as a backend for Darcs.]
...
> >>  1. remove the assumption that patch IDs have a fixed format.  Patch
> >>  IDs should be opaque blobs of binary data that Darcs only compares
> >>  for equality.
> 
> > I'm not really comfortable with this,
> 
> Why?

I'm not clear why it would be necessary, and it takes the only immutable
piece of information regarding a patch, and makes it variable.  Just seems
dangerous and complicated, and I'm not sure why we'd need to do it.

> Suppose I record a patch in Darcs; it gets a Darcs id.  I push it into
> git, at which point it gets a git id, whether we want it to or not.
> What do we do when we pull that patch back into darcs?
> 
> Either we arbitrarily discard one of the ids (which one?), or we keep
> both.  If there's more pulling/pushing going on on the git side, we
> definitely need to keep both.

Or alternatively, we could have a one-to-one mapping between git IDs and
darcs IDs, which is what I'd do.

> > I think when dealing with git (and probably also with *any* other SCM,
> > arch being a possible exception), we need to consider the exchange
> > medium to be not a patch, but a tag.
> 
> We're thinking in opposite directions -- you're thinking of the alien
> versions as integrals of Darcs patches, I'm thinking of Darcs patches
> as derivatives of alien versions.
> 
>   You:  alien version = Darcs tag
> 
>   Me:   Darcs patch = pair of successive alien versions
> 
> My gut instinct is that the second model can be made to work almost
> seamlessly, unlike the first one.  But that's just a guess.

The problem is that there is no sequence of alien versions that one can
differentiate.  Git has a branched history, with each version that follows
a merge having multiple parents.  How do you define that change?  It's easy
enough to do if we tag each git version in darcs, since we know what the
two parents are, and we know what the final state is, but there *is* no
translation from a single git ID either to a single patch(1) patch, or to a
single darcs patch--unless you treat its parents as tags.
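
A toy Python sketch of the multiple-parents problem. The commit names, the
dict-based trees and the helpers `diff` and `changes` are all hypothetical;
this models only the shape of the history, not git's actual object format.

```python
# Toy model: a tree maps path -> content; a commit has a tree and parents.
commits = {
    "base":  {"tree": {"a.txt": "1"}, "parents": []},
    "left":  {"tree": {"a.txt": "1", "b.txt": "2"}, "parents": ["base"]},
    "right": {"tree": {"a.txt": "1", "c.txt": "3"}, "parents": ["base"]},
    "merge": {"tree": {"a.txt": "1", "b.txt": "2", "c.txt": "3"},
              "parents": ["left", "right"]},
}

def diff(old, new):
    """Return {path: (old_content, new_content)} for every differing path."""
    paths = sorted(set(old) | set(new))
    return {p: (old.get(p), new.get(p)) for p in paths if old.get(p) != new.get(p)}

def changes(commit_id):
    """Return one diff per parent; a merge therefore yields several."""
    commit = commits[commit_id]
    return {parent: diff(commits[parent]["tree"], commit["tree"])
            for parent in commit["parents"]}

for parent, change in changes("merge").items():
    print(parent, change)
```

The merge commit has a different diff against each parent, so there is no
single "pair of successive versions" to turn into one patch.
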

The key is that we can't make git work like darcs, so we'll have to make
darcs work like git.  If we do it right (automatically tagging like crazy
people), darcs users between themselves can cherry-pick all they like,
without introducing inconsistencies or losing interoperability with git.

To summarize how I'd see the mapping between git information and darcs, a
git commit would be composed of one darcs patch and one darcs tag.  With
this mapping, I don't believe we lose any information, and I believe we'll
be able to (except that patches would have to be uniquely determined by a
pair of trees) simply translate the darcs system right back again, since
it's a one-to-one correspondence of information.

My proposed mapping:

tree 6ff0e9f3d131bd110d32829f0b14f07da8313c45
# This is a darcs tag ID
parent abd62b9caee377595a9bf75f363328c82a38f86e
# This is the context of both a patch and tag.
author James Bottomley <James.Bottomley@SteelEye.com> 1113879319 -0700
# This is the author and date of the patch
committer Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113879319 -0700
# This is the author and date of the tag
# Everything below would be the name and long comment of the patch

[PATCH] SCSI trees, merges and git status

Doing the latest SCSI merge exposed two bugs in your merge script:

1) It doesn't like a completely new directory (the misc tree contains a
   new drivers/scsi/lpfc)
2) the merge testing logic is wrong.  You only want to exit 1 if the
   merge fails. 
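
The proposed mapping can be sketched in Python as follows. This is only an
illustration under assumptions: the function names and the dictionary layout
are hypothetical, the sample commit text omits the "#" annotation lines used
above, and nothing here is darcs or git code.

```python
RAW = """tree 6ff0e9f3d131bd110d32829f0b14f07da8313c45
parent abd62b9caee377595a9bf75f363328c82a38f86e
author James Bottomley <James.Bottomley@SteelEye.com> 1113879319 -0700
committer Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113879319 -0700

[PATCH] SCSI trees, merges and git status

Doing the latest SCSI merge exposed two bugs in your merge script:
"""

def parse_commit(raw):
    """Split commit text into header fields and the free-form message."""
    headers, _, message = raw.partition("\n\n")
    fields = {"parent": []}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)  # a merge has several parents
        else:
            fields[key] = value
    return fields, message

def to_darcs(raw):
    """Map one commit to one (patch, tag) pair, following the proposal."""
    fields, message = parse_commit(raw)
    name, _, long_comment = message.partition("\n")
    context = list(fields["parent"])  # shared context of patch and tag
    patch = {"context": context, "author_date": fields["author"],
             "name": name, "long_comment": long_comment.strip()}
    tag = {"id": fields["tree"], "context": context,
           "author_date": fields["committer"]}
    return patch, tag

patch, tag = to_darcs(RAW)
print(patch["name"])
print(tag["id"])
```
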

-- 
David Roundy
http://www.darcs.net

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Petr Baudis @ 2005-04-19 10:50 UTC (permalink / raw)
  To: Martin Schlemmer; +Cc: David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <1113905110.1262.1.camel@nosferatu.lan>

Dear diary, on Tue, Apr 19, 2005 at 12:05:10PM CEST, I got a letter
where Martin Schlemmer <azarah@nosferatu.za.org> told me that...
> On Tue, 2005-04-19 at 11:28 +0200, Petr Baudis wrote:
> > Dear diary, on Tue, Apr 19, 2005 at 11:18:55AM CEST, I got a letter
> > where David Greaves <david@dgreaves.com> told me that...
> >
> > Dunno. I do it personally all the time, with git at least.
> > 
> > What do others think? :-)
> > 
> 
> I think pull is pull.  If you are doing lots of local stuff and do not
> want it overwritten, it should have been in a forked branch.

I disagree. This already forces you to have two branches (one to pull
from to get the data, mirroring the remote branch, one for your real
work) uselessly and needlessly.

I think there is just no good name for what pull is doing now, and
update seems like a great name for what pull-and-merge really is. Pull
really is pull - it _pulls_ the data, while update also updates the
given tree. No surprises.

(We should obviously have also update-without-pull but that is probably
not going to be so common so a parameter for update (like -n) should be
fine for that.)
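
A toy Python model of the proposed split, to pin down the semantics. The
`Repo` class is hypothetical, "merge" is reduced to a set union, and the
`pull=False` argument stands in for the suggested -n parameter; none of this
is how git-pasky is implemented.

```python
class Repo:
    def __init__(self):
        self.objects = set()  # data downloaded so far
        self.tree = set()     # what the working tree currently reflects

    def pull(self, remote):
        """Download only: fetch the remote's objects, leave the tree alone."""
        self.objects |= set(remote)

    def update(self, remote, pull=True):
        """Pull (unless pull=False), then merge the objects into the tree."""
        if pull:
            self.pull(remote)
        self.tree |= self.objects

repo = Repo()
repo.pull({"c1", "c2"})
print(sorted(repo.objects), sorted(repo.tree))  # data present, tree untouched

repo.update({"c3"}, pull=False)
print(sorted(repo.objects), sorted(repo.tree))  # tree merged, c3 not fetched

repo.update({"c3"})
print(sorted(repo.objects), sorted(repo.tree))  # fetched and merged
```
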

These naming issues may appear silly but I think they matter big time
for usability, intuitiveness, and learning curve (I don't want git-pasky
to become another GNU arch).

Kind regards,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: [darcs-devel] Darcs and git: plan of action
From: David Roundy @ 2005-04-19 10:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: darcs-devel, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504180832330.7211@ppc970.osdl.org>

On Mon, Apr 18, 2005 at 08:38:25AM -0700, Linus Torvalds wrote:
> On Mon, 18 Apr 2005, David Roundy wrote:
> > .... In particular, it would make life (that is, life interacting back
> > and forth with git) easier if we were to embed darcs patches in their
> > entirety in the git comment block.
> 
> Hell no.

I was afraid that would be the response...

> The commit _does_ specify the patch uniquely and exactly, so I really 
> don't see the point. You can always get the patch by just doing a
> 
> 	git diff $parent_tree $thistree
> 
> so putting the patch in the comment is not an option.

The issue is that in darcs the parent and child trees *don't* uniquely or
exactly specify the patch.  In fact, even the output of git diff will
depend on what version of diff you're using (e.g. if someone were to use
BSD diff rather than GNU diff).

> > As I say, it's a bit ugly, and before we explore the idea further, it would
> > be nice to know if this would cause Linus to vomit in disgust and/or refuse
> > patches from darcs users.
> 
> That's definitely the case. I will _not_ be taking random files etc just 
> to keep other people's stuff straightened up.

Okay.

> > Another slightly less noxious possibility would be to store the darcs
> > patch as a "hidden" file, if git were given the concept of
> > commit-specific files.
> 
> No, git will not track commit-specific files. There's the comment
> section, and that _is_ the commit-specific file. But I will refuse to
> take any comments that aren't just human-readable explanations, together
> with maybe one extra line of
> 
> 	# Darcs ID: 780c057447d4feef015a905aaf6c87db894ff58c
> 
> (others will want to track _their_ PR numbers etc) and that's it. The 
> actual darcs data that that ID refers to can obviously be maintained in 
> _another_ git archive, but it's not one I'm going to carry about.

The trouble is that the philosophies of darcs and git are about as
orthogonal as they come.  Git treats the content as fundamental, whereas in
darcs the changes are fundamental.  Since in darcs there can be different
changes that lead from the same parent to the same child -- and these
differences are meaningful when merges happen -- when interacting with git,
we either need to restrict darcs to only describe changes in a way that can
be uniquely determined by a parent and child, or we need to have extra
metadata somewhere.

For bidirectional functionality, we either need to avoid the use of
advanced darcs features, or we need to include that information in git
somehow, or we need to keep a parallel darcs archive holding that
information.
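
A toy Python sketch of two different change descriptions that take the same
parent to the same child. The tuple-based change format and the helpers
`apply_change` and `apply_all` are hypothetical; darcs patches are far richer
than this.

```python
def apply_change(tree, change):
    """Apply one change to a tree (a dict path -> content), returning a copy."""
    tree = dict(tree)
    kind = change[0]
    if kind == "rename":
        _, old, new = change
        tree[new] = tree.pop(old)
    elif kind == "remove":
        del tree[change[1]]
    elif kind == "add":
        _, path, content = change
        tree[path] = content
    else:
        raise ValueError("unknown change kind: " + kind)
    return tree

def apply_all(tree, change_list):
    for change in change_list:
        tree = apply_change(tree, change)
    return tree

parent = {"foo": "int x;\n"}
as_rename = [("rename", "foo", "bar")]
as_rewrite = [("remove", "foo"), ("add", "bar", "int x;\n")]

# Both descriptions produce the same child tree ...
print(apply_all(parent, as_rename) == apply_all(parent, as_rewrite))
# ... yet they are different changes, which the pair of trees cannot tell apart.
print(as_rename == as_rewrite)
```
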

Would a small amount of human-readable change information be acceptable in
the free-form comment area? In the rename thread I got the impression this
would be okay for renames.  For example,

rename foo bar

or (this is less important, but you might consider it to be a useful
human-readable comment)

replace [_a-zA-Z0-9] old_variable new_variable file/path

Currently these two patch types account for almost the sum total of the
cases where different patches lead to the same resulting trees.
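
If such lines were allowed in the comment, reading them back could look like
this minimal Python sketch. The line formats are the two examples above; the
function `parse_hints`, the field names and the sample comment are
hypothetical, and no such convention has been agreed on.

```python
def parse_hints(comment):
    """Collect 'rename' and 'replace' hint lines from a commit comment."""
    hints = []
    for line in comment.splitlines():
        words = line.split()
        if len(words) == 3 and words[0] == "rename":
            hints.append({"kind": "rename", "old": words[1], "new": words[2]})
        elif len(words) == 5 and words[0] == "replace":
            hints.append({"kind": "replace", "token_chars": words[1],
                          "old": words[2], "new": words[3], "path": words[4]})
    return hints

comment = """Tidy up variable naming

rename foo bar
replace [_a-zA-Z0-9] old_variable new_variable file/path
"""

for hint in parse_hints(comment):
    print(hint)
```

Any ordinary sentence that happens to match one of these shapes would be
misread as a hint, which is one reason a real format would need more care.
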
-- 
David Roundy

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Martin Schlemmer @ 2005-04-19 10:05 UTC (permalink / raw)
  To: Petr Baudis; +Cc: David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050419092812.GE2393@pasky.ji.cz>

On Tue, 2005-04-19 at 11:28 +0200, Petr Baudis wrote:
> Dear diary, on Tue, Apr 19, 2005 at 11:18:55AM CEST, I got a letter
> where David Greaves <david@dgreaves.com> told me that...
>
> Dunno. I do it personally all the time, with git at least.
> 
> What do others think? :-)
> 

I think pull is pull.  If you are doing lots of local stuff and do not
want it overwritten, it should have been in a forked branch.

> I'm starting to like the pull/update distinction, and I think I'll go for it.
> 

-- 
Martin Schlemmer

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Petr Baudis @ 2005-04-19  9:28 UTC (permalink / raw)
  To: David Greaves; +Cc: dwheeler, Daniel Barkalow, git
In-Reply-To: <4264CCFF.30400@dgreaves.com>

Dear diary, on Tue, Apr 19, 2005 at 11:18:55AM CEST, I got a letter
where David Greaves <david@dgreaves.com> told me that...
> What's the most common thing to do? pull or update?

update for normal users.

> which is easier to type?
> what are people used to?

I think 'git up' is easier to type than 'git pull'. It's the CVS/SVN
tradition, though, probably not the BK tradition.

> I'm not sure but I suggest that pull and get would be better choices.
> 
> git pull
> git get

I don't like git get; it is something completely new - not in CVS/SVN
and means something completely different in BK, apparently.

> is it rare enough to justify:
> git --download-only pull

Dunno. I do it personally all the time, with git at least.

What do others think? :-)

I'm starting to like the pull/update distinction, and I think I'll go for it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: David Greaves @ 2005-04-19  9:18 UTC (permalink / raw)
  To: dwheeler; +Cc: Petr Baudis, Daniel Barkalow, git
In-Reply-To: <42646967.9030903@dwheeler.com>

David A. Wheeler wrote:
> I propose changing "pull" to ONLY download, and "update" to pull AND merge.

> Why? It seems oddly inconsistent that "pull" sometimes merges
> in changes, but at other times it doesn't.
true

> I propose that there be two subcommands, "pull" and "update"
> (now that "update" isn't a reserved word again).
> A "git pull" ONLY downloads; a "git update" pulls AND merges.

What's the most common thing to do? pull or update?
which is easier to type?
what are people used to?

I'm not sure but I suggest that pull and get would be better choices.

git pull
git get

is it rare enough to justify:
git --download-only pull

David

-- 
