Git development
* Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)
From: David Woodhouse @ 2005-04-20 22:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: H. Peter Anvin, Git Mailing List, Chris Mason
In-Reply-To: <Pine.LNX.4.58.0504200731590.6467@ppc970.osdl.org>

On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote:
>         external-parent <commit-hash> <external-parent-ID>
>                 comment for this parent
> 
> and the nice thing about that is that now that information allows you to 
> add external parents at any point. 
> 
> Why do it like this? First off, I think that the "initial import" ends up
> being just one special case of the much more _generic_ issue of having
> patches come in from other source control systems 

This isn't about patches coming in from other systems -- it's about
_history_, and the fact that it's imported from another system is just
an implementation detail. It's git history now, and what we have here is
just a special case of wanting to prune ancient git history to keep the
size of our working trees down. You refer to this yourself...

> Secondly, we do need something like this for pruning off history anyway, 
> so that the tools have a better way of saying "history has been pruned 
> off" than just hitting a missing commit. 

Having a more explicit way of saying "history is pruned" than just a
reference to a missing commit is a reasonable request -- but I really
don't see how we can do that by changing the now-oldest commit object to
contain an 'external-parent' field. Doing that would change the sha1 of
the commit object in question, and then ripple through all the
subsequent commits.
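
A minimal sketch of that ripple effect, assuming a toy commit format (the
field layout, the ids "t0".."t2" and the external-parent value are made up
for illustration; this is not git's actual object encoding):

```python
import hashlib

def commit_id(tree, parents, message, extra=""):
    """Hash a toy commit: the id depends on every field, parent ids included."""
    body = "tree %s\n" % tree
    for p in parents:
        body += "parent %s\n" % p
    body += extra + "\n" + message + "\n"
    return hashlib.sha1(body.encode()).hexdigest()

def build_chain(root_extra=""):
    """Build root -> c1 -> c2 and return the three ids, oldest first."""
    root = commit_id("t0", [], "oldest kept commit", extra=root_extra)
    c1 = commit_id("t1", [root], "next commit")
    c2 = commit_id("t2", [c1], "latest commit")
    return root, c1, c2

before = build_chain()
# Add a hypothetical external-parent line to the oldest commit only.
after = build_chain(root_extra="external-parent abc123 bk-import\n")
changed = [a != b for a, b in zip(before, after)]
print(changed)
```

Only the oldest commit was edited, yet every later id differs as well,
because each commit hashes the id of its parent.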

Come this time next year, if I decide I want to prune anything older
than 2.6.40 from all the trees on my laptop, it has to happen _without_
changing the commit objects which occur after my arbitrarily-chosen
cutoff point.

If we want to have an explicit record of pruning rather than just
coping with a missing object, then I think we'd need to do it with an
external note to say "It's OK that commit XXXXXXXXXXX is missing".
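
One way such an external note could work, sketched under assumptions: the
object store is modelled as a plain dict of commit id to parent ids, and
`pruned_ok` is a hypothetical set of ids recorded as deliberately pruned.
Neither name comes from git itself.

```python
def walk_history(store, head, pruned_ok):
    """Walk parent links from head; return (visited ids, status string)."""
    visited, todo = [], [head]
    while todo:
        cid = todo.pop()
        if cid in visited:
            continue
        if cid not in store:
            if cid in pruned_ok:
                continue  # recorded as deliberately pruned, not an error
            return visited, "corrupt: missing " + cid
        visited.append(cid)
        todo.extend(store[cid])
    return visited, "ok"

store = {"c3": ["c2"], "c2": ["c1"]}  # c1 has been pruned locally
print(walk_history(store, "c3", pruned_ok={"c1"}))
print(walk_history(store, "c3", pruned_ok=set()))
```

The commit objects themselves are untouched, so their sha1 values stay the
same; only the side note decides whether a missing parent is an error.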

> Thirdly, I don't actually want my new tree to depend on a conversion of
> the old BK tree.
> 
> Two reasons: if it's a really full conversion, there are definitely going
> to be issues with BitMover. They do not want people to try to reverse
> engineer how they do namespace merges

Don't think of it as "a conversion of the old BK tree". It's just an
import of Linux's development history. This isn't going to help
reverse-engineer how BK does merges; it's just our own revision history.
I'm not sure exactly how Thomas is extracting it, but AIUI it's all
obtainable from the SCCS files anyway without actually resorting to
using BK itself. 

There's nothing here for Larry to worry about. It's not as if we're
actually using BK to develop git by observing BK's behaviour w.r.t
merges and trying to emulate it. Besides -- if we wanted to do that,
we'd need to use the _BK_ version of the tree; the git version wouldn't
help us much anyway.

And given that BK's merges are based on individual files and we're not
going that route with git, it's not clear how much we could lift
directly from BK even if we _were_ going to try that.

> The other reason is just the really obvious one: in the last week, I've
> already changed the format _twice_ in ways that change the hash. As long
> as it's 119MB of data, it's not going to be too nasty to do again.

That's fine. But by the time we settle on a format and actually start
using it in anger, it'd be good to be sure that it _is_ possible to
track development from current trees all the way back -- be that with
explicit reference to pruned history as you suggest, or with absent
parents as I still prefer.

> it's not that it's necessarily the wrong thing to do, but I think it
> is the wrong thing to do _now_.

OK, time for us to keep arguing over the implementation details of how
we prune history then :)

-- 
dwmw2



* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Petr Baudis @ 2005-04-20 22:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, git
In-Reply-To: <Pine.LNX.4.58.0504201503050.6467@ppc970.osdl.org>

Dear diary, on Thu, Apr 21, 2005 at 12:09:06AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> Yeah, yeah, it looks different from "cvs update", but dammit, wouldn't it 
> be cool to just write "cg-<tab><tab>" and see the command choices? Or 
> "cg-up<tab>" and get cg-update done for you..

I like this idea! :-) I guess that is in fact exactly what I have been
looking for, and (as probably apparent from the current git-pasky
structure) I prefer to have the scripts separated anyway.

I think I will go for it. I also thought about having this _and_ a 'cg'
command which would act as a completely dumb multiplexer, but I decided
to toss that idea since it would only create usage ambiguity and other
problems in the long run.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* chunking (Re: [ANNOUNCEMENT] /Arch/ embraces `git')
From: Linus Torvalds @ 2005-04-20 22:22 UTC (permalink / raw)
  To: C. Scott Ananian
  Cc: Petr Baudis, Tom Lord, gnu-arch-users, gnu-arch-dev,
	Git Mailing List, talli
In-Reply-To: <Pine.LNX.4.61.0504201754450.2630@cag.csail.mit.edu>



On Wed, 20 Apr 2005, C. Scott Ananian wrote:
> 
> I'm hoping my 'chunking' patches will fix this.  This ought to reduce the 
> size of the object store by (in effect) doing delta compression; rsync
> will then Do The Right Thing and only transfer the needed deltas.
> Running some benchmarks right now to see how well it lives up to this 
> promise...

What are the disk usage results? I'm on ext3, for example, which means that
even small files invariably take up 4.125kB on disk (with the inode).

Even uncompressed, most source files tend to be small. Compressed, I'm 
seeing the median blob size being ~1.6kB in my trivial checks. That's 
blobs only, btw.

My point being that about 75% of all blobs already take up less than the
minimal amount of space that most filesystems can sanely allocate. And I'm
_not_ going to say "you have to use reiserfs" with git.

So the disk fragmentation really does matter. It doesn't help to make a 
file smaller than 4kB, it hurts - while that can be offset by sharing 
chunks, it might not be.
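
The arithmetic behind this can be sketched as follows, assuming a 4096-byte
block and a 128-byte inode (which together give the 4.125kB figure above);
both sizes are assumptions about a typical ext3 setup, and `on_disk_bytes`
is a name made up for this illustration.

```python
def on_disk_bytes(size, block=4096, inode=128):
    """Bytes consumed by one file: whole blocks plus one inode.

    Simplification: even an empty file is charged one block.
    """
    blocks = max(1, -(-size // block))  # ceiling division
    return blocks * block + inode

whole = on_disk_bytes(1600)      # one ~1.6kB blob stored as a single file
split = 2 * on_disk_bytes(800)   # the same blob stored as two chunk files
print(whole, split)
```

Under these assumptions, splitting a small blob into two chunk files doubles
its footprint unless one of the chunks is shared with some other blob.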

Also, while network performance is important, so is the handshaking on
which objects to get. Lots of small objects potentially need lots of
handshaking to figure out _which_ of the objects to do.

		Linus


* on when to checksum
From: Tom Lord @ 2005-04-20 22:25 UTC (permalink / raw)
  To: git; +Cc: torvalds



Linus, 

I think you have made a mistake by moving the sha1 checksum from the
zipped form to the inflated form.  Here is why:

What you have set in motion with `git' is an ad-hoc p2p network for
sharing filesystem trees -- a global distributed filesystem.  I
believe your starter here has a good chance of taking off to be much,
much larger than just a tool for the kernel.

A subset of your work: blobs and blob databases, has much wider application
than just sharing trees:  Those parts of `git' can form a very solid 
foundation for many other applications as well.   To the extent `git'
succeeds in the context of the kernel, it will be invested in and
extended and generalized --- and the kernel project will benefit.
So don't ignore those wider applications even though they are not your
focus today: they will generate investment that feeds back to your project.

Your `git' is silent on transports and mirroring of blob databases --
tasks for scripting, sure -- but those elements won't be far behind.

Eventually, slinging around blobs as atomic elements
of payloads will become very common.

The blob handle (aka "address")/payload model of a blob db is very
clean and simple.   In a network of nodes speaking to one another
by exchanging blobs, I foresee a prominent need for intermediate
nodes that process blobs "blindly" and as quickly as possible.

Blob compression is mostly goofy if regarded just as a way to 
save on (diminishingly cheap) disk space but it is mostly 
sane if regarded as a way to cut the cost of network bandwidth
roughly in half.

Must intermediate nodes inflate the payloads passing through them
or which they cache just to validate them?   That's not a desirable outcome
for many obvious reasons.

There *are* concerns about checksumming zips: it is necessary to nail
down the zip process and make sure it is absolutely and permanently
deterministic for this application.   But *that* is the problem to 
solve, not avoid by moving what the checksum refers to.
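
The determinism concern can be seen in a small sketch, assuming Python's
standard zlib and hashlib modules stand in for whatever deflate settings a
given node uses (the sample data is made up):

```python
import hashlib
import zlib

data = b"int main(void) { return 0; }\n" * 50

# Same payload, two different compression settings.
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

same_zipped = hashlib.sha1(fast).hexdigest() == hashlib.sha1(best).hexdigest()
same_inflated = (hashlib.sha1(zlib.decompress(fast)).hexdigest()
                 == hashlib.sha1(zlib.decompress(best)).hexdigest())
print(same_zipped, same_inflated)
```

A checksum over the zipped form identifies a blob only if every writer
produces byte-identical compressed output, which is exactly the part that
would have to be nailed down.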

Thanks,
-t


* Re: Git hangs while executing commit-tree
From: David Greaves @ 2005-04-20 22:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rhys Hardwick, git
In-Reply-To: <Pine.LNX.4.58.0504201446510.6467@ppc970.osdl.org>

Linus Torvalds wrote:
> 
> On Wed, 20 Apr 2005, Rhys Hardwick wrote:
> 
>>rhys@metatron:~/repo/tmp.repo$ commit-tree  c80156fafbac377ab35beb076090c8320f874f91
>>Committing initial tree c80156fafbac377ab35beb076090c8320f874f91
>> 
>>At this point, the command seems to be just waiting.
> 
> 
> That's _exactly_ what it's doing. It's waiting for you to write a commit 
> message.
> 
> Something like
> 
> 	This is my initial commit of Hello World!
> 	^D
> 
> will make it happy.
> 
> Alternatively, you can certainly just write your message beforehand with 
> an editor and just pipe it into commit-tree.
> 
> 			Linus

When someone commits the docs I'll submit the next patch for the README:

commit-tree
	commit-tree <sha1> [-p <parent sha1>...] < changelog

Creates a new commit object based on the provided tree object and
emits the new commit object id on stdout. If no parent is given then
it is considered to be an initial tree.

A commit comment is read from stdin (max 999 chars)

A commit object usually has 1 parent (a commit after a change) or 2
parents (a merge) although there is no reason it cannot have more than
2 parents.

While a tree represents a particular directory state of a working
directory, a commit represents that state in "time", and explains how
to get there.

Normally a commit would identify a new "HEAD" state, and while git
doesn't care where you save the note about that state, in practice we
tend to just write the result to the file ".git/HEAD", so that we can
always see what the last committed state was.

Options

<sha1>
	An existing tree object

-p <parent sha1>
	Each -p indicates the id of a parent commit object.
	

Commit Information

A commit encapsulates:
	all parent object ids
	author name, email and date
	committer name and email and the commit time.

If not provided, commit-tree uses your name, hostname and domain to
provide author and committer info. This can be overridden using the
following environment variables.
	AUTHOR_NAME
	AUTHOR_EMAIL
	AUTHOR_DATE
	COMMIT_AUTHOR_NAME
	COMMIT_AUTHOR_EMAIL
(nb <,> and CRs are stripped)
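
A rough sketch of the override rule, not commit-tree's actual code: the
"Name <email>" output form, the helper names, and the reading of "CRs" as
both carriage returns and newlines are assumptions made for illustration.

```python
def clean(value):
    """Strip '<', '>' and line breaks from a name or email value."""
    return "".join(ch for ch in value if ch not in "<>\r\n")

def author_info(env, default_name, default_email):
    """Prefer AUTHOR_NAME / AUTHOR_EMAIL from env, else fall back to defaults."""
    name = clean(env.get("AUTHOR_NAME", default_name))
    email = clean(env.get("AUTHOR_EMAIL", default_email))
    return "%s <%s>" % (name, email)

print(author_info({"AUTHOR_NAME": "A <b>\n"}, "login", "login@host.domain"))
print(author_info({}, "login", "login@host.domain"))
```

Passing the environment in as a dict (for example `os.environ`) keeps the
fallback logic easy to test without touching the real environment.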

see also: write-tree

David
-- 


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Steven Cole @ 2005-04-20 22:07 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Petr Baudis, greg, git
In-Reply-To: <20050420145419.6412414f.rddunlap@osdl.org>

Randy.Dunlap wrote:
> On Wed, 20 Apr 2005 23:51:18 +0200 Petr Baudis wrote:
> 
> | Dear diary, on Wed, Apr 20, 2005 at 11:19:19PM CEST, I got a letter
> | where Greg KH <greg@kroah.com> told me that...
> | > On Wed, Apr 20, 2005 at 10:56:33PM +0200, Petr Baudis wrote:
> | > >   The short command version will change from 'git' to 'cg', which should
> | > > be shorter to type and free the 'git' command for possible eventual
> | > > entry gate for the git commands (so that they are more
> | > > namespace-friendly, and it might make most sense anyway if we get fully
> | > > libgitized; but this is more of long-term ideas).
> | > 
> | > Hm, but there already is a 'cg' program out there:
> | > 	http://uzix.org/cgvg.html
> | > I use it every day :(
> | > 
> | > How about 'cog' instead?
> | 
> | Grm. Cg is also name of some scary NVidia thing, and cog is GNOME
> | Configurator. CGT are Chimera Grid Tools, but I think we can clash
> | with those - at least *I* wouldn't mind. ;-)
> 
> I'd rather see you go back to 'tig'...
> 
> is there a tig out there?
> 
> ---
> ~Randy

Since I was the one who came up with the "cogito" name, I'll suggest
some alternatives if cogito is unworkable.  This was posted once before,
mostly as a joke, but here goes.

agitato  ag     Since Beethoven's "Moonlight" 3rd mvmt is Presto agitato
                 and very, very fast, just like git.

legit le or lg  Since git is GPLv2, it's now legit.


Steven


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Linus Torvalds @ 2005-04-20 22:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Greg KH, git
In-Reply-To: <20050420215117.GJ19112@pasky.ji.cz>



On Wed, 20 Apr 2005, Petr Baudis wrote:
> 
> Grm. Cg is also name of some scary NVidia thing, and cog is GNOME
> Configurator. CGT are Chimera Grid Tools, but I think we can clash
> with those - at least *I* wouldn't mind. ;-)

I realize that there is probably a law that there has to be a space, but I 
actually personally use tab-completion all the time, and in many ways 
prefer a name that can be completed without having to play games with 
magic bash completion files.

So how about using a dash instead of a space, and making things be

	cg-pull
	cg-update

etc? You can link them all to the same script if you don't like having 
multiple scripts, and just match with

	case "$0" in
	*-pull)
		...
		;;
	*-update)
		...
		;;

or something.
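
The same "one script, many names" idea, sketched in Python rather than
shell: the handler is chosen from the name the program was invoked as. The
handler functions and the COMMANDS table are hypothetical placeholders.

```python
import os

def cmd_pull(args):
    return "pull " + " ".join(args)

def cmd_update(args):
    return "update " + " ".join(args)

COMMANDS = {"pull": cmd_pull, "update": cmd_update}

def dispatch(argv):
    """Pick a handler from the invoked name, e.g. cg-pull -> cmd_pull."""
    name = os.path.basename(argv[0])
    _, _, sub = name.rpartition("-")  # text after the last dash
    handler = COMMANDS.get(sub)
    if handler is None:
        return "unknown command: " + name
    return handler(argv[1:])

print(dispatch(["/usr/bin/cg-pull", "linus"]))
```

With every cg-* name hard-linked or symlinked to this one file, shell tab
completion works on the command names without any completion scripts.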

Yeah, yeah, it looks different from "cvs update", but dammit, wouldn't it 
be cool to just write "cg-<tab><tab>" and see the command choices? Or 
"cg-up<tab>" and get cg-update done for you..

Just because rcs/cvs/everybody-and-his-dog thinks it is cool to have a 
space there and have different meaning for flags depending on whether they 
are before the command or after the command doesn't mean that they are 
necessarily right..

Just an idea,

		Linus


* Re: [ANNOUNCEMENT] /Arch/ embraces `git'
From: C. Scott Ananian @ 2005-04-20 21:55 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Tom Lord, gnu-arch-users, gnu-arch-dev, git, talli, torvalds
In-Reply-To: <20050420213114.GF19112@pasky.ji.cz>

On Wed, 20 Apr 2005, Petr Baudis wrote:

> I think one thing git's objects database is not very well suited for is
> network transport. You want to have something smart doing the
> transports, comparing trees so that it can do some delta compression;
> that could probably reduce the amount of data needed to be sent
> significantly.

I'm hoping my 'chunking' patches will fix this.  This ought to reduce the 
size of the object store by (in effect) doing delta compression; rsync
will then Do The Right Thing and only transfer the needed deltas.
Running some benchmarks right now to see how well it lives up to this 
promise...
  --scott

terrorist AEROPLANE munitions PAPERCLIP MI5 Morwenstow WSHOOFS CABOUNCE 
colonel Yakima AES MI6 nuclear NSA Cocaine Columbia plastique LICOZY
                          ( http://cscott.net/ )


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Joshua T. Corbin @ 2005-04-20 21:58 UTC (permalink / raw)
  To: git
In-Reply-To: <4266CED2.60806@timesys.com>

On 20 April 2005 17:51, Mike Taht wrote:
> I keep thinking perversely that we need something as obtuse as possible
> in the unix tradition, but easy to type... git requires that the fingers
> move off the home row...
>
> how about "asdf" or "jkl"?  :)
>
> cg is singularly uncomfortable to type. I think that's why it isn't
> commonly used.....
Hmm...got to disagree, cg is perfectly comfortable to type here on my dvorak, 
whilst asdf and jkl are uncomfortable deviations across the board ;-)

-- 
Regards,
Joshua T. Corbin <jcorbin@wunjo.org>


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Randy.Dunlap @ 2005-04-20 21:54 UTC (permalink / raw)
  To: Petr Baudis; +Cc: greg, git
In-Reply-To: <20050420215117.GJ19112@pasky.ji.cz>

On Wed, 20 Apr 2005 23:51:18 +0200 Petr Baudis wrote:

| Dear diary, on Wed, Apr 20, 2005 at 11:19:19PM CEST, I got a letter
| where Greg KH <greg@kroah.com> told me that...
| > On Wed, Apr 20, 2005 at 10:56:33PM +0200, Petr Baudis wrote:
| > >   The short command version will change from 'git' to 'cg', which should
| > > be shorter to type and free the 'git' command for possible eventual
| > > entry gate for the git commands (so that they are more
| > > namespace-friendly, and it might make most sense anyway if we get fully
| > > libgitized; but this is more of long-term ideas).
| > 
| > Hm, but there already is a 'cg' program out there:
| > 	http://uzix.org/cgvg.html
| > I use it every day :(
| > 
| > How about 'cog' instead?
| 
| Grm. Cg is also name of some scary NVidia thing, and cog is GNOME
| Configurator. CGT are Chimera Grid Tools, but I think we can clash
| with those - at least *I* wouldn't mind. ;-)

I'd rather see you go back to 'tig'...

is there a tig out there?

---
~Randy


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Mike Taht @ 2005-04-20 21:51 UTC (permalink / raw)
  Cc: git
In-Reply-To: <20050420211919.GA20129@kroah.com>


I keep thinking perversely that we need something as obtuse as possible
in the unix tradition, but easy to type... git requires that the fingers
move off the home row...

how about "asdf" or "jkl"?  :)

cg is singularly uncomfortable to type. I think that's why it isn't 
commonly used.....

Greg KH wrote:
> On Wed, Apr 20, 2005 at 10:56:33PM +0200, Petr Baudis wrote:
> 
>>  The short command version will change from 'git' to 'cg', which should
>>be shorter to type and free the 'git' command for possible eventual
>>entry gate for the git commands (so that they are more
>>namespace-friendly, and it might make most sense anyway if we get fully
>>libgitized; but this is more of long-term ideas).
> 
> 
> Hm, but there already is a 'cg' program out there:
> 	http://uzix.org/cgvg.html
> I use it every day :(
> 
> How about 'cog' instead?
> 
> Or I can just rename my local copy of cg and try to retrain my
> fingers...
> 
> thanks,
> 
> greg k-h


-- 

Mike Taht


   "New systems generate new problems."


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Petr Baudis @ 2005-04-20 21:51 UTC (permalink / raw)
  To: Greg KH; +Cc: git
In-Reply-To: <20050420211919.GA20129@kroah.com>

Dear diary, on Wed, Apr 20, 2005 at 11:19:19PM CEST, I got a letter
where Greg KH <greg@kroah.com> told me that...
> On Wed, Apr 20, 2005 at 10:56:33PM +0200, Petr Baudis wrote:
> >   The short command version will change from 'git' to 'cg', which should
> > be shorter to type and free the 'git' command for possible eventual
> > entry gate for the git commands (so that they are more
> > namespace-friendly, and it might make most sense anyway if we get fully
> > libgitized; but this is more of long-term ideas).
> 
> Hm, but there already is a 'cg' program out there:
> 	http://uzix.org/cgvg.html
> I use it every day :(
> 
> How about 'cog' instead?

Grm. Cg is also name of some scary NVidia thing, and cog is GNOME
Configurator. CGT are Chimera Grid Tools, but I think we can clash
with those - at least *I* wouldn't mind. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Git hangs while executing commit-tree
From: Linus Torvalds @ 2005-04-20 21:48 UTC (permalink / raw)
  To: Rhys Hardwick; +Cc: git
In-Reply-To: <200504202228.35652.rhys@rhyshardwick.co.uk>



On Wed, 20 Apr 2005, Rhys Hardwick wrote:
>
> rhys@metatron:~/repo/tmp.repo$ commit-tree  c80156fafbac377ab35beb076090c8320f874f91
> Committing initial tree c80156fafbac377ab35beb076090c8320f874f91
>  
> At this point, the command seems to be just waiting.

That's _exactly_ what it's doing. It's waiting for you to write a commit 
message.

Something like

	This is my initial commit of Hello World!
	^D

will make it happy.

Alternatively, you can certainly just write your message beforehand with 
an editor and just pipe it into commit-tree.

			Linus


* Re: Git hangs while executing commit-tree
From: Rhys Hardwick @ 2005-04-20 21:38 UTC (permalink / raw)
  To: git

Cheers for the help!

Rhys

On Wednesday 20 Apr 2005 22:35, Petr Baudis wrote:
> Dear diary, on Wed, Apr 20, 2005 at 11:28:35PM CEST, I got a letter
> where Rhys Hardwick <rhys@rhyshardwick.co.uk> told me that...
>
> > Hey,
>
> Hi,
>
> > rhys@metatron:~/repo/tmp.repo$ commit-tree
> > c80156fafbac377ab35beb076090c8320f874f91
> > Committing initial tree c80156fafbac377ab35beb076090c8320f874f91
> >
> >
> >
> > At this point, the command seems to be just waiting.  I have had it
> > waiting for around 2 hours now!  I have tried removing ~/repo/tmp.repo
> > and starting over, with exactly the same results.
>
> just type in your commit message and press ctrl-D now. ;-)
>
> If you can't get along by peeking at the source when you get stuck, etc,
> you might prefer using git-pasky (http://pasky.or.cz/~pasky/dev/git/),
> which will guide you nicely.



* Re: Git hangs while executing commit-tree
From: Petr Baudis @ 2005-04-20 21:35 UTC (permalink / raw)
  To: Rhys Hardwick; +Cc: git
In-Reply-To: <200504202228.35652.rhys@rhyshardwick.co.uk>

Dear diary, on Wed, Apr 20, 2005 at 11:28:35PM CEST, I got a letter
where Rhys Hardwick <rhys@rhyshardwick.co.uk> told me that...
> Hey,

Hi,

> rhys@metatron:~/repo/tmp.repo$ commit-tree 
> c80156fafbac377ab35beb076090c8320f874f91
> Committing initial tree c80156fafbac377ab35beb076090c8320f874f91
>  
> 
> 
> At this point, the command seems to be just waiting.  I have had it waiting 
> for around 2 hours now!  I have tried removing ~/repo/tmp.repo and starting 
> over, with exactly the same results.

just type in your commit message and press ctrl-D now. ;-)

If you can't get along by peeking at the source when you get stuck, etc,
you might prefer using git-pasky (http://pasky.or.cz/~pasky/dev/git/),
which will guide you nicely.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [ANNOUNCEMENT] /Arch/ embraces `git'
From: Petr Baudis @ 2005-04-20 21:31 UTC (permalink / raw)
  To: Tom Lord; +Cc: gnu-arch-users, gnu-arch-dev, git, talli, torvalds
In-Reply-To: <200504201000.DAA04988@emf.net>

Dear diary, on Wed, Apr 20, 2005 at 12:00:36PM CEST, I got a letter
where Tom Lord <lord@emf.net> told me that...
> From the /Arch/ perspective: `git' technology will form the
> basis of a new archive/revlib/cache format and the basis
> of new network transports.

I think one thing git's objects database is not very well suited for is
network transport. You want to have something smart doing the
transports, comparing trees so that it can do some delta compression;
that could probably reduce the amount of data needed to be sent
significantly.

> From the `git' perspective, /Arch/ will replace the lame "directory
> cache" component of `git' with a proper revision control system.

I'm not sure if you have fully grasped git's philosophy yet. The
"directory cache" component is not by itself any revision control system
- it is merely a staging area for any revision system on top of it (IOW:
subordinate, not competitor).

> I started here:
> 
>    http://www.seyza.com/=clients/linus/tree/index.html
> 
> and for those interested in `git'-theory, a good place to start is
> 
>    http://www.seyza.com/=clients/linus/tree/src/liblob/index.html

These pages are surely very nice, unfortunately I have to enjoy them
only from the "HTML source" view. The HTML seems completely broken,
containing unterminated comments like "<!-- BEGIN  the main body>". :-(

You didn't go into the surely interesting details of what you will be
fixing regarding ancestry graphs.

Also, I have some concerns about your naming scheme. First, why do you
include the size in the filename? Second, with ..../..../ you are
_seriously_ worse off than with ../. The latter will put 1/256 of project
files into each directory, whereas with the former you will have
1/4294967296 of project files per directory. I think the point of
directory is that it is a container grouping certain files in a certain
way; in the objects database it is done purely for performance (and
compatibility, to a degree) reasons, but your way it will have worse
performance characteristics at least until the project accumulates
4294967296 files in the database.
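
The two figures follow directly from the prefix lengths, reading each "."
as one hex digit. A small sketch of the arithmetic, where the one-million
object count is a made-up number for illustration:

```python
def fanout(hex_digits_per_level, levels):
    """Number of leaf directories for a given hex-prefix layout."""
    return (16 ** hex_digits_per_level) ** levels

two_hex = fanout(2, 1)    # the "../" layout: one level, two hex digits
four_four = fanout(4, 2)  # the "..../..../" layout: two levels, four digits each
files = 1000000           # hypothetical number of objects in the database

print(two_hex, four_four)
print(files / two_hex, files / four_four)
```

With a million objects the two-digit layout averages a few thousand files
per directory, while the deeper layout leaves nearly every directory holding
at most one file.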

Kind regards,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Git hangs while executing commit-tree
From: Rhys Hardwick @ 2005-04-20 21:28 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2380 bytes --]

Hey,

The following is a copy of the terminal session in question:

rhys@metatron:~/repo/tmp.repo$ ls
rhys@metatron:~/repo/tmp.repo$ init-db
defaulting to local storage area
rhys@metatron:~/repo/tmp.repo$ ls -l .git
total 4
drwxr-xr-x  258 rhys rhys 4096 2005-04-20 20:52 objects
rhys@metatron:~/repo/tmp.repo$ ls .git/objects/
00  0d  1a  27  34  41  4e  5b  68  75  82  8f  9c  a9  b6  c3  d0  dd  ea  f7
01  0e  1b  28  35  42  4f  5c  69  76  83  90  9d  aa  b7  c4  d1  de  eb  f8
02  0f  1c  29  36  43  50  5d  6a  77  84  91  9e  ab  b8  c5  d2  df  ec  f9
03  10  1d  2a  37  44  51  5e  6b  78  85  92  9f  ac  b9  c6  d3  e0  ed  fa
04  11  1e  2b  38  45  52  5f  6c  79  86  93  a0  ad  ba  c7  d4  e1  ee  fb
05  12  1f  2c  39  46  53  60  6d  7a  87  94  a1  ae  bb  c8  d5  e2  ef  fc
06  13  20  2d  3a  47  54  61  6e  7b  88  95  a2  af  bc  c9  d6  e3  f0  fd
07  14  21  2e  3b  48  55  62  6f  7c  89  96  a3  b0  bd  ca  d7  e4  f1  fe
08  15  22  2f  3c  49  56  63  70  7d  8a  97  a4  b1  be  cb  d8  e5  f2  ff
09  16  23  30  3d  4a  57  64  71  7e  8b  98  a5  b2  bf  cc  d9  e6  f3
0a  17  24  31  3e  4b  58  65  72  7f  8c  99  a6  b3  c0  cd  da  e7  f4
0b  18  25  32  3f  4c  59  66  73  80  8d  9a  a7  b4  c1  ce  db  e8  f5
0c  19  26  33  40  4d  5a  67  74  81  8e  9b  a8  b5  c2  cf  dc  e9  f6
rhys@metatron:~/repo/tmp.repo$ find . -type f
rhys@metatron:~/repo/tmp.repo$ mkdir src
rhys@metatron:~/repo/tmp.repo$ pico src/hello.c
rhys@metatron:~/repo/tmp.repo$ pico Makefile
rhys@metatron:~/repo/tmp.repo$ update-cache -add Makefile src/hello.c
fatal: unknown option -add
rhys@metatron:~/repo/tmp.repo$ update-cache --add Makefile src/hello.c
rhys@metatron:~/repo/tmp.repo$ write-tree
c80156fafbac377ab35beb076090c8320f874f91
rhys@metatron:~/repo/tmp.repo$ commit-tree 
c80156fafbac377ab35beb076090c8320f874f91
Committing initial tree c80156fafbac377ab35beb076090c8320f874f91
 


At this point, the command seems to be just waiting.  I have had it waiting 
for around 2 hours now!  I have tried removing ~/repo/tmp.repo and starting 
over, with exactly the same results.

I was testing git by following the tutorial posted by Tony Luck on this list.  
I updated and built the latest version of git, using git, at around 2000 GMT 
today.  I have attached the Makefile and hello.c if anyone finds them useful.

Thanks for any help,

Rhys

[-- Attachment #2: Makefile --]
[-- Type: text/x-makefile, Size: 48 bytes --]

hello: src/hello.c
	cc -o hello -O src/hello.c


[-- Attachment #3: hello.c --]
[-- Type: text/x-csrc, Size: 59 bytes --]

#include <stdio.h>

main()
{
	printf("Hello, world!\n");
}


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Greg KH @ 2005-04-20 21:19 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050420205633.GC19112@pasky.ji.cz>

On Wed, Apr 20, 2005 at 10:56:33PM +0200, Petr Baudis wrote:
>   The short command version will change from 'git' to 'cg', which should
> be shorter to type and free the 'git' command for possible eventual
> entry gate for the git commands (so that they are more
> namespace-friendly, and it might make most sense anyway if we get fully
> libgitized; but this is more of long-term ideas).

Hm, but there already is a 'cg' program out there:
	http://uzix.org/cgvg.html
I use it every day :(

How about 'cog' instead?

Or I can just rename my local copy of cg and try to retrain my
fingers...

thanks,

greg k-h


* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Petr Baudis @ 2005-04-20 21:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Schlemmer, David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050420203235.GA13270@elte.hu>

Dear diary, on Wed, Apr 20, 2005 at 10:32:35PM CEST, I got a letter
where Ingo Molnar <mingo@elte.hu> told me that...
> 
> * Petr Baudis <pasky@ucw.cz> wrote:
> 
> > > yet another thing: what is the canonical 'pasky way' of simply nuking 
> > > the current files and checking out the latest tree (according to 
> > > .git/HEAD). Right now i'm using a script to:
> > > 
> > >   read-tree $(tree-id $(cat .git/HEAD))
> > >   checkout-cache -a
> > > 
> > > (i first do an 'rm -f *' in the working directory)
> > > 
> > > i guess there's an existing command for this already?
> > 
> > git cancel
> 
> hm, that's a pretty unintuitive name though. How about making it 'git 
> checkout' and providing a 'git checkout -f' option to force the 
> checkout? (or something like this)

Because it does not really check out. Ok, it does, but that's only a small
part of it. It just cancels whatever local changes you are making in the
tree and brings it to a consistent state. When you have a merge in progress
and after you see the sheer number of conflicts you decide to get your
hands off, you type just git cancel. Doing basically anything with your
tree (not only local changes checkout would fix, but also various git
operations, including git add/rm and git seek) can be easily fixed by
git cancel.

Dear diary, on Wed, Apr 20, 2005 at 10:45:51PM CEST, I got a letter
where Ingo Molnar <mingo@elte.hu> told me that...
> 
> * Petr Baudis <pasky@ucw.cz> wrote:
> 
> > Dear diary, on Wed, Apr 20, 2005 at 09:01:57AM CEST, I got a letter
> > where Ingo Molnar <mingo@elte.hu> told me that...
> > >  [...]
> > >  fatal: unable to execute 'gitmerge-file.sh'
> > >  fatal: merge program failed
> > 
> > Pure stupidity of mine, I forgot to add gitmerge-file.sh to the list of
> > scripts which get installed.
> 
> another thing is this annoying message:
> 
>  rsync: link_stat "/linux/kernel/people/torvalds/git.git/tags" (in pub) 
>  failed: No such file or directory (2)
>  rsync error: some files could not be transferred (code 23) at 
>  main.c(812)
>  client: nothing to do: perhaps you need to specify some filenames or 
>  the --recursive option?
> 
> you said before that it's "harmless", but it's annoying nevertheless as 
> one doesn't know for sure whether the pull went fine.

Already fixed. (Well, "fixed"... sent to /dev/null. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Petr Baudis @ 2005-04-20 21:03 UTC (permalink / raw)
  To: git
In-Reply-To: <20050420205633.GC19112@pasky.ji.cz>

Dear diary, on Wed, Apr 20, 2005 at 10:56:33PM CEST, I got a letter
where Petr Baudis <pasky@ucw.cz> told me that...
>   cg pull will now always only pull, never merge.
> 
>   cg update will do pull + merge.

Note that what you will probably do _most_ by far is cg update.
You generally do cg pull only when you want to make sure you have the
latest and greatest when doing some cg diff or whatever, or on your
notebook when getting on an airplane. And you do direct cg merge generally
only on the airplane.

I also forgot one last usage change:

  cg fork BNAME BRANCH_DIR [COMMIT_ID]
  ->
  cg fork BRANCH_DIR [BNAME] [COMMIT_ID]

This will bring its usage in sync with both cg export and cg tag.
The branch name will also default to the last element in the
BRANCH_DIR path (having to write the same thing twice on a single
line annoyed me a lot).
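
A minimal sketch of that default, assuming it amounts to taking the last
path component of BRANCH_DIR. The helper name default_bname and the
example path are made up for illustration and are not part of Cogito:

```shell
# Hypothetical helper: derive a default branch name from BRANCH_DIR.
default_bname() {
  # basename keeps only the last path component and ignores
  # any trailing slash on the directory name.
  basename -- "$1"
}

# Example: the branch name falls out of the directory path.
default_bname ../branches/testing
```

With this, "cg fork ../branches/testing" would need no explicit BNAME.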

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* [ANNOUNCE] git-pasky-0.6.2 && heads-up on upcoming changes
From: Petr Baudis @ 2005-04-20 20:56 UTC (permalink / raw)
  To: git

  Hello,

  so I've "released" git-pasky-0.6.2 (my SCMish layer on top of Linus
Torvalds' git tree history storage system), find it at the usual

	http://pasky.or.cz/~pasky/dev/git/

  git-pasky-0.6 has a couple of big changes; mainly enhanced git diff,
git patch (to be renamed to cg mkpatch), enhanced git pull and
completely reworked git merge - it now uses the git-core facilities for
merging, and does the merges in-tree. Plenty of smaller stuff, some
bugfixes and some new bugs, and of course regular merging with Linus.

  The most important change for current users is the objects database
SHA1 keys change and (comparatively minor) directory cache format
change. This makes "pulling up" from older revisions rather difficult.
Linus' instructions _should_ work for you too, basically (you should
replace cat .git/HEAD with cat .git/heads/* or equivalent - note that
convert-tree does not accept multiple arguments so you need to invoke it
multiple times), but I didn't test it well (I did it the lowlevel way
completely since I needed to simultaneously merge with Linus).
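
The "invoke it multiple times" part could look roughly like this. It is
only a sketch: for_each_head is a made-up helper name, it assumes one
commit ID per file under .git/heads/, and what to do with each converted
ID afterwards is left to Linus' instructions:

```shell
# Hypothetical helper: run a one-argument command once per head,
# since convert-tree does not accept multiple arguments.
for_each_head() {
  cmd=$1
  for head in .git/heads/*; do
    [ -f "$head" ] || continue    # the glob matched nothing
    "$cmd" "$(cat "$head")" || return 1
  done
}

# Usage, from the top of the tree (assumes convert-tree is in PATH):
#   for_each_head convert-tree
```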

  But if you can't be bothered by this or fear touching stuff like that,
and you do not have any local commits in your tree (it would be pretty
strange if you had and still feared), just fetch the tarball (which is
preferable to git init for me since it eats up a _significantly_
smaller portion of my bandwidth).

  I had to release git-pasky-0.6.1 since Linus changed the directory
cache format while I was releasing git-pasky-0.6. And git-pasky-0.6.2
fixes the gitmerge-file.sh script missing from the list of scripts to
install.


  So, now for the heads-up part. We will undergo at least two major
changes now. First, I'll probably make git-pasky use the directory
cache for the add/rm queues now that we have diff-cache.

  Second, I've decided to straighten up the naming now that we still
have a chance. There will be no git-pasky-0.7, sorry. You'll get
cogito-0.7 instead. I've decided for it since after some consideration
having it named differently is the right thing (tm).

  The short command version will change from 'git' to 'cg', which should
be shorter to type and free the 'git' command for possible eventual
entry gate for the git commands (so that they are more
namespace-friendly, and it might make most sense anyway if we get fully
libgitized; but this is more of long-term ideas).

  The usage changes:

  cg patch -> cg mkpatch	('patch' is the program which _applies_ it)
  cg apply -> cg patch		(analogically to diff | patch)

  cg pull will now always only pull, never merge.

  cg update will do pull + merge.

  cg track will either just set the default for cg update if you pass it
no parameters, or disappear altogether; I think it could default to the
'origin' branch (or 'master' branch for non-master branches if no 'origin'
branch is around), and I'd rather set up some "cg admin" where you could
set all this stuff - from this to e.g. the committer details [*1*]. You
likely don't need to change the default every day.

  I must say that I'm pretty happy with the Cogito's command set
otherwise, though. I actually think it has now (almost?) all commands
it needs, and it is not too likely that (many) more will be added -
simple means easy to use, which is Cogito's goal. Compare with
the command set of GNU arch clones. ;-)


  [*1*] The committer details in .git would override the environment
variables to discourage people from trying to alter them based on
whatever, since that's not what they are supposed to do. They can always
just change the .git stuff if they _really_ need to.


  Comments welcomed, as well as new ideas. Persuading me to change what
I sketched here will need some good arguments, though. ;-)

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Ingo Molnar @ 2005-04-20 20:45 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Martin Schlemmer, David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050420200504.GB19112@pasky.ji.cz>


* Petr Baudis <pasky@ucw.cz> wrote:

> Dear diary, on Wed, Apr 20, 2005 at 09:01:57AM CEST, I got a letter
> where Ingo Molnar <mingo@elte.hu> told me that...
> >  [...]
> >  fatal: unable to execute 'gitmerge-file.sh'
> >  fatal: merge program failed
> 
> Pure stupidity of mine, I forgot to add gitmerge-file.sh to the list of
> scripts which get installed.

another thing is this annoying message:

 rsync: link_stat "/linux/kernel/people/torvalds/git.git/tags" (in pub) 
 failed: No such file or directory (2)
 rsync error: some files could not be transferred (code 23) at 
 main.c(812)
 client: nothing to do: perhaps you need to specify some filenames or 
 the --recursive option?

you said before that it's "harmless", but it's annoying nevertheless as 
one doesn't know for sure whether the pull went fine.

	Ingo


* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Ingo Molnar @ 2005-04-20 20:32 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Martin Schlemmer, David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050420200504.GB19112@pasky.ji.cz>


* Petr Baudis <pasky@ucw.cz> wrote:

> > yet another thing: what is the canonical 'pasky way' of simply nuking 
> > the current files and checking out the latest tree (according to 
> > .git/HEAD). Right now i'm using a script to:
> > 
> >   read-tree $(tree-id $(cat .git/HEAD))
> >   checkout-cache -a
> > 
> > (i first do an 'rm -f *' in the working directory)
> > 
> > i guess there's an existing command for this already?
> 
> git cancel

hm, that's a pretty unintuitive name though. How about making it 'git 
checkout' and providing a 'git checkout -f' option to force the 
checkout? (or something like this)

	Ingo


* Re: [PATCH] Some documentation...
From: Linus Torvalds @ 2005-04-20 20:15 UTC (permalink / raw)
  To: David Greaves; +Cc: C. Scott Ananian, git
In-Reply-To: <426692D1.20304@dgreaves.com>



On Wed, 20 Apr 2005, David Greaves wrote:
> 
> So maybe it's left as documented behaviour and higher level tools must 
> manage the data they feed to it...

That was the plan.

I agree that "find . -type f | xargs update-cache --add --" in _theory_ is
a nice thing to do. But in practice, you want to make sure that find 
doesn't include the ".git" directory and that we always use the canonical 
names for all files etc etc.

I could do it in the low-level tools (ie do pathname cleanup there), and
indeed I did exactly that in the original code sequence. However, it very
quickly became obvious that the low-level code really doesn't want to
care, and that it's a lot easier to just do it at a higher level when 
necessary.

For example, if you have to add a sed-script or something that just 
removes '^./' and "^.git/", then that's trivial to do, and it leaves the 
core tools with a very clear agenda in life.
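
As a rough sketch of such a higher-level cleanup (not anything that
exists in the tree): the name clean_paths is made up, "removes ^.git/" is
read here as dropping everything under .git, and file names are assumed
to contain no newlines:

```shell
# Hypothetical filter: canonicalize "find ." output before it reaches
# update-cache. Reads path names on stdin, one per line.
clean_paths() {
  # First strip the leading "./" that find prepends, then drop
  # every path that lives under the .git directory.
  sed -e 's|^\./||' -e '/^\.git\//d'
}

# Usage, following the pipeline mentioned above:
#   find . -type f | clean_paths | xargs update-cache --add --
```

A file such as .gitignore is kept, since only paths below ".git/" match.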

		Linus


* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Petr Baudis @ 2005-04-20 20:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Martin Schlemmer, David Greaves, dwheeler, Daniel Barkalow, git
In-Reply-To: <20050420070157.GA12584@elte.hu>

Dear diary, on Wed, Apr 20, 2005 at 09:01:57AM CEST, I got a letter
where Ingo Molnar <mingo@elte.hu> told me that...
>  [...]
>  fatal: unable to execute 'gitmerge-file.sh'
>  fatal: merge program failed

Pure stupidity of mine, I forgot to add gitmerge-file.sh to the list of
scripts which get installed.

> another thing: it's confusing that during 'git pull', the rsync output 
> is not visible. Especially during large rsyncs, it would be nice to see 
> some progress. So i usually use a raw rsync not 'git pull', due to this.

Fixed. For further reference, you can also set RSYNC_FLAGS and put
whatever pleases you there.

> yet another thing: what is the canonical 'pasky way' of simply nuking 
> the current files and checking out the latest tree (according to 
> .git/HEAD). Right now i'm using a script to:
> 
>   read-tree $(tree-id $(cat .git/HEAD))
>   checkout-cache -a
> 
> (i first do an 'rm -f *' in the working directory)
> 
> i guess there's an existing command for this already?

git cancel

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


