Git development


* Re: How to get bash to shut up about SIGPIPE?
From: Linus Torvalds @ 2005-05-04  2:50 UTC (permalink / raw)
  To: David A. Wheeler; +Cc: Paul Jackson, Git Mailing List
In-Reply-To: <427833AE.6030505@dwheeler.com>

On Tue, 3 May 2005, David A. Wheeler wrote:
> 
> I wonder, does a top-level trap work? E.g.:
>   trap "" SIGPIPE

No.

But putting the traps _inside_ the loops seems to help. So something like 
the appended patch at least makes it somewhat useful.

And yes, you need them at both levels, it appears. Or maybe that just 
changes the timing enough. Whatever.

			Linus
----
Index: cg-log
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-log  (mode:100755 sha1:aa2abf370753117a350818dbc91991b14d30ec6b)
+++ uncommitted/cg-log  (mode:100755)
@@ -47,10 +47,12 @@
 fi

 $revls | $revsort | while read time commit parents; do
+	trap "exit 1" SIGPIPE
 	[ "$revfmt" = "git-rev-list" ] && commit="$time"
 	echo $colheader""commit ${commit%:*} $coldefault;
 	git-cat-file commit $commit | \
 		while read key rest; do
+			trap "exit 1" SIGPIPE
 			case "$key" in
 			"author"|"committer")
 				if [ "$key" = "author" ]; then


* Re: How to get bash to shut up about SIGPIPE?
From: David A. Wheeler @ 2005-05-04  2:30 UTC (permalink / raw)
  To: Paul Jackson; +Cc: git
In-Reply-To: <20050502091027.6753998e.pj@sgi.com>

I reported:
>>One approach is to install a trap for SIGPIPE in
>>non-terminating command in a pipeline where the
>>later items might not process all the data, e.g.:
>>   (trap {} SIGPIPE; find .) | head -1

Paul Jackson wrote:
> Both the versions of bash that I looked at (2.05 and 3.0) _still_
> complain even if SIGPIPE is trapped - they just complain with
> a more terse message, unless DONT_REPORT_SIGPIPE is not defined.
...
> What bash do you have that this trap silences?

Actually, I have a working bash, so I can't test any of
these work-arounds.  I was merely foolish enough to quote
the Debian discussion, where someone reported that as a
work-around. I had hopes that it would help.
That'll teach me to pay attention to documentation :-).

I wonder, does a top-level trap work? E.g.:
  trap "" SIGPIPE
  ...

Anyway, good luck to those with broken bashes...

--- David A. Wheeler


* Re: Mercurial 0.4b vs git patchbomb benchmark (/usr/bin/env again)
From: David A. Wheeler @ 2005-05-04  2:10 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Valdis.Kletnieks, Andrea Arcangeli, Matt Mackall, Linus Torvalds,
	linux-kernel, git
In-Reply-To: <4277B778.5020206@tmr.com>

Valdis.Kletnieks@vt.edu wrote:
>> Most likely, his python lives elsewhere than /usr/bin, and the 'env' call
>> results in causing a walk across $PATH to find it....

Bill Davidsen wrote:
> Assuming that he has env in a standard place... I hope this isn't going 
> to start some rash of efforts to make packages run on non-standard 
> toolchains, which add requirements for one tool to get around 
> misplacement of another.

The #!/usr/bin/env prefix is, in my opinion, a very good idea.
There are a very few systems where env isn't in /usr/bin, but they
were extremely rare years ago & are essentially extinct now.
Basically, it's a 99% solution; getting the last 1% is really painful,
but since getting the 99% is easy, let's do it and be happy.

There are LOTS of systems where Python, bash, etc., do NOT
live in whatever place you think of as "standard".
I routinely use an OpenBSD 3.1 system; there is no /usr/bin/bash,
but there _IS_ a /usr/local/bin/bash (in my PATH) and a /usr/bin/env.
So this /usr/bin/env stuff REALLY is useful on a lot of systems, such
as OpenBSD.  It's critical to me, at least!

This is actually really useful on ANY system, though.
Even if some interpreter IS where you think it should be,
that is NOT necessarily the interpreter you want to use.
Using "/usr/bin/env" lets you use PATH
to override things, so you don't HAVE to use the interpreter
in some fixed location.  That's REALLY handy for testing... I
can download the whizbang Python 9.8.2, set it on the path,
and see if everything works as expected.  It's also nice
if someone insists on never upgrading a package; you can
install an interpreter "locally".  Yes, you can patch all the
files up, but resetting a PATH is _much_ easier.

--- David A. Wheeler
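
The PATH walk that makes "#!/usr/bin/env" useful can be sketched in C.
This is only an illustration of the lookup, not env's actual source;
the helper name find_in_path is hypothetical, and the sketch skips
empty PATH elements and does not check that a match is a regular file.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: walk a colon-separated PATH-style list and write
 * the first executable match for `name` into `out`.
 * Returns 0 if a match was found, -1 otherwise.
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
	while (*path) {
		size_t dirlen = strcspn(path, ":");

		/* empty elements are skipped in this sketch */
		if (dirlen > 0) {
			int n = snprintf(out, outlen, "%.*s/%s",
					 (int)dirlen, path, name);

			if (n > 0 && (size_t)n < outlen &&
			    access(out, X_OK) == 0)
				return 0;
		}
		path += dirlen;
		if (*path == ':')
			path++;
	}
	return -1;
}
```

Putting a locally installed interpreter's directory first in the list
makes it the one that is found, which is the override described above.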


* Sym-links, b/c-special files, pipes, ... Scope Creep
From: Brian O'Mahoney @ 2005-05-04  0:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v8y2wszdy.fsf@assigned-by-dhcp.cox.net>

Caution, let us all carefully understand the Source-Code/
Configuration Management issue.

I for one will be very happy if we get a really good distributed
SCM out of this.

Brian


* Re: Kernel autobuild now uses Git
From: Darren Williams @ 2005-05-04  0:20 UTC (permalink / raw)
  To: LKML; +Cc: Ia64 Linux, Git Mailing List
In-Reply-To: <20050503222951.GE26031@cse.unsw.EDU.AU>

Hi Darren

On Wed, 04 May 2005, Darren Williams wrote:

> Hi All
>   Our ia64 autobuild system has been moved from using
> BK to Git. Here we do a nightly pull on Linus's 
> (not so mainline) Git tree and test the ia64 build.
> 
> This may be a benefit to the Git developers to see
> the results of a nightly 'cg-pull'.
> 
> Thanks to all for the effort, the conversion from BK
> to Git was relatively painless.
> 
>  - dsw
> 

Hmmm, how about a URL
http://www.gelato.unsw.edu.au/kerncomp/ 

--------------------------------------------------
Darren Williams <dsw AT gelato.unsw.edu.au>
Gelato@UNSW <www.gelato.unsw.edu.au>
--------------------------------------------------


* [PATCH] Making cg-seek more robust
From: Pavel Roskin @ 2005-05-04  0:08 UTC (permalink / raw)
  To: git

Hello!

This patch does the following:

If .git/HEAD is not a link, it's detected early, so that basename is not
called without arguments.

.git/HEAD is not deleted until validity of the branch is verified.

.git/blocked is deleted by "rm -f" to suppress unneeded error message if
it's missing.

Signed-off-by: Pavel Roskin <proski@gnu.org>

--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-seek  (mode:100755 sha1:111f7842e5ec20a1e0714e162ca221da5e06ce33)
+++ uncommitted/cg-seek  (mode:100755)
@@ -29,18 +29,23 @@
 if [ -s .git/blocked ]; then
 	branch=$(grep '^seeked from ' .git/blocked | sed 's/^seeked from //')
 else
-	branch=$(basename $(readlink .git/HEAD)) || die "HEAD is not on branch"
+	tmp=$(readlink .git/HEAD)
+	[ "$tmp" ] || die "HEAD is not on branch"
+	branch=$(basename "$tmp")
 fi
 
 curcommit=$(commit-id)
 
-rm .git/HEAD
 if [ ! "$dstcommit" ] || [ "$dstcommit" = "$branch" ]; then
+	rm .git/HEAD
 	ln -s "refs/heads/$branch" .git/HEAD
-	rm .git/blocked
+	rm -f .git/blocked
 	dstcommit=$(commit-id)
 else
-	echo $(commit-id "$dstcommit") >.git/HEAD
+	tmp=$(commit-id "$dstcommit")
+	[ "$tmp" ] || die "branch $dstcommit not found"
+	rm .git/HEAD
+	echo "$tmp" >.git/HEAD
 	[ -s .git/blocked ] || echo "seeked from $branch" >.git/blocked
 fi
 


-- 
Regards,
Pavel Roskin



* Re: cogito "origin" vs. HEAD
From: Benjamin Herrenschmidt @ 2005-05-03 23:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Git Mailing List
In-Reply-To: <20050503094723.GA22436@pasky.ji.cz>

On Tue, 2005-05-03 at 11:47 +0200, Petr Baudis wrote:
> Dear diary, on Tue, May 03, 2005 at 09:13:28AM CEST, I got a letter
> where Benjamin Herrenschmidt <benh@kernel.crashing.org> told me that...
> > > when accessing the remote repository, Cogito always looks for remote
> > > refs/heads/master first - if that one isn't there, it takes HEAD, but
> > > there is no correlation between the local and remote branch name. If you
> > > want to fetch a different branch from the remote repository, use the
> > > fragment identifier (see cg-help cg-branch-add).
> > 
> > Ok, that I'm getting. So then, what happens to my local
> > refs/heads/<branchname> and refs/heads/master/ ? I'm still a bit
> > confused by the whole branch mechanism... It's my understanding that when
> > I cg-init, it creates both "master" (a head without matching branch)
> > and "origin" (a branch  + a head) both having the same sha1. It also
> > checks out the tree.
> > 
> > Now, when I cg-update origin, what happens exactly ? I mean, I know it
> > pulls all objects, then gets the master from the remote pointed by the
> > origin branch, but then, I suppose it updates both my local "origin" and
> > my local "master" pointer, right ? I mean, they are always in sync ? Or
> > is this related to what branch my current checkout is tracking ?
> 
> They are in sync as long as you update only from that given branch.
> At the moment you do a local commit, they get out of sync, at least
> until your master branch is merged to the origin branch on the other
> side. Every cg-update will then generate a merging commit, so it will
> look like this:
> > .../...

Thanks for that detailed explanation !

Ben.




* Re: git and symlinks as tracked content
From: Junio C Hamano @ 2005-05-03 23:42 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linus Torvalds, Andreas Gal, Kay Sievers, git
In-Reply-To: <427806CA.6030302@zytor.com>

>>>>> "HPA" == H Peter Anvin <hpa@zytor.com> writes:
HPA> Junio C Hamano wrote:

HPA> Owner and permissions are part of the tree object, and apply to
HPA> all file types.

>> Huh?  I am confused...  Do you mean tree object should be
>> changed to record these?  That would make the existing in-cache
>> merging of files, which GIT was built for, quite interesting...

HPA> No, the tree object *ALREADY* records these.

As you quoted (and before I uttered my previous confusion I did
look at the code in write-tree.c which I thought to match this
description) ...

HPA> TREE: The next hierarchical object type is the "tree" object.  A tree
HPA> object is a list of permission/name/blob data, sorted by name.  In other
HPA> words the tree object is uniquely determined by the set contents, and so
HPA> two separate but identical trees will always share the exact same
HPA> object.

... it records permission (but not in the 0660 vs 0600 sense ---
it just records executable bit for file blobs and the treeness
by recording S_IFDIR), name and SHA1.  There is no owner or
group information recorded there [*1*].

I am afraid I am missing something in my reading of write-tree.c

Quite confused...

[Footnote]

*1* Nor should there be.  Otherwise comparing two identical
trees representing the same set of files becomes meaningless.

The reason why I placed this information in my hypothetical
representation of device nodes is exactly that.  To record owner
and group information is meaningless and harmful for the purpose
of version controlling the source files but it matters _if_ we
wanted to maintain device nodes in GIT.  Since it matters only
for those things, it would be preferable to have it as part of
the data that describes the object (i.e. device nodes), not part
of the data that contains the object (i.e. tree).  And I thought
GIT tree object is already doing the right thing by not
recording them.


* Re: git and symlinks as tracked content
From: Linus Torvalds @ 2005-05-03 23:42 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, Andreas Gal, Kay Sievers, git
In-Reply-To: <427806CA.6030302@zytor.com>

On Tue, 3 May 2005, H. Peter Anvin wrote:
> 
> No, the tree object *ALREADY* records these.

Not ownership.

Yes, the permissions are there, but if you actually want to track 
ownership (or things like "mtime" etc), you really do have to track it 
outside the tree object.

Also, right now git will actually ignore most of the permission bits too.  
We can change that, and make it a dynamic setting somewhere (some flag in
a ".git/settings" file or something), but it does boil down to the fact
that a software development tree tracker wants different things than
something that tracks system settings.

For example, generating different trees just because different users had
different umask settings clearly didn't work out. Which means that right
now git really only tracks the "owner execute" bit of the permissions, and
always resets the other bits to 0755 or 0644 depending on that _one_ bit.

And similarly, tracking actual uid/gid information would _really_ not work 
for a distributed kernel source management system, so that's not even in 
the tree.

So if you want to track system files, right now "raw git" is _not_ the way 
to do it. You'd want something else. 

Of course, that's actually true largely even of normal /dev contents.  
That's why we've moved towards udev, and having things like device
permissions and ownership not be "filesystem attributes", but really
_rules_ in a udev database. So the fact that git doesn't track them isn't
necessarily a problem for /dev - since modern /dev really wants to track
them at a higher level _anyway_ (and you'd use git to track the _rules_,
not the ownership things themselves).

But if you'd want to track other system directories with git, you'd
probably need to either (a) do serious surgery on git itself, or (probably
preferable) (b) track the extra things you want "manually" using a file
(that is tracked in git) that describes the ownership and permission data.

Whether git is really suitable for tracking non-source projects is
obviously debatable. It's not what it was designed for, and it _may_ be 
able to do so partly just by luck.

			Linus


* Re: Careful object writing..
From: Alex Riesen @ 2005-05-03 23:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505031618360.26698@ppc970.osdl.org>

On 5/4/05, Linus Torvalds <torvalds@osdl.org> wrote:
> > > I also change the permission to 0444 before it gets its final name.
> > Maybe umask it first? Just in case.
> 
> I considered it, but it's not worth it.
> 
> If you don't want somebody else to see your objects, you should just
> disable execute permission on your .git directory. ...

Of course. It leaves even more control to the user. Forget I asked :)


* Re: Careful object writing..
From: Junio C Hamano @ 2005-05-03 23:22 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <81b0412b05050316045fa31c2a@mail.gmail.com>

>>>>> "AR" == Alex Riesen <raa.lkml@gmail.com> writes:

AR> On 5/3/05, Linus Torvalds <torvalds@osdl.org> wrote:
>> I also change the permission to 0444 before it gets its final name.

AR> Maybe umask it first? Just in case.

In general worrying about umask when you see chmod is a good
practice, but it probably is not applicable to this particular
case.  A person with umask 077 should still get 0444 if the
SHA1_FILE_DIRECTORY is shared with other people, and if it is
not shared, his initial git-init-db would have made it 0700, so
it does not matter if the files underneath it have 0444.


* Re: Careful object writing..
From: Linus Torvalds @ 2005-05-03 23:22 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Git Mailing List
In-Reply-To: <81b0412b05050316045fa31c2a@mail.gmail.com>

On Wed, 4 May 2005, Alex Riesen wrote:
>
> On 5/3/05, Linus Torvalds <torvalds@osdl.org> wrote:
> > I also change the permission to 0444 before it gets its final name.
> 
> Maybe umask it first? Just in case.

I considered it, but it's not worth it.

If you don't want somebody else to see your objects, you should just 
disable execute permission on your .git directory. "umask" is actually a 
fairly nasty interface, since it takes effect on create(), and we do want 
to fchmod _after_ the create (some filesystems don't like it when you 
create a non-writable object and then write to it, but more importantly, 
since we use "mkstemp()" for the temp-file handling, we don't even have 
control of the initial umask).

So to make it (0444 & umask) git would actually have to jump through silly
hoops. Without actually buying you anything new, since the access
permissions for git objects really are about the _directory_ anyway.

		Linus
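
The ordering described above (create with mkstemp(), write, then
fchmod() to 0444 before the file gets its final name) can be sketched
as follows. This is an illustration only, not git's code; the helper
name write_readonly_tmp is hypothetical. Note that fchmod() sets the
mode exactly as given, independent of the process umask.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes of `buf` to a fresh mkstemp()
 * file and make it read-only (0444) afterwards.  `tmpl` must be a
 * writable template ending in "XXXXXX"; on success it holds the name
 * of the temp file, which the caller then moves to its final name.
 * Returns 0 on success, -1 on failure (the temp file is removed).
 */
int write_readonly_tmp(char *tmpl, const void *buf, size_t len)
{
	int fd = mkstemp(tmpl);	/* mkstemp() picks the initial mode itself */

	if (fd < 0)
		return -1;
	/* write while the file is still writable, then drop write permission */
	if (write(fd, buf, len) != (ssize_t)len || fchmod(fd, 0444) < 0) {
		close(fd);
		unlink(tmpl);
		return -1;
	}
	return close(fd);
}
```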


* Re: git and symlinks as tracked content
From: H. Peter Anvin @ 2005-05-03 23:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Andreas Gal, Kay Sievers, git
In-Reply-To: <7v1x8nuchr.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
>>>>>>"HPA" == H Peter Anvin <hpa@zytor.com> writes:
> 
> 
> HPA> Owner and permissions are part of the tree object, and apply to all
> HPA> file types.
> 
> Huh?  I am confused...  Do you mean tree object should be
> changed to record these?  That would make the existing in-cache
> merging of files, which GIT was built for, quite interesting...
> 
> Well, doing device nodes _is_ a tangent, so let's drop this
> discussion.
> 

No, the tree object *ALREADY* records these.

BLOB: A "blob" object is nothing but a binary blob of data, and doesn't
refer to anything else.  There is no signature or any other verification
of the data, so while the object is consistent (it _is_ indexed by its
sha1 hash, so the data itself is certainly correct), it has absolutely
no other attributes.  No name associations, no permissions.  It is
purely a blob of data (ie normally "file contents").

TREE: The next hierarchical object type is the "tree" object.  A tree
object is a list of permission/name/blob data, sorted by name.  In other
words the tree object is uniquely determined by the set contents, and so
two separate but identical trees will always share the exact same
object.

	-hpa


* Re: git and symlinks as tracked content
From: Junio C Hamano @ 2005-05-03 23:16 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linus Torvalds, Andreas Gal, Kay Sievers, git
In-Reply-To: <42780185.7010204@zytor.com>

>>>>> "HPA" == H Peter Anvin <hpa@zytor.com> writes:

HPA> Owner and permissions are part of the tree object, and apply to all
HPA> file types.

Huh?  I am confused...  Do you mean tree object should be
changed to record these?  That would make the existing in-cache
merging of files, which GIT was built for, quite interesting...

Well, doing device nodes _is_ a tangent, so let's drop this
discussion.


* Re: Careful object writing..
From: Alex Riesen @ 2005-05-03 23:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505031204030.26698@ppc970.osdl.org>

On 5/3/05, Linus Torvalds <torvalds@osdl.org> wrote:
> I also change the permission to 0444 before it gets its final name.

Maybe umask it first? Just in case.


* Re: [PATCH 0/3] cogito spec file updates
From: Chris Wright @ 2005-05-03 23:01 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Chris Wright, Mark Allen, git
In-Reply-To: <20050503214401.GE15995@pasky.ji.cz>

* Petr Baudis (pasky@ucw.cz) wrote:
> I wouldn't accept this either. If git.spec is already version
> controlled, it should be up-to-date in the version control. Therefore,
> you need to update it at the time of release, not at the time of
> generating the tarball.

*shrug*

Well, it's guaranteed to be one step out of sync, but I'll just keep
doing what I've been doing.  Build rpm after you release, and send
patch ex-post-facto.

thanks,
-chris


* [PATCH] extend cg-export to support release tarball creation
From: Mark Allen @ 2005-05-03 22:57 UTC (permalink / raw)
  To: Petr Baudis, Chris Wright; +Cc: git

Here's a patch for cg-export which supports a "-r RELEASE" option. It will export the
specific ID or HEAD and then create a tar gz and a tar bz2 in your previous working
directory using RELEASE as the tarball name and the directory inside the tarball.

Within cogito itself,

cg-export -r $(cat VERSION) /tmp/foo

seems to work quite nicely to make release tarballs.

Unlike my previous attempt, this one does not make evil tarballs. :-)  One could insert a
script to update the .spec file before the tarfiles get created if desired.

Cheers,

--Mark

Signed-off-by: Mark Allen <mrallen1@yahoo.com>

Index: cg-export
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-export  (mode:100755 sha1:a4497314ab33f6a6387bc278e84f88d4442070ce)
+++ uncommitted/cg-export  (mode:100755 sha1:b0dfdf6049e7a074ef8c8c3f6a43e04a30e5cb89)
@@ -8,10 +8,17 @@

 . cg-Xlib

-destdir=$1
-id=$(tree-id $2)

-([ "$destdir" ] && [ "$id" ]) || die "usage: cg-export DESTDIR [TREE_ID]"
+if [ "$1" = "-r" ]; then
+       release=$2
+       destdir="$3/$2"
+       id=$(tree-id $4)
+else
+       destdir=$1
+       id=$(tree-id $2)
+fi
+
+([ "$destdir" ] && [ "$id" ]) || die "usage: cg-export [-r RELEASE] DESTDIR [TREE_ID]"

 [ -e "$destdir" ] && die "$destdir already exists."

@@ -20,3 +27,14 @@
 git-read-tree $id
 git-checkout-cache "--prefix=$destdir/" -a
 rm $GIT_INDEX_FILE
+
+if [ "$1" = "-r" ]; then
+       origdir=$PWD
+       cd $destdir
+       cd ..
+       tar czf $origdir/$release.tar.gz $release
+       tar cjf $origdir/$release.tar.bz2 $release
+       cd $origdir
+else
+       exit 0
+fi




* Re: git and symlinks as tracked content
From: H. Peter Anvin @ 2005-05-03 22:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Andreas Gal, Kay Sievers, git
In-Reply-To: <7vr7got2tz.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> 
> Introducing "dev" type, as Andreas suggests, is wrong.  This
> should be done in the same way as you suggested for the
> symlink case.  Store a blob object with those chrdev or blkdev
> modes whose contents are of form:
> 
>     major=14
>     minor=4
>     owner=root
>     group=audio
>     perm=0660
> 
> This would impact the diff side least, and for the cache side it
> does not matter in storing and merging.  checkout-cache still
> needs to know about this.
> 

Owner and permissions are part of the tree object, and apply to all file 
types.  The only thing equivalent to file data is the major,minor; 
storing it as a comma-separated decimal ASCII string is probably the 
cleanest, i.e. for your example:

14,4

	-hpa
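
Reading back the comma-separated "major,minor" string proposed above
could look like the sketch below. It assumes the blob holds exactly the
two decimal numbers with no trailing newline; the helper name
parse_devnum is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse "MAJOR,MINOR" (decimal ASCII) into two
 * unsigned values.  Returns 0 on success, -1 if the string does not
 * consist of exactly two comma-separated numbers.
 */
int parse_devnum(const char *s, unsigned *maj, unsigned *mnr)
{
	char trailing;

	/*
	 * Exactly two conversions must succeed; a third one means that
	 * unexpected characters follow the minor number.
	 */
	return sscanf(s, "%u,%u%c", maj, mnr, &trailing) == 2 ? 0 : -1;
}
```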


* Re: git and symlinks as tracked content
From: Junio C Hamano @ 2005-05-03 22:44 UTC (permalink / raw)
  To: Andreas Gal; +Cc: Linus Torvalds, Kay Sievers, git
In-Reply-To: <Pine.LNX.4.58.0505031446220.31626@sam.ics.uci.edu>

>>>>> "AG" == Andreas Gal <gal@uci.edu> writes:

AG> Whether you use an explicit "dev" type or an implicit "dev"
AG> type that calls itself "blob" and uses a magic mode flag to
AG> tell checkout that it needs special treatment doesn't make a
AG> difference.

True.  The use of word "wrong" in my message was _wrong_.  But
my gut feeling is that the code that has to deal with the dev
and symlink stuff would be simpler if we just stick to the blob
type.

AG> When was the last time you tried to version control /dev? ;)

Tried?  Never.  Wished?  Number of times.  It's just that there
is no such SCM that does this natively, so I keep "ls -l /dev"
output under CVS control as a rough approximation.


* Re: Careful object writing..
From: H. Peter Anvin @ 2005-05-03 22:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Harkes, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505031531270.26698@ppc970.osdl.org>

Linus Torvalds wrote:
> 
> On Tue, 3 May 2005, Jan Harkes wrote:
> 
>>I tried to pull in the latest version of your tree, but it doesn't look
>>like this commit has propagated to rsync.kernel.org yet.
> 
> 
> Hmm.. It's still not there a few hours later. I wonder what the mirroring
> rules are. Or maybe mirroring is just broken right now. Peter?
> 
> One change introduced by me is that the new objects changed from 0664
> (-rw-rw-r--) to 0444 (-r--r--r--) due to the object writing rules. Maybe
> the mirroring decides that such objects shouldn't be mirrored, since they
> are "private"?
> 
> Or maybe it's just that Peter has shut down mirroring in preparation for 
> the imminent memory upgrade on master.kernel.org. 
> 

No, I had stopped the cron job to fix a script bug and forgot to turn it 
back on.  It's pushing now.

	-hpa



* Re: Careful object writing..
From: Linus Torvalds @ 2005-05-03 22:37 UTC (permalink / raw)
  To: Jan Harkes, Peter Anvin; +Cc: Git Mailing List
In-Reply-To: <20050503200034.GA16104@delft.aura.cs.cmu.edu>

On Tue, 3 May 2005, Jan Harkes wrote:
> 
> I tried to pull in the latest version of your tree, but it doesn't look
> like this commit has propagated to rsync.kernel.org yet.

Hmm.. It's still not there a few hours later. I wonder what the mirroring
rules are. Or maybe mirroring is just broken right now. Peter?

One change introduced by me is that the new objects changed from 0664
(-rw-rw-r--) to 0444 (-r--r--r--) due to the object writing rules. Maybe
the mirroring decides that such objects shouldn't be mirrored, since they
are "private"?

Or maybe it's just that Peter has shut down mirroring in preparation for 
the imminent memory upgrade on master.kernel.org. 

			Linus


* Kernel autobuild now uses Git
From: Darren Williams @ 2005-05-03 22:29 UTC (permalink / raw)
  To: LKML, Ia64 Linux; +Cc: Git Mailing List

Hi All
  Our ia64 autobuild system has been moved from using
BK to Git. Here we do a nightly pull on Linus's 
(not so mainline) Git tree and test the ia64 build.

This may be a benefit to the Git developers to see
the results of a nightly 'cg-pull'.

Thanks to all for the effort, the conversion from BK
to Git was relatively painless.

 - dsw

--------------------------------------------------
Darren Williams <dsw AT gelato.unsw.edu.au>
Gelato@UNSW <www.gelato.unsw.edu.au>
--------------------------------------------------


* Re: Careful object writing..
From: Linus Torvalds @ 2005-05-03 22:13 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Git Mailing List
In-Reply-To: <20050503205957.GA25253@delft.aura.cs.cmu.edu>

On Tue, 3 May 2005, Jan Harkes wrote:
> 
> Short summary:
> 
>     rc = link(old, new);
>     if (rc == -1 && errno == EXDEV)
> 	rc = rename(old, new);

Ok, that is safe enough. Will do.

> On Coda, the cross-directory link fails, the following cross-directory
> rename will work fine.  On a normal filesystem, if the link fails with
> EXDEV, the rename will fail with the same.

Yup. I do suspect that since you handle the rename anyway, you probably 
could handle the git link/unlink patterns too, but it's easy enough to 
just do the rename fallback in git itself.

The only reason not to use rename in the first place is literally just to 
be able to check for collisions. Which we don't actually _do_ right now, 
but I like to be able to do so in theory.

		Linus
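
The fallback agreed on above can be wrapped in a small helper. This is
a minimal sketch based on the three-line summary quoted in this
message, not the code that went into git; the helper name
move_into_place and the cleanup of the temporary name are assumptions.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: give a finished temp file its final name.
 * link() fails with EEXIST when the destination already exists, so a
 * collision stays detectable; rename() is used only when the
 * filesystem rejects the link with EXDEV.
 * Returns 0 on success, -1 with errno set on failure.
 */
int move_into_place(const char *tmpfile, const char *final_name)
{
	int rc = link(tmpfile, final_name);

	if (rc == -1 && errno == EXDEV)
		return rename(tmpfile, final_name);
	if (rc == 0)
		unlink(tmpfile);	/* drop the temporary name */
	return rc;
}
```

A caller that wants collision checking would look for rc == -1 with
errno == EEXIST, which the plain rename() path cannot report.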


* [PATCH] Optimize diff-cache -p --cached
From: Junio C Hamano @ 2005-05-03 22:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch optimizes "diff-cache -p --cached" by not inflating
blobs into temporary files when either the blob
recorded in the cache or in the compared tree matches the
corresponding file in the work tree.

Here is an informal benchmark on my Duron 750.  The tests were
done by running unpatched and patched alternately a number of
times, and these are the last three pairs:

 real 0m0.738s user 0m0.630s sys 0m0.100s unpatched
 real 0m0.695s user 0m0.590s sys 0m0.100s patched
 real 0m0.733s user 0m0.560s sys 0m0.170s unpatched
 real 0m0.705s user 0m0.590s sys 0m0.110s patched
 real 0m0.732s user 0m0.550s sys 0m0.180s unpatched
 real 0m0.692s user 0m0.590s sys 0m0.100s patched

The benchmark was run in a fully checked out linux-2.6 GIT
repository.  The work tree matched one commit, and comparison
was done against another commit which was 20-or-so commits
before the work tree.

$ new=a6ad57fb4b5e9d68553f4440377b99f75588fa88
$ old=cd63499cbe37e53e6cc084c8a35d911a4613c797
$ git-read-tree $new
$ git-checkout-cache -f -a
$ git-update-cache --refresh
$ git-rev-tree $new ^$old | wc -l
19
$ export GIT_EXTERNAL_DIFF=true
$ time git-diff-cache -p --cached $old

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff-tree-helper.c |    6 ++--
diff.c             |   67 ++++++++++++++++++++++++++++++++++++++++++-----------
diff.h             |   13 +++-------
3 files changed, 62 insertions(+), 24 deletions(-)

# - 2: Use GIT_EXTERNAL_DIFF exit status to terminate diff early.
# + 5: diff-cache --cached case optimization.
--- a/diff-tree-helper.c
+++ b/diff-tree-helper.c
@@ -35,7 +35,7 @@ static int parse_oneside_change(const ch
 	if (strncmp(cp, "\tblob\t", 6))
 		return -1;
 	cp += 6;
-	if (get_sha1_hex(cp, one->u.sha1))
+	if (get_sha1_hex(cp, one->blob_sha1))
 		return -1;
 	cp += 40;
 	if (*cp++ != '\t')
@@ -83,13 +83,13 @@ static int parse_diff_tree_output(const 
 		if (strncmp(cp, "\tblob\t", 6))
 			return -1;
 		cp += 6;
-		if (get_sha1_hex(cp, old.u.sha1))
+		if (get_sha1_hex(cp, old.blob_sha1))
 			return -1;
 		cp += 40;
 		if (strncmp(cp, "->", 2))
 			return -1;
 		cp += 2;
-		if (get_sha1_hex(cp, new.u.sha1))
+		if (get_sha1_hex(cp, new.blob_sha1))
 			return -1;
 		cp += 40;
 		if (*cp++ != '\t')
--- a/diff.c
+++ b/diff.c
@@ -132,11 +132,50 @@ static void builtin_diff(const char *nam
 	execlp("/bin/sh","sh", "-c", cmd, NULL);
 }
 
+/*
+ * Given a name and sha1 pair, if the dircache tells us the file in
+ * the work tree has that object contents, return true, so that
+ * prepare_temp_file() does not have to inflate and extract.
+ */
+static int work_tree_matches(const char *name, const unsigned char *sha1)
+{
+	struct cache_entry *ce;
+	struct stat st;
+	int pos, len;
+	
+	/* We do not read the cache ourselves here, because the
+	 * benchmark with my previous version that always reads cache
+	 * shows that it makes things worse for diff-tree comparing
+	 * two linux-2.6 kernel trees in an already checked out work
+	 * tree.  This is because most diff-tree comparison deals with
+	 * only a small number of files, while reading the cache is
+	 * expensive for a large project, and its cost outweighs the
+	 * savings we get by not inflating the object to a temporary
+	 * file.  Practically, this code only helps when we are used
+	 * by diff-cache --cached, which does read the cache before
+	 * calling us.
+	 */ 
+	if (!active_cache)
+		return 0;
+
+	len = strlen(name);
+	pos = cache_name_pos(name, len);
+	if (pos < 0)
+		return 0;
+	ce = active_cache[pos];
+	if ((stat(name, &st) < 0) ||
+	    cache_match_stat(ce, &st) ||
+	    memcmp(sha1, ce->sha1, 20))
+		return 0;
+	return 1;
+}
+
 static void prepare_temp_file(const char *name,
 			      struct diff_tempfile *temp,
 			      struct diff_spec *one)
 {
 	static unsigned char null_sha1[20] = { 0, };
+	int use_work_tree = 0;
 
 	if (!one->file_valid) {
 	not_a_valid_file:
@@ -150,20 +189,22 @@ static void prepare_temp_file(const char
 	}
 
 	if (one->sha1_valid &&
-	    !memcmp(one->u.sha1, null_sha1, sizeof(null_sha1))) {
-		one->sha1_valid = 0;
-		one->u.name = name;
-	}
+	    (!memcmp(one->blob_sha1, null_sha1, sizeof(null_sha1)) ||
+	     work_tree_matches(name, one->blob_sha1)))
+		use_work_tree = 1;
 
-	if (!one->sha1_valid) {
+	if (!one->sha1_valid || use_work_tree) {
 		struct stat st;
-		temp->name = one->u.name;
+		temp->name = name;
 		if (stat(temp->name, &st) < 0) {
 			if (errno == ENOENT)
 				goto not_a_valid_file;
 			die("stat(%s): %s", temp->name, strerror(errno));
 		}
-		strcpy(temp->hex, sha1_to_hex(null_sha1));
+		if (!one->sha1_valid)
+			strcpy(temp->hex, sha1_to_hex(null_sha1));
+		else
+			strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
 		sprintf(temp->mode, "%06o",
 			S_IFREG |ce_permissions(st.st_mode));
 	}
@@ -173,10 +214,10 @@ static void prepare_temp_file(const char
 		char type[20];
 		unsigned long size;
 
-		blob = read_sha1_file(one->u.sha1, type, &size);
+		blob = read_sha1_file(one->blob_sha1, type, &size);
 		if (!blob || strcmp(type, "blob"))
 			die("unable to read blob object for %s (%s)",
-			    name, sha1_to_hex(one->u.sha1));
+			    name, sha1_to_hex(one->blob_sha1));
 
 		strcpy(temp->tmp_path, ".diff_XXXXXX");
 		fd = mkstemp(temp->tmp_path);
@@ -187,7 +228,7 @@ static void prepare_temp_file(const char
 		close(fd);
 		free(blob);
 		temp->name = temp->tmp_path;
-		strcpy(temp->hex, sha1_to_hex(one->u.sha1));
+		strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
 		temp->hex[40] = 0;
 		sprintf(temp->mode, "%06o", one->mode);
 	}
@@ -285,7 +326,7 @@ void diff_addremove(int addremove, unsig
 	char concatpath[PATH_MAX];
 	struct diff_spec spec[2], *one, *two;
 
-	memcpy(spec[0].u.sha1, sha1, 20);
+	memcpy(spec[0].blob_sha1, sha1, 20);
 	spec[0].mode = mode;
 	spec[0].sha1_valid = spec[0].file_valid = 1;
 	spec[1].file_valid = 0;
@@ -310,9 +351,9 @@ void diff_change(unsigned old_mode, unsi
 	char concatpath[PATH_MAX];
 	struct diff_spec spec[2];
 
-	memcpy(spec[0].u.sha1, old_sha1, 20);
+	memcpy(spec[0].blob_sha1, old_sha1, 20);
 	spec[0].mode = old_mode;
-	memcpy(spec[1].u.sha1, new_sha1, 20);
+	memcpy(spec[1].blob_sha1, new_sha1, 20);
 	spec[1].mode = new_mode;
 	spec[0].sha1_valid = spec[0].file_valid = 1;
 	spec[1].sha1_valid = spec[1].file_valid = 1;
--- a/diff.h
+++ b/diff.h
@@ -20,15 +20,12 @@ extern void diff_unmerge(const char *pat
 /* These are for diff-tree-helper */
 
 struct diff_spec {
-	union {
-		const char *name;       /* path on the filesystem */
-		unsigned char sha1[20]; /* blob object ID */
-	} u;
+	unsigned char blob_sha1[20];
 	unsigned short mode;	 /* file mode */
-	unsigned sha1_valid : 1; /* if true, use u.sha1 and trust mode.
-				  * (however with a NULL SHA1, read them
-				  * from the file!).
-				  * if false, use u.name and read mode from
+	unsigned sha1_valid : 1; /* if true, use blob_sha1 and trust mode;
+				  * however with a NULL SHA1, read them
+				  * from the file system.
+				  * if false, use the name and read mode from
 				  * the filesystem.
 				  */
 	unsigned file_valid : 1; /* if false the file does not even exist */
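
For readers skimming the patch, the decision made by work_tree_matches()
can be sketched in isolation.  This is only an illustration under stated
assumptions: struct toy_entry and toy_work_tree_matches() are hypothetical
names, and the size/mtime comparison is a simplified stand-in for what
cache_match_stat() checks in the real code.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical, simplified stand-in for a cache entry; not git's struct. */
struct toy_entry {
	off_t size;
	time_t mtime;
	unsigned char sha1[20];
};

/*
 * Return 1 when the work-tree file described by st can stand in for the
 * blob named by sha1: the cached stat data still matches the file, and
 * the cached SHA1 is the one being asked for.  Otherwise return 0, and
 * the caller falls back to inflating the blob into a temporary file.
 */
static int toy_work_tree_matches(const struct toy_entry *ce,
				 const struct stat *st,
				 const unsigned char *sha1)
{
	if (!ce)
		return 0;	/* path is not in the cache */
	if (ce->size != st->st_size || ce->mtime != st->st_mtime)
		return 0;	/* file changed since it was cached */
	if (memcmp(sha1, ce->sha1, 20))
		return 0;	/* cache records a different blob */
	return 1;
}
```

All three conditions have to hold before the work-tree file is used, which
mirrors the chain of early "return 0" exits in the patch.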



* Re: git and symlinks as tracked content
From: Andreas Gal @ 2005-05-03 21:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Kay Sievers, git
In-Reply-To: <7vr7got2tz.fsf@assigned-by-dhcp.cox.net>


Whether you use an explicit "dev" type or an implicit "dev" type that 
calls itself "blob" and uses a magic mode flag to tell checkout that it 
needs special treatment doesn't make a difference (whatever you 
prefer, really). I was only trying to make the point that hashes should remain 
hashes and not become a placeholder for minors/majors. However, as 
somebody already suggested, the entire issue is probably moot. When was the last 
time you tried to version control /dev? ;)

Andreas

On Tue, 3 May 2005, Junio C Hamano wrote:

> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> On Tue, 3 May 2005, Andreas Gal wrote:
> 
> >> Yuck. That's really ugly. Right now all files have a uniform
> >> touch to them.  For every hash you can locate the file,
> >> determine its type/tag, unpack it, and check the SHA1
> >> hash. The proposal above breaks all that. Why not just
> >> introduce a new object type "dev" and put major minor in
> >> there. It will still always hash to the same SHA1 hash value,
> >> but fits much better in the overall design.
> 
> LT> Hey, I don't personally care that much. I don't see anybody using 
> LT> character device nodes in the kernel tree, and I don't think most SCMs 
> LT> support stuff like that anyway ;)
> 
> LT> If you want to make it a blob (and have a use for it), go wild. 
> 
> Introducing "dev" type, as Andreas suggests, is wrong.  This
> should be done in the same way as you suggested for the
> symlink case.  Store a blob object with those chrdev or blkdev
> modes whose contents are of form:
> 
>     major=14
>     minor=4
>     owner=root
>     group=audio
>     perm=0660
> 
> This would impact the diff side least, and for the cache side it
> does not matter in storing and merging.  checkout-cache still
> needs to know about this.
> 
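
To make the blob layout quoted above concrete, here is a minimal sketch of
how a checkout-side reader might parse it.  Nothing like this exists in
the tree; struct dev_spec and parse_dev_blob() are hypothetical names, and
rejecting unknown keys is an assumption rather than part of the proposal.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; field names mirror the keys in the proposed blob. */
struct dev_spec {
	unsigned major, minor;
	unsigned perm;		/* parsed as octal */
	char owner[32];
	char group[32];
};

/*
 * Parse "key=value" lines from a NUL-terminated buffer.
 * Returns 0 on success, -1 on a malformed line or an unknown key.
 */
static int parse_dev_blob(const char *buf, struct dev_spec *out)
{
	memset(out, 0, sizeof(*out));
	while (*buf) {
		const char *eol = strchr(buf, '\n');
		size_t len = eol ? (size_t)(eol - buf) : strlen(buf);
		char line[128];
		char *eq;

		if (len >= sizeof(line))
			return -1;
		memcpy(line, buf, len);
		line[len] = 0;
		buf += len + (eol ? 1 : 0);

		if (!line[0])
			continue;	/* tolerate blank lines */
		eq = strchr(line, '=');
		if (!eq)
			return -1;
		*eq++ = 0;

		if (!strcmp(line, "major"))
			out->major = strtoul(eq, NULL, 10);
		else if (!strcmp(line, "minor"))
			out->minor = strtoul(eq, NULL, 10);
		else if (!strcmp(line, "perm"))
			out->perm = strtoul(eq, NULL, 8);
		else if (!strcmp(line, "owner"))
			snprintf(out->owner, sizeof(out->owner), "%s", eq);
		else if (!strcmp(line, "group"))
			snprintf(out->group, sizeof(out->group), "%s", eq);
		else
			return -1;
	}
	return 0;
}
```

Because the contents stay an ordinary blob, hashing, storage, and diffing
need no changes; only the code that materializes the entry would call a
parser like this.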

