Git development
 help / color / mirror / Atom feed
* Re: Howto update a 'dirty' entry in the cache from the object database
From: Linus Torvalds @ 2005-05-04 17:23 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: GIT
In-Reply-To: <20050504142351.GL18380@cip.informatik.uni-erlangen.de>



On Wed, 4 May 2005, Thomas Glanzmann wrote:
>
> I edited a bunch of files and notice that I want to revert back one of
> the files to the original one (the one that comes with HEAD). But I
> already updated the cache. Is there a way to 'patch' the cache with the
> original file.

Absolutely.

You can do it several ways. The easiest one is

	git-read-tree -m HEAD
	git-checkout-cache -f filename

but this means that you revert your entire index file to the old HEAD 
information, so you'll need to do git-update-cache on the files that you 
changed.

So:

> Or do I need todo a read-tree into a temporary index
> file. Check out the one file from this alternate index file and use
> update-cache to update my original index file?

Yes. If you want to touch just that one file, and you actually _want_ to
save all the other index information updates except for this one file. If
so, you can do

	# read a new index file with the HEAD information
	GIT_INDEX_FILE=tmp-index git-read-tree HEAD

	# check out just the one file you want to have
	GIT_INDEX_FILE=tmp-index git-checkout-cache -f filename

	# remove the now useless temporary index
	rm tmp-index

	# update your _real_ index file with the file information
	git-update-cache filename

and you're done - you've now updated only that _one_ file in your working 
area, and you've re-set the index file for that entry.

And yes, scripting it so that you don't make stupid mistakes is probably a
good idea. For exmple, if you misspell GIT_INDEX_FILE (like I did when I
write that at first), you'd have reverted your real index file after all.

In other words, the git core certainly is very flexible and can easily do 
what you want, but it's also certainly very very easy to screw up with.

		Linus

^ permalink raw reply

* Re: Howto update a 'dirty' entry in the cache from the object database
From: Junio C Hamano @ 2005-05-04 17:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, GIT
In-Reply-To: <Pine.LNX.4.58.0505041014510.2328@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Wed, 4 May 2005, Thomas Glanzmann wrote:
>> 
>> I edited a bunch of files and notice that I want to revert back one of
>> the files to the original one (the one that comes with HEAD). But I
>> already updated the cache. Is there a way to 'patch' the cache with the
>> original file.

LT> Absolutely.

LT> You can do it several ways. The easiest one is

LT> 	git-read-tree -m HEAD
LT> 	git-checkout-cache -f filename

LT> but this means that you revert your entire index file to the old HEAD 
LT> information, so you'll need to do git-update-cache on the files that you 
LT> changed.

I think the easiest one that does not lose the cache for other
files is:

    git-diff-cache -z HEAD |
    GIT_EXTERNAL_DIFF=git-apply-patch-script \
        git-diff-tree-helper -z -R filename

This means: find diff from HEAD to the cache, and apply the diff
in the reverse, only to the filename.


^ permalink raw reply

* [PATCH] Fix memory leaks in git-fsck-cache
From: Sergey Vlasov @ 2005-05-04 17:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch fixes memory leaks in parse_object() and related functions;
these leaks were very noticeable when running git-fsck-cache.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>

---

 blob.c   |    1 +
 commit.c |    4 +++-
 object.c |    3 ++-
 tag.c    |   21 ++++++++++++++-------
 tree.c   |    9 +++++++--
 5 files changed, 27 insertions(+), 11 deletions(-)

Index: blob.c
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/blob.c  (mode:100644 sha1:d4af4a309433744d2fe819886d66741ab016f62b)
+++ uncommitted/blob.c  (mode:100644)
@@ -34,6 +34,7 @@ int parse_blob(struct blob *item)
         if (!buffer)
                 return error("Could not read %s",
                              sha1_to_hex(item->object.sha1));
+	free(buffer);
         if (strcmp(type, blob_type))
                 return error("Object %s not a blob",
                              sha1_to_hex(item->object.sha1));
Index: commit.c
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/commit.c  (mode:100644 sha1:3956c7ba961781f72b39c42368df1e76b2d035dd)
+++ uncommitted/commit.c  (mode:100644)
@@ -54,9 +54,11 @@ int parse_commit(struct commit *item)
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));
-	if (strcmp(type, commit_type))
+	if (strcmp(type, commit_type)) {
+		free(buffer);
 		return error("Object %s not a commit",
 			     sha1_to_hex(item->object.sha1));
+	}
 	get_sha1_hex(bufptr + 5, parent);
 	item->tree = lookup_tree(parent);
 	if (item->tree)
Index: object.c
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/object.c  (mode:100644 sha1:ca4af8fa2dc0672b92310a3ebdd4d14bf070dd69)
+++ uncommitted/object.c  (mode:100644)
@@ -107,11 +107,12 @@ struct object *parse_object(unsigned cha
 		char type[100];
 		unsigned long size;
 		void *buffer = unpack_sha1_file(map, mapsize, type, &size);
+		munmap(map, mapsize);
 		if (!buffer)
 			return NULL;
 		if (check_sha1_signature(sha1, buffer, size, type) < 0)
 			printf("sha1 mismatch %s\n", sha1_to_hex(sha1));
-		munmap(map, mapsize);
+		free(buffer);
 		if (!strcmp(type, "blob")) {
 			struct blob *ret = lookup_blob(sha1);
 			parse_blob(ret);
Index: tag.c
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/tag.c  (mode:100644 sha1:406dba2aa5607332fe022fcba1beb045fa61c5f4)
+++ uncommitted/tag.c  (mode:100644)
@@ -37,37 +37,44 @@ int parse_tag(struct tag *item)
         if (!data)
                 return error("Could not read %s",
                              sha1_to_hex(item->object.sha1));
-        if (strcmp(type, tag_type))
+        if (strcmp(type, tag_type)) {
+		free(data);
                 return error("Object %s not a tag",
                              sha1_to_hex(item->object.sha1));
+	}
 
 	if (size < 64)
-		return -1;
+		goto err;
 	if (memcmp("object ", data, 7) || get_sha1_hex(data + 7, object))
-		return -1;
+		goto err;
 
 	item->tagged = parse_object(object);
 
 	type_line = data + 48;
 	if (memcmp("\ntype ", type_line-1, 6))
-		return -1;
+		goto err;
 
 	tag_line = strchr(type_line, '\n');
 	if (!tag_line || memcmp("tag ", ++tag_line, 4))
-		return -1;
+		goto err;
 
 	sig_line = strchr(tag_line, '\n');
 	if (!sig_line)
-		return -1;
+		goto err;
 	sig_line++;
 
 	typelen = tag_line - type_line - strlen("type \n");
 	if (typelen >= 20)
-		return -1;
+		goto err;
 	taglen = sig_line - tag_line - strlen("tag \n");
 	item->tag = xmalloc(taglen + 1);
 	memcpy(item->tag, tag_line + 4, taglen);
 	item->tag[taglen] = '\0';
 
+	free(data);
 	return 0;
+
+err:
+	free(data);
+	return -1;
 }
Index: tree.c
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/tree.c  (mode:100644 sha1:72305a357b6694cdb1b29dad9975902fdef86457)
+++ uncommitted/tree.c  (mode:100644)
@@ -101,9 +101,11 @@ int parse_tree(struct tree *item)
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));
-	if (strcmp(type, tree_type))
+	if (strcmp(type, tree_type)) {
+		free(buffer);
 		return error("Object %s not a tree",
 			     sha1_to_hex(item->object.sha1));
+	}
 	list_p = &item->entries;
 	while (size) {
 		struct object *obj;
@@ -113,8 +115,10 @@ int parse_tree(struct tree *item)
 		char *path = strchr(bufptr, ' ');
 		unsigned int mode;
 		if (size < len + 20 || !path || 
-		    sscanf(bufptr, "%o", &mode) != 1)
+		    sscanf(bufptr, "%o", &mode) != 1) {
+			free(buffer);
 			return -1;
+		}
 
 		entry = xmalloc(sizeof(struct tree_entry_list));
 		entry->name = strdup(path + 1);
@@ -138,5 +142,6 @@ int parse_tree(struct tree *item)
 		*list_p = entry;
 		list_p = &entry->next;
 	}
+	free(buffer);
 	return 0;
 }

^ permalink raw reply

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Chris Mason @ 2005-05-04 17:44 UTC (permalink / raw)
  To: C. Scott Ananian; +Cc: Nicolas Pitre, Linus Torvalds, Alon Ziv, git
In-Reply-To: <Pine.LNX.4.61.0505041202270.22203@cag.csail.mit.edu>

On Wednesday 04 May 2005 12:12, C. Scott Ananian wrote:
> On Wed, 4 May 2005, Chris Mason wrote:
> > 3) create a git-pack tool that can pack/unpack existing changesets,trees
> > and files, optionally adding/removing deltas.
>
> A 'git-pull' tool might be more use.  I can imagine Linus maintaining his
> local tree uncompressed, but the 'kernel.org' tree set up to
> git-pull-delta from him every hour or whatever, so that the
> network-accessible version is always network-efficient.  'git-pack'
> would then simplify to a git-pull-delta from an existing local repository.

I had pictured the opposite, after the regular pull you would run git-pack to 
redelta/pack things to your local settings.  There's a ton of things to try 
in git pull for delta transmission, and I'd rather keep it separate from the 
local repository packing for now.

Later on if both tools are working properly we could look at combining them.

>
> Ideally, you'd also be able to git-pull from a network packed repository
> and (transparently) unpack and undelta-fy the pulled files as they're
> added to your local repo.  This would keep Linus from accidentally getting
> packed files in his tree when he pulled from a maintainer's
> packed/delta-ed network-accessible tree.
>
Yes, if Linus does take the patches, it's really important for people to be 
able to easily continue without deltas/packing if they want.

> I'd also be interested in seeing the speed/space numbers for some other
> delta chain lengths between 1 and 16.  Maybe some intermediate point is
> optimal.  [Also, limiting delta chains to a certain number of other
> *packed* objects -- instead of just 'objects' -- might be an improvement.
> Right now you're packing entire commits together, right?  Maybe defining a
> delta chain as 'N other commits max' might improve i/o performance, as
> you'd just have to keep N other unpacked files around, instead of an
> arbitrary number.]

It's easy to do additional runs, but I would rather do this kind of fine 
tuning with the git-pack tool.  That way we could experiment with packing 
multiple commits into a single file, deltas on tree objects, or even 
combining commits based on which files tye modify.

-chris

^ permalink raw reply

* [PATCH] cogito: replace "git-* $(tree-id)" with "git-* HEAD"
From: Rene Scharfe @ 2005-05-04 17:48 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

tree-id, when called without parameters, prints the tree ID in the
commit object whose ID is in .git/HEAD.  Both git-diff-cache and
git-read-tree (in general all GIT tools that use get_sha1()) can get
this ID from .git/HEAD themselves when they are called with "HEAD" as
ID parameter.  So this patch replaces occurences of "git-* $(tree-id)"
with "git-* HEAD".  This speeds things up a bit and also makes the
intent of the git-* calls a bit more clear.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>

Index: cg-cancel
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-cancel  (mode:100755 sha1:4d1e6d9d112f050b402a454a4c04de4d4e379524)
+++ 0ff2b865104800b6a0dbbe17093bfca6e8b52898/cg-cancel  (mode:100755 sha1:dab527132a33922901bc184a8e7dad57b757b991)
@@ -26,7 +26,7 @@
 fi
 
 rm -f .git/blocked .git/merging .git/merging-sym .git/merge-base
-git-read-tree -m $(tree-id) || git-read-tree $(tree-id)
+git-read-tree -m HEAD || git-read-tree HEAD
 
 git-checkout-cache -f -a
 git-update-cache --refresh
Index: cg-commit
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-commit  (mode:100755 sha1:e80567a50e297d2a060773b5f10e193243ac340e)
+++ 0ff2b865104800b6a0dbbe17093bfca6e8b52898/cg-commit  (mode:100755 sha1:8f2514eac77fb9a23d6eba4e2f748914162da3d8)
@@ -37,7 +37,7 @@
 	# be committed along automagically as well.
 
 	if [ ! "$ignorecache" ]; then
-		changedfiles=$(git-diff-cache $(tree-id) | cut -f 4-)
+		changedfiles=$(git-diff-cache HEAD | cut -f 4-)
 		commitfiles="$addedfiles $remfiles $changedfiles"
 	fi
 
@@ -79,7 +79,7 @@
 		echo $commitfiles | xargs git-update-cache --add --remove \
 			|| die "update-cache failed"
 		export GIT_INDEX_FILE=$(mktemp -t gitci.XXXXXX)
-		git-read-tree $(tree-id)
+		git-read-tree HEAD
 	fi
 	# TODO: Do the proper separation of adds, removes, and changes.
 	echo $commitfiles | xargs git-update-cache --add --remove \
Index: cg-init
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-init  (mode:100755 sha1:229b64901d7d8e67d900b9c8e558f18ceb3b8a5f)
+++ 0ff2b865104800b6a0dbbe17093bfca6e8b52898/cg-init  (mode:100755 sha1:90a9d5497a0a70830094d2ee552d974819858ec4)
@@ -22,7 +22,7 @@
 	cg-pull origin || die "pull failed"
 
 	cp .git/refs/heads/origin .git/refs/heads/master
-	git-read-tree $(tree-id)
+	git-read-tree HEAD
 	git-checkout-cache -a
 	git-update-cache --refresh
 
Index: cg-merge
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-merge  (mode:100755 sha1:8c3f68b26346ad02474f595b651be77fc8958aa5)
+++ 0ff2b865104800b6a0dbbe17093bfca6e8b52898/cg-merge  (mode:100755 sha1:13b4dcf70dfa7f878c9599276c0d77908190b834)
@@ -79,7 +79,7 @@
 
 
 [ "$(git-diff-files -s)" ] && git-update-cache --refresh
-if [ "$(git-diff-files -s)" ] || [ "$(git-diff-cache $(tree-id))" ]; then
+if [ "$(git-diff-files -s)" ] || [ "$(git-diff-cache HEAD)" ]; then
 	die "merge blocked: local changes"
 fi
 
@@ -139,5 +139,5 @@
 readtree=
 cg-commit -C || { readtree=1 ; echo "cg-merge: COMMIT FAILED, retry manually" >&2; }
 
-[ "$readtree" ] && git-read-tree -m $(tree-id)
+[ "$readtree" ] && git-read-tree -m HEAD
 git-update-cache --refresh
Index: cg-seek
===================================================================
--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-seek  (mode:100755 sha1:111f7842e5ec20a1e0714e162ca221da5e06ce33)
+++ 0ff2b865104800b6a0dbbe17093bfca6e8b52898/cg-seek  (mode:100755 sha1:648c2b0db6e935d4768ca4820c91e7863e8487f4)
@@ -45,7 +45,7 @@
 fi
 
 if [ "$curcommit" != "$dstcommit" ]; then
-	git-read-tree -m $(tree-id)
+	git-read-tree -m HEAD
 	cg-diff -r $curcommit:$dstcommit | cg-patch
 	git-update-cache --refresh
 fi

^ permalink raw reply

* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Bill Davidsen @ 2005-05-04 17:51 UTC (permalink / raw)
  To: Rene Scharfe
  Cc: Matt Mackall,
	Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>,
	Linus Torvalds, Ryan Anderson, Andrea Arcangeli, linux-kernel,
	git
In-Reply-To: <4277B15F.1020102@lsrfire.ath.cx>

Rene Scharfe wrote:
> Bill Davidsen schrieb:
> 
>>On the theory that my first post got lost, why use /usr/bin/env at 
>>all, when bash already does that substitution? To support people who 
>>use other shells?
>>
>>ie.: FOO=xx perl -e '$a=$ENV{FOO}; print "$a\n"'
> 
> 
> /usr/bin/env is used in scripts in the shebang line (the very first line
> of the script, starting with "#!", which denotes the interpreter to use
> for that script) to make a PATH search for the real interpreter.
> Some folks keep their python (or Perl, or Bash etc.) in /usr/local/bin
> or in $HOME, that's why this construct is needed at all.
> 
> Changing environment variables is not the goal, insofar this usage
> exploits only a side-effect of env.  It is portable in practice because
> env is in /usr/bin on most modern systems.
> 
> So you could replace this first line of a bash script:
> 
>    #!/usr/bin/env python
> 
> with this:
> 
>    #!python
> 
> except that the latter doesn't work because you need to specify an
> absolute path there. :]

Assuming that you want the PATH search rather than a symlink in 
/usr/bin, of course. This opens the door to forgetting you just loaded 
the CVS daily of python into your test directory and doing an unplanned 
test of alpha software, but if people think the application should work 
with non-standard tool chains, and realize it has possible unwanted 
effects, that's a design decision.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply

* Mercurial v0.4d
From: Matt Mackall @ 2005-05-04 18:18 UTC (permalink / raw)
  To: linux-kernel, git, Linus Torvalds
In-Reply-To: <20050504025852.GK22038@waste.org>

A new version of Mercurial is available at:
 
  http://selenic.com/mercurial/

This fixes a handful of bugs reported last night, most notably failing
to pull from the http repo. This turned out to be a failure to quote
'%' characters. Thanks to everyone for their feedback.
 
Once you've got the new version installed, to pull the repo:

  hg init
  hg merge http://selenic.com/hg
  hg checkout    # 'hg co' works too

The web protocol is painfully slow, mostly because it makes an http
round trip per file revision to pull. I'm about to start working on a
replacement that minimizes round trips.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply

* Re: Howto update a 'dirty' entry in the cache from the object database
From: Thomas Glanzmann @ 2005-05-04 19:20 UTC (permalink / raw)
  To: GIT
In-Reply-To: <Pine.LNX.4.58.0505041014510.2328@ppc970.osdl.org>

Hello,

> 	# read a new index file with the HEAD information
> 	GIT_INDEX_FILE=tmp-index git-read-tree HEAD

> 	# check out just the one file you want to have
> 	GIT_INDEX_FILE=tmp-index git-checkout-cache -f filename

> 	# remove the now useless temporary index
> 	rm tmp-index

> 	# update your _real_ index file with the file information
> 	git-update-cache filename

thanks. That is exactly what I was looking for.

	Thomas

^ permalink raw reply

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Geert Bosch @ 2005-05-04 21:47 UTC (permalink / raw)
  To: Chris Mason; +Cc: Nicolas Pitre, Linus Torvalds, Alon Ziv, git
In-Reply-To: <200505041156.19499.mason@suse.com>

 From your tests it would seem that the zdelta version is the only one
to provide a uniform improvement over plain git. As it also seems the
simplest approach, I wonder why the consensus is that using xdiff
would be better?

   -Geert

On May 4, 2005, at 11:56, Chris Mason wrote:
> I did two additional runs, first where I fixed the delta chain  
> length at 1 as
> in the zdelta patch.   In this mode, if it tried to diff against a  
> delta it
> would find the delta's parent and diff against that instead.  Even  
> though
> zdelta had the same speeds for applying patches as xdiff(1), zdelta  
> used
> significantly more cpu (53m vs 40m).
>
> The next run was with the patch I've attached below, it allows  
> chains up to 16
> deltas in length.
>                              git         zdelta       xdiff  
> (1)      xdiff(16)
> apply                  150m       117m       117m         104m
> checkout             4m30s      3m41      4m43s        7m11s
> checkout (hot)     56s           12s         14s             16s
> space usage        2.5G         1G           1.2G           800m
>


^ permalink raw reply

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Linus Torvalds @ 2005-05-04 22:03 UTC (permalink / raw)
  To: Chris Mason; +Cc: C. Scott Ananian, Nicolas Pitre, Alon Ziv, git
In-Reply-To: <200505041344.51637.mason@suse.com>



On Wed, 4 May 2005, Chris Mason wrote:
>
> Yes, if Linus does take the patches, it's really important for people to be 
> able to easily continue without deltas/packing if they want.

I'll happily take the patch and just not use the delta packing myself (at 
least until I trust it). But before I take the patch I want to make sure 
that people agree on it, and that it's been tested well enough that it 
won't cause people to corrupt their repositories.

For example, I do _not_ want to be in the situation SVN is in, where if
you corrupt your SVN database, you're totally screwed. There's a real
advantage to not having fancy data structures or complicated consistency
rules.

			Linus

^ permalink raw reply

* Re: [git pull] jfs update
From: Dave Kleikamp @ 2005-05-04 22:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505041437060.2328@ppc970.osdl.org>

cc'ing the git mailing list, dropping linux-kernel

On Wed, 2005-05-04 at 14:37 -0700, Linus Torvalds wrote:
> 
> On Wed, 4 May 2005, Dave Kleikamp wrote:
> >
> > I think I've got this set up right.  I have created a HEAD-for-linus and
> > HEAD-for-mm in the same git repo.
> 
> Ok, my scripts don't handle that very well yet (they just want HEAD), but 
> that was easy enough to hack around. I'll be able to work with that 
> format.

After looking at the documentation and playing around, I'm wondering if
I should instead have created jfs-2.6.git/refs/heads/for-linus.  Then
the url would be:
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6.git#for-linus

I think this currently works.  Which would be the preferred format?

> 
> > Please pull from
> > 
> > rsync://rsync.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6.git/HEAD-for-linus
> 
> Pulled, and pushed out.
> 
> 		Linus
-- 
David Kleikamp
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Chris Mason @ 2005-05-04 22:34 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Nicolas Pitre, Linus Torvalds, Alon Ziv, git
In-Reply-To: <DE5D04E8-B182-45B1-AB9A-6AA178005FFD@adacore.com>

On Wednesday 04 May 2005 17:47, Geert Bosch wrote:
>  From your tests it would seem that the zdelta version is the only one
> to provide a uniform improvement over plain git. As it also seems the
> simplest approach, I wonder why the consensus is that using xdiff
> would be better?

zdelta seems to be a research project.  It does compress better than the xdiff 
lib, but the speed improvements against xdiff(1) are probably because the 
resulting tree is smaller.  I favor the xdiff code because it's so much 
smaller, and seems easier for us to maintain.

For performance, there's still quite a bit of tuning that can be done in terms 
of when and how we delta.

-chris

^ permalink raw reply

* Re: git and symlinks as tracked content
From: Kay Sievers @ 2005-05-04 22:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505031151240.26698@ppc970.osdl.org>

On Tue, May 03, 2005 at 12:02:33PM -0700, Linus Torvalds wrote:
> 
> 
> On Tue, 3 May 2005, Kay Sievers wrote:
> >
> > Is there a sane model to make git aware of tracking symlinks in the
> > repository? In the bk udev tree we've had a test sysfs-tree with a lot
> > of symlinks in it.
> > 
> > Where can we store the link-target? In its own blob-object or directly
> > in the tree-object?
> 
> I'd suggest you create a blob object with the symlink name, and then in
> the tree you point to that blob, but with the S_IFLNK value in the mode
> field (0120000).
> 
> So you have
> 
>  - directories: S_IFDIR (0040000) point to "tree" objects for contents
>  - symlinks: S_IFLNK (0120000) point to "blob" objects
>  - executables: S_IFREG | 0755 (0100755) point to "blob" objects
>  - regular files: S_IFREG | 0644 (0100644) point to "blob" objects
> 
> which seems very sane and regular. 

Here is a first try, that is able to track symlinks. Please have a look
if something like this is acceptable, then I will finish it.

Thanks,
Kay

--- a/check-files.c
+++ b/check-files.c
@@ -28,8 +28,8 @@ static void check_file(const char *path)
 		die("preparing to update existing file '%s' not in cache", path);
 	ce = active_cache[pos];
 
-	if (fstat(fd, &st) < 0)
-		die("fstat(%s): %s", path, strerror(errno));
+	if (lstat(path, &st) < 0)
+		die("lstat(%s): %s", path, strerror(errno));
 
 	changed = cache_match_stat(ce, &st);
 	if (changed)
--- a/checkout-cache.c
+++ b/checkout-cache.c
@@ -72,23 +72,37 @@ static int write_entry(struct cache_entr
 	unsigned long size;
 	long wrote;
 	char type[20];
+	unsigned int mode;
 
 	new = read_sha1_file(ce->sha1, type, &size);
 	if (!new || strcmp(type, "blob")) {
 		return error("checkout-cache: unable to read sha1 file of %s (%s)",
 			path, sha1_to_hex(ce->sha1));
 	}
-	fd = create_file(path, ntohl(ce->ce_mode));
-	if (fd < 0) {
+	mode = ntohl(ce->ce_mode);
+	if (S_ISLNK(mode)) {
+		char target[1024];
+		memcpy(target, new, size);
+		target[size] = '\0';
+		if (symlink(target, path)) {
+			free(new);
+			return error("checkout-cache: unable to create link %s (%s)",
+				path, strerror(errno));
+		}
+		free(new);
+	} else {
+		fd = create_file(path, mode);
+		if (fd < 0) {
+			free(new);
+			return error("checkout-cache: unable to create file %s (%s)",
+				path, strerror(errno));
+		}
+		wrote = write(fd, new, size);
+		close(fd);
 		free(new);
-		return error("checkout-cache: unable to create %s (%s)",
-			path, strerror(errno));
+		if (wrote != size)
+			return error("checkout-cache: unable to write %s", path);
 	}
-	wrote = write(fd, new, size);
-	close(fd);
-	free(new);
-	if (wrote != size)
-		return error("checkout-cache: unable to write %s", path);
 	return 0;
 }
 
@@ -101,7 +115,7 @@ static int checkout_entry(struct cache_e
 	memcpy(path, base_dir, len);
 	strcpy(path + len, ce->name);
 
-	if (!stat(path, &st)) {
+	if (!lstat(path, &st)) {
 		unsigned changed = cache_match_stat(ce, &st);
 		if (!changed)
 			return 0;
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -24,7 +24,7 @@ static int get_stat_data(struct cache_en
 		static unsigned char no_sha1[20];
 		int changed;
 		struct stat st;
-		if (stat(ce->name, &st) < 0)
+		if (lstat(ce->name, &st) < 0)
 			return -1;
 		changed = cache_match_stat(ce, &st);
 		if (changed) {
--- a/ls-files.c
+++ b/ls-files.c
@@ -199,7 +199,7 @@ static void show_files(void)
 			struct stat st;
 			if (excluded(ce->name) != show_ignored)
 				continue;
-			if (!stat(ce->name, &st))
+			if (!lstat(ce->name, &st))
 				continue;
 			printf("%s%c", ce->name, line_terminator);
 		}
--- a/tree.c
+++ b/tree.c
@@ -13,7 +13,10 @@ static int read_one_entry(unsigned char 
 
 	memset(ce, 0, size);
 
-	ce->ce_mode = create_ce_mode(mode);
+	if (mode & S_IFMT)
+		ce->ce_mode = htonl(mode);
+	else
+		ce->ce_mode = create_ce_mode(mode);
 	ce->ce_flags = create_ce_flags(baselen + len, stage);
 	memcpy(ce->name, base, baselen);
 	memcpy(ce->name + baselen, pathname, len+1);
--- a/update-cache.c
+++ b/update-cache.c
@@ -58,30 +58,37 @@ static int add_file_to_cache(char *path)
 	struct stat st;
 	int fd;
 
-	fd = open(path, O_RDONLY);
-	if (fd < 0) {
+	if (lstat(path, &st) < 0) {
 		if (errno == ENOENT || errno == ENOTDIR) {
 			if (allow_remove)
 				return remove_file_from_cache(path);
 		}
 		return -1;
 	}
-	if (fstat(fd, &st) < 0) {
-		close(fd);
-		return -1;
-	}
 	namelen = strlen(path);
 	size = cache_entry_size(namelen);
 	ce = xmalloc(size);
 	memset(ce, 0, size);
 	memcpy(ce->name, path, namelen);
 	fill_stat_cache_info(ce, &st);
-	ce->ce_mode = create_ce_mode(st.st_mode);
+	if (S_ISREG(st.st_mode)) {
+		fd = open(path, O_RDONLY);
+		if (fd < 0)
+			return -1;
+		ce->ce_mode = create_ce_mode(st.st_mode);
+		if (index_fd(ce->sha1, fd, &st) < 0)
+			return -1;
+	} else if (S_ISLNK(st.st_mode)) {
+		unsigned int len;
+		char target[1024];
+		ce->ce_mode = htonl(S_IFLNK);
+		len = readlink(path, target, sizeof(target));
+		if (len == -1 || len+1 > sizeof(target))
+			return -1;
+		if (write_sha1_file(target, len, "blob", ce->sha1))
+			return -1;
+	}
 	ce->ce_flags = htons(namelen);
-
-	if (index_fd(ce->sha1, fd, &st) < 0)
-		return -1;
-
 	return add_cache_entry(ce, allow_add);
 }
 
@@ -137,7 +144,7 @@ static struct cache_entry *refresh_entry
 	struct cache_entry *updated;
 	int changed, size;
 
-	if (stat(ce->name, &st) < 0)
+	if (lstat(ce->name, &st) < 0)
 		return ERR_PTR(-errno);
 
 	changed = cache_match_stat(ce, &st);


^ permalink raw reply

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Chris Mason @ 2005-05-04 22:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: C. Scott Ananian, Nicolas Pitre, Alon Ziv, git
In-Reply-To: <Pine.LNX.4.58.0505041501220.2328@ppc970.osdl.org>

On Wednesday 04 May 2005 18:03, Linus Torvalds wrote:
> On Wed, 4 May 2005, Chris Mason wrote:
> > Yes, if Linus does take the patches, it's really important for people to
> > be able to easily continue without deltas/packing if they want.
>
> I'll happily take the patch and just not use the delta packing myself (at
> least until I trust it). But before I take the patch I want to make sure
> that people agree on it, and that it's been tested well enough that it
> won't cause people to corrupt their repositories.
>
> For example, I do _not_ want to be in the situation SVN is in, where if
> you corrupt your SVN database, you're totally screwed. There's a real
> advantage to not having fancy data structures or complicated consistency
> rules.

Fair enough ;)  I'm pretty flexible about most of the details, so I'm hoping 
it won't be hard to get a consensus.  The current code might be a little too 
simple in format (in the packed and delta headers), but so far it seems 
sufficient for git's needs.

The git-pack utility would probably be the best way to get other people to 
experiment and make suggestions, so I'll start there.

-chris

^ permalink raw reply

* Re: git and symlinks as tracked content
From: Daniel Barkalow @ 2005-05-04 23:03 UTC (permalink / raw)
  To: David A. Wheeler
  Cc: H. Peter Anvin, Junio C Hamano, Linus Torvalds, Andreas Gal,
	Kay Sievers, git
In-Reply-To: <4278EEC6.2090607@dwheeler.com>

On Wed, 4 May 2005, David A. Wheeler wrote:

> Once you're there, it wouldn't be hard to add logic to add options to
> (1) record the REAL permission bits, (2) record "." files, and
> (3) recover the permission bits.  That would be enough to
> store & recover in a distributed way a single person's home directory.
> THAT might be darn useful, for those of us who float between
> different systems & would like to use a single system for multiple purposes.
> That's clearly beyond the scope of a typical SCM, but since
> it's easy to get there, that'd make sense.

The status quo with respect to the permissions is actually the correct
thing for an SCM, because you want to generate the corresponding tree for
a different user (e.g., with the other user's umask applied, etc.), not
the same tree.

This is a situation in which doing 90% of one thing, and then supporting
90% of something else separately is best. What you really want is to have
a "directory" object type that stores the exact permissions, and the
uid/gid, and even xattr stuff. Then you use those for distributing your
home directory, but not for distributing source trees, where that stuff is
useless and somewhat wrong. You could probably have the same kind of
commit objects, although you still need some way of figuring out what kind
of object is desired for the directories in a commit.

(on the other hand, it might make sense for git to handle files starting
with '.', and only skip .git).

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply

* Re: git and symlinks as tracked content
From: Junio C Hamano @ 2005-05-04 23:16 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Linus Torvalds, git
In-Reply-To: <20050504223532.GA22967@vrfy.org>

It seems to follow the original suggestion by Linus and looks
good.  Some comments:

 * It continues to assume that S_IFREG, S_IFDIR and S_IFLNK have
   the same bit pattern everywhere.  In the same spirit as we
   store mode bits in network byte order, it may be a good time
   to introduce something like this:

   -#define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
   -#define create_ce_mode(mode) htonl(S_IFREG | ce_permissions(mode))
   +#define CE_IFREG  0100000
   +#define CE_IFDIR  0040000
   +#define CE_IFLNK  0120000
   +#define CE_IFMASK 0770000
   +
   +#define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644) /* REG only */ 
   +#define create_ce_mode(mode) htonl(S_ISREG(mode) ?
   +				   (CE_IFREG | ce_permissions(mode)) :
   +				   S_ISLNK(mode) ?
   +				   CE_IFLNK :
   +				   0) /* what would we do for unknowns? */

 * read-cache.c:cache_match_stat() needs to know about the
   object type.  It was allowed to assume that anything thrown
   at it was a file, but not anymore.  How about something like
   this:

     int cache_match_stat(struct cache_entry 
     {
            unsigned int changed = 0;

    +       switch (ntohl(ce->ce_mode) & CE_IFMASK) {
    +       case CE_IFREG:
    +               changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;
    +               break;
    +       case CE_IFLNK:
    +               changed |= !S_ISLNK(st->st_mode) ? TYPE_CHANGED : 0;
    +               break;
    +       default:
    +               die("internal error: ce_mode is %o", ntohl(ce->ce_mode));
    +       }

    (in cache.h) 
     #define INODE_CHANGED   0x0010
     #define DATA_CHANGED    0x0020
    +#define TYPE_CHANGED    0x0040

  * update-cache.c:refresh_entry() needs to know that if the
    type of the path changed, it would never match:

            /*
    -        * If the mode has changed, there's no point in trying
    +        * If the mode or type has changed, there's no point in trying
             * to refresh the entry - it's not going to match
             */
    -       if (changed & MODE_CHANGED)
    +       if (changed & (MODE_CHANGED | TYPE_CHANGED))
                    return ERR_PTR(-EINVAL);

            if (compare_data(ce, st.st_size))

  * (this is just a minor nit).  Since you have st here,
    st.st_size can be used to see how big a buffer you need to
    prepare for readlink() here:

    +               unsigned int len;
    +               char target[1024];
    +               ce->ce_mode = htonl(S_IFLNK);
    +               len = readlink(path, target, sizeof(target));
    +               if (len == -1 || len+1 > sizeof(target))
    +                       return -1;

  * Probably diff.c needs to be made aware of this change.


^ permalink raw reply

* [PATCH] Fix memory leaks in read_tree_recursive()
From: Jonas Fonseca @ 2005-05-04 23:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch fixes memory leaks in the error path of
read_tree_recursive().

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>

--- 571c0f4599fbeefd995bbc24480add1575c36c94/tree.c  (mode:100644 sha1:4a26603f6e32866c0db8a01ac1c228be801f76c6)
+++ uncommitted/tree.c  (mode:100644)
@@ -39,14 +39,18 @@
 		if (S_ISDIR(mode)) {
 			int retval;
 			int pathlen = strlen(path);
-			char *newbase = xmalloc(baselen + 1 + pathlen);
+			char *newbase;
 			void *eltbuf;
 			char elttype[20];
 			unsigned long eltsize;
 
 			eltbuf = read_sha1_file(sha1, elttype, &eltsize);
-			if (!eltbuf || strcmp(elttype, "tree"))
+			if (!eltbuf || strcmp(elttype, "tree")) {
+				if (eltbuf) mem_free(eltbuf);
 				return -1;
+			}
+
+			newbase = xmalloc(baselen + 1 + pathlen);
 			memcpy(newbase, base, baselen);
 			memcpy(newbase + baselen, path, pathlen);
 			newbase[baselen + pathlen] = '/';

-- 
Jonas Fonseca

^ permalink raw reply

* [PATCH] Fix memory leaks in read_tree_recursive()
From: Jonas Fonseca @ 2005-05-04 23:39 UTC (permalink / raw)
  To: Linus Torvalds, git
In-Reply-To: <20050504231959.GA25475@diku.dk>

This patch fixes memory leaks in read_tree_recursive().

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>

---

Sorry, typo in the previous patch. I will try to be more careful.
This patch has already been applied to cogito.

--- 571c0f4599fbeefd995bbc24480add1575c36c94/tree.c  (mode:100644 sha1:4a26603f6e32866c0db8a01ac1c228be801f76c6)
+++ uncommitted/tree.c  (mode:100644)
@@ -39,14 +39,18 @@
 		if (S_ISDIR(mode)) {
 			int retval;
 			int pathlen = strlen(path);
-			char *newbase = xmalloc(baselen + 1 + pathlen);
+			char *newbase;
 			void *eltbuf;
 			char elttype[20];
 			unsigned long eltsize;
 
 			eltbuf = read_sha1_file(sha1, elttype, &eltsize);
-			if (!eltbuf || strcmp(elttype, "tree"))
+			if (!eltbuf || strcmp(elttype, "tree")) {
+				if (eltbuf) free(eltbuf);
 				return -1;
+			}
+
+			newbase = xmalloc(baselen + 1 + pathlen);
 			memcpy(newbase, base, baselen);
 			memcpy(newbase + baselen, path, pathlen);
 			newbase[baselen + pathlen] = '/';

-- 
Jonas Fonseca

^ permalink raw reply

* Re: [PATCH] Fix memory leaks in read_tree_recursive()
From: Junio C Hamano @ 2005-05-04 23:43 UTC (permalink / raw)
  To: Jonas Fonseca; +Cc: git
In-Reply-To: <20050504231959.GA25475@diku.dk>

>>>>> "JF" == Jonas Fonseca <fonseca@diku.dk> writes:

JF> This patch fixes memory leaks in the error path of
JF> read_tree_recursive().

The leak seems to be real but what is "mem_free"?  Has it been
compile tested?

JF> @@ -39,14 +39,18 @@
JF>  		if (S_ISDIR(mode)) {
JF>  			int retval;
 
JF> ...
JF> -			if (!eltbuf || strcmp(elttype, "tree"))
JF> +			if (!eltbuf || strcmp(elttype, "tree")) {
JF> +				if (eltbuf) mem_free(eltbuf);
JF>  				return -1;

Btw, who is putting this header in your mail?  It does not make
sense to me unless Jonas is pseudonym for Linus...

  Mail-Followup-To: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org 



^ permalink raw reply

* [PATCH] Optimize diff-cache -p --cached
From: Junio C Hamano @ 2005-05-04 23:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git


  ** Linus, I am sending this one in the mail partly because my
  ** next patch to prepare the diff side for the symlink work by
  ** Kay Siever is on top of this one.  If you pull from
  ** http://members.cox.net/junkio/git-jc.git/ it already
  ** contains this patch.

This patch optimizes "diff-cache -p --cached" by avoiding to
inflate blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree.  The file
in the work tree is passed as the comparison source in such a
case instead.

This optimization kicks in only when we have already read the
cache this optimization and this is deliberate.  Especially,
diff-tree does not use this code, because changes are contained
in small number of files relative to the project size most of
the time, and reading cache is so expensive for a large project
that the cost of reading it outweighs the savings by not
inflating blobs.

Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

--- a/diff-tree-helper.c
+++ b/diff-tree-helper.c
@@ -35,7 +35,7 @@ static int parse_oneside_change(const ch
 	if (strncmp(cp, "\tblob\t", 6))
 		return -1;
 	cp += 6;
-	if (get_sha1_hex(cp, one->u.sha1))
+	if (get_sha1_hex(cp, one->blob_sha1))
 		return -1;
 	cp += 40;
 	if (*cp++ != '\t')
@@ -83,13 +83,13 @@ static int parse_diff_tree_output(const 
 		if (strncmp(cp, "\tblob\t", 6))
 			return -1;
 		cp += 6;
-		if (get_sha1_hex(cp, old.u.sha1))
+		if (get_sha1_hex(cp, old.blob_sha1))
 			return -1;
 		cp += 40;
 		if (strncmp(cp, "->", 2))
 			return -1;
 		cp += 2;
-		if (get_sha1_hex(cp, new.u.sha1))
+		if (get_sha1_hex(cp, new.blob_sha1))
 			return -1;
 		cp += 40;
 		if (*cp++ != '\t')

--- a/diff.c
+++ b/diff.c
@@ -132,11 +132,50 @@ static void builtin_diff(const char *nam
 	execlp("/bin/sh","sh", "-c", cmd, NULL);
 }
 
+/*
+ * Given a name and sha1 pair, if the dircache tells us the file in
+ * the work tree has that object contents, return true, so that
+ * prepare_temp_file() does not have to inflate and extract.
+ */
+static int work_tree_matches(const char *name, const unsigned char *sha1)
+{
+	struct cache_entry *ce;
+	struct stat st;
+	int pos, len;
+	
+	/* We do not read the cache ourselves here, because the
+	 * benchmark with my previous version that always reads cache
+	 * shows that it makes things worse for diff-tree comparing
+	 * two linux-2.6 kernel trees in an already checked out work
+	 * tree.  This is because most diff-tree comparison deals with
+	 * only a small number of files, while reading the cache is
+	 * expensive for a large project, and its cost outweighs the
+	 * savings we get by not inflating the object to a temporary
+	 * file.  Practically, this code only helps when we are used
+	 * by diff-cache --cached, which does read the cache before
+	 * calling us.
+	 */ 
+	if (!active_cache)
+		return 0;
+
+	len = strlen(name);
+	pos = cache_name_pos(name, len);
+	if (pos < 0)
+		return 0;
+	ce = active_cache[pos];
+	if ((stat(name, &st) < 0) ||
+	    cache_match_stat(ce, &st) ||
+	    memcmp(sha1, ce->sha1, 20))
+		return 0;
+	return 1;
+}
+
 static void prepare_temp_file(const char *name,
 			      struct diff_tempfile *temp,
 			      struct diff_spec *one)
 {
 	static unsigned char null_sha1[20] = { 0, };
+	int use_work_tree = 0;
 
 	if (!one->file_valid) {
 	not_a_valid_file:
@@ -150,20 +189,22 @@ static void prepare_temp_file(const char
 	}
 
 	if (one->sha1_valid &&
-	    !memcmp(one->u.sha1, null_sha1, sizeof(null_sha1))) {
-		one->sha1_valid = 0;
-		one->u.name = name;
-	}
+	    (!memcmp(one->blob_sha1, null_sha1, sizeof(null_sha1)) ||
+	     work_tree_matches(name, one->blob_sha1)))
+		use_work_tree = 1;
 
-	if (!one->sha1_valid) {
+	if (!one->sha1_valid || use_work_tree) {
 		struct stat st;
-		temp->name = one->u.name;
+		temp->name = name;
 		if (stat(temp->name, &st) < 0) {
 			if (errno == ENOENT)
 				goto not_a_valid_file;
 			die("stat(%s): %s", temp->name, strerror(errno));
 		}
-		strcpy(temp->hex, sha1_to_hex(null_sha1));
+		if (!one->sha1_valid)
+			strcpy(temp->hex, sha1_to_hex(null_sha1));
+		else
+			strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
 		sprintf(temp->mode, "%06o",
 			S_IFREG |ce_permissions(st.st_mode));
 	}
@@ -173,10 +214,10 @@ static void prepare_temp_file(const char
 		char type[20];
 		unsigned long size;
 
-		blob = read_sha1_file(one->u.sha1, type, &size);
+		blob = read_sha1_file(one->blob_sha1, type, &size);
 		if (!blob || strcmp(type, "blob"))
 			die("unable to read blob object for %s (%s)",
-			    name, sha1_to_hex(one->u.sha1));
+			    name, sha1_to_hex(one->blob_sha1));
 
 		strcpy(temp->tmp_path, ".diff_XXXXXX");
 		fd = mkstemp(temp->tmp_path);
@@ -187,7 +228,7 @@ static void prepare_temp_file(const char
 		close(fd);
 		free(blob);
 		temp->name = temp->tmp_path;
-		strcpy(temp->hex, sha1_to_hex(one->u.sha1));
+		strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
 		temp->hex[40] = 0;
 		sprintf(temp->mode, "%06o", one->mode);
 	}
@@ -286,7 +327,7 @@ void diff_addremove(int addremove, unsig
 	char concatpath[PATH_MAX];
 	struct diff_spec spec[2], *one, *two;
 
-	memcpy(spec[0].u.sha1, sha1, 20);
+	memcpy(spec[0].blob_sha1, sha1, 20);
 	spec[0].mode = mode;
 	spec[0].sha1_valid = spec[0].file_valid = 1;
 	spec[1].file_valid = 0;
@@ -311,9 +352,9 @@ void diff_change(unsigned old_mode, unsi
 	char concatpath[PATH_MAX];
 	struct diff_spec spec[2];
 
-	memcpy(spec[0].u.sha1, old_sha1, 20);
+	memcpy(spec[0].blob_sha1, old_sha1, 20);
 	spec[0].mode = old_mode;
-	memcpy(spec[1].u.sha1, new_sha1, 20);
+	memcpy(spec[1].blob_sha1, new_sha1, 20);
 	spec[1].mode = new_mode;
 	spec[0].sha1_valid = spec[0].file_valid = 1;
 	spec[1].sha1_valid = spec[1].file_valid = 1;

--- a/diff.h
+++ b/diff.h
@@ -20,15 +20,12 @@ extern void diff_unmerge(const char *pat
 /* These are for diff-tree-helper */
 
 struct diff_spec {
-	union {
-		const char *name;       /* path on the filesystem */
-		unsigned char sha1[20]; /* blob object ID */
-	} u;
+	unsigned char blob_sha1[20];
 	unsigned short mode;	 /* file mode */
-	unsigned sha1_valid : 1; /* if true, use u.sha1 and trust mode.
-				  * (however with a NULL SHA1, read them
-				  * from the file!).
-				  * if false, use u.name and read mode from
+	unsigned sha1_valid : 1; /* if true, use blob_sha1 and trust mode;
+				  * however with a NULL SHA1, read them
+				  * from the file system.
+				  * if false, use the name and read mode from
 				  * the filesystem.
 				  */
 	unsigned file_valid : 1; /* if false the file does not even exist */




^ permalink raw reply

* [PATCH] Prepare diff side for upcoming symlink work.
From: Junio C Hamano @ 2005-05-05  0:00 UTC (permalink / raw)
  To: Linus Torvalds, Kay Sievers; +Cc: git

This patch prepares the external diff interface engine for the
change to store the symbolic links in the cache, being worked on
by Kay Sievers.

The main thing it does is when comparing with the work tree, it
prepares the counterpart to the blob being compared by doing a
readlink followed by sending that result to a temporary file to
be diffed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

--- a/diff.c
+++ b/diff.c
@@ -163,13 +163,35 @@ static int work_tree_matches(const char 
 	if (pos < 0)
 		return 0;
 	ce = active_cache[pos];
-	if ((stat(name, &st) < 0) ||
+	if ((lstat(name, &st) < 0) ||
+	    !S_ISREG(st.st_mode) ||
 	    cache_match_stat(ce, &st) ||
 	    memcmp(sha1, ce->sha1, 20))
 		return 0;
 	return 1;
 }
 
+static void prep_temp_blob(struct diff_tempfile *temp,
+			   void *blob,
+			   unsigned long size,
+			   unsigned char *sha1,
+			   int mode)
+{
+	int fd;
+
+	strcpy(temp->tmp_path, ".diff_XXXXXX");
+	fd = mkstemp(temp->tmp_path);
+	if (fd < 0)
+		die("unable to create temp-file");
+	if (write(fd, blob, size) != size)
+		die("unable to write temp-file");
+	close(fd);
+	temp->name = temp->tmp_path;
+	strcpy(temp->hex, sha1_to_hex(sha1));
+	temp->hex[40] = 0;
+	sprintf(temp->mode, "%06o", mode);
+}
+
 static void prepare_temp_file(const char *name,
 			      struct diff_tempfile *temp,
 			      struct diff_spec *one)
@@ -196,20 +218,35 @@ static void prepare_temp_file(const char
 	if (!one->sha1_valid || use_work_tree) {
 		struct stat st;
 		temp->name = name;
-		if (stat(temp->name, &st) < 0) {
+		if (lstat(temp->name, &st) < 0) {
 			if (errno == ENOENT)
 				goto not_a_valid_file;
 			die("stat(%s): %s", temp->name, strerror(errno));
 		}
+		if (S_ISLNK(st.st_mode)) {
+			int ret;
+			char *buf, buf_[1024];
+			buf = ((sizeof(buf_) < st.st_size) ?
+			       xmalloc(st.st_size) : buf_);
+			ret = readlink(name, buf, st.st_size);
+			if (ret < 0)
+				die("readlink(%s)", name);
+			prep_temp_blob(temp, buf, st.st_size,
+				       (one->sha1_valid ?
+					one->blob_sha1 : null_sha1),
+				       (one->sha1_valid ?
+					one->mode : 
+					S_IFREG|ce_permissions(st.st_mode)));
+		}
 		if (!one->sha1_valid)
 			strcpy(temp->hex, sha1_to_hex(null_sha1));
 		else
 			strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
 		sprintf(temp->mode, "%06o",
-			S_IFREG |ce_permissions(st.st_mode));
+			S_IFREG | ce_permissions(st.st_mode));
+		return;
 	}
 	else {
-		int fd;
 		void *blob;
 		char type[20];
 		unsigned long size;
@@ -218,19 +255,8 @@ static void prepare_temp_file(const char
 		if (!blob || strcmp(type, "blob"))
 			die("unable to read blob object for %s (%s)",
 			    name, sha1_to_hex(one->blob_sha1));
-
-		strcpy(temp->tmp_path, ".diff_XXXXXX");
-		fd = mkstemp(temp->tmp_path);
-		if (fd < 0)
-			die("unable to create temp-file");
-		if (write(fd, blob, size) != size)
-			die("unable to write temp-file");
-		close(fd);
+		prep_temp_blob(temp, blob, size, one->blob_sha1, one->mode);
 		free(blob);
-		temp->name = temp->tmp_path;
-		strcpy(temp->hex, sha1_to_hex(one->blob_sha1));
-		temp->hex[40] = 0;
-		sprintf(temp->mode, "%06o", one->mode);
 	}
 }
 


^ permalink raw reply

* Kernel nightly snapshots..
From: Linus Torvalds @ 2005-05-05  0:28 UTC (permalink / raw)
  To: Git Mailing List, David Woodhouse; +Cc: Peter Anvin


I forget who it is that used to do the nightly snapshots for the BK
kernels, but I _think_ it was David Woodhouse (every time I've blamed it
on somebody, I've blamed the wrong person, so I'm probably off on this one
too, but maybe I finally got it right).

I was wondering how to get that re-started.. It should be technically
pretty easy, except I realized that my tree doesn't even have plain 2.6.11
in it. But I just fixed that in the tree, since I need such a baseline 
myself for my next release..

Anyway, I just pushed out a kernel tree that contains a tag of a _tree_ 
that points to the tree at the point of 2.6.11. I also had to teach 
fsck-cache about the fact that you can give it any kind of object to start 
your references at, and to make fsck-cache happy, you need to

	git-fsck-cache --unreachable HEAD v2.6.11

to tell it that the kernel tree now has an unconnected tree (described by
the tag "v2.6.11-tree", and I made the appropriate entry for it in
.git/refs/tags).

I also updated git-prune-script to not remove these kinds of things.

With this, it should be trivial to create snapshots with

	git-diff-tree -p v2.6.11 HEAD

or similar.

		Linus

^ permalink raw reply

* Re: [PATCH] Fix memory leaks in read_tree_recursive()
From: Jonas Fonseca @ 2005-05-05  0:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsw6muay.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote Wed, May 04, 2005:
> >>>>> "JF" == Jonas Fonseca <fonseca@diku.dk> writes:
> 
> JF> This patch fixes memory leaks in the error path of
> JF> read_tree_recursive().
> 
> The leak seems to be real but what is "mem_free"?

That mem_free() was a bad habbit from the project I usually work on.

> Has it been compile tested?

No, but the second patch has.

BTW, when compiling git on a FreeBSD box I get these warnings:

date.c: In function `parse_date':
date.c:414: warning: long unsigned int format, time_t arg (arg 4)
date.c:414: warning: long unsigned int format, time_t arg (arg 4)
date.c: In function `datestamp':
date.c:427: warning: long unsigned int format, time_t arg (arg 4)
date.c:427: warning: long unsigned int format, time_t arg (arg 4)
tar-tree.c: In function `write_header':
tar-tree.c:249: warning: long unsigned int format, time_t arg (arg 3)
local-pull.c: In function `fetch':
local-pull.c:73: warning: long int format, different type arg (arg 4)

because time_t is defined as int32_t.  Don't know if they are worth
fixing at this point.

> JF> @@ -39,14 +39,18 @@
> JF>  		if (S_ISDIR(mode)) {
> JF>  			int retval;
>  
> JF> ...
> JF> -			if (!eltbuf || strcmp(elttype, "tree"))
> JF> +			if (!eltbuf || strcmp(elttype, "tree")) {
> JF> +				if (eltbuf) mem_free(eltbuf);
> JF>  				return -1;
> 
> Btw, who is putting this header in your mail?  It does not make
> sense to me unless Jonas is pseudonym for Linus...
> 
>   Mail-Followup-To: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org 

Maybe because Mutt knows I am subscribe to this mailing list and assumes
I don't want to have mail addressed directly to my email address?

I just hit 'g'.

-- 
Jonas Fonseca

^ permalink raw reply

* Re: git and symlinks as tracked content
From: Kay Sievers @ 2005-05-05  1:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7vwtqemvjt.fsf@assigned-by-dhcp.cox.net>

Allow to store and track symlink in the repository. A symlink is stored
the same way as a regular file, but with the appropriate mode bits set.
The symlink target is stored in its own blob object. This will hopefully
make our udev repository fully functional. :)

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
---

On Wed, May 04, 2005 at 04:16:22PM -0700, Junio C Hamano wrote:
> It seems to follow the original suggestion by Linus and looks
> good.  Some comments:
>
>  * It continues to assume that S_IFREG, S_IFDIR and S_IFLNK have
>    the same bit pattern everywhere.  In the same spirit as we
>    store mode bits in network byte order, it may be a good time
>    to introduce something like this:
...

>  * read-cache.c:cache_match_stat() needs to know about the
>    object type.  It was allowed to assume that anything thrown
>    at it was a file, but not anymore.  How about something like
>    this:

Both included and updated.

>   * (this is just a minor nit).  Since you have st here,
>     st.st_size can be used to see how big a buffer you need to
>     prepare for readlink() here:

Sounds nice, but is this reliable? I just remember some exotic filesystems
to reported bogus values here.

>   * Probably diff.c needs to be made aware of this change.

You already did this. :) Very nice.

Thanks,
Kay

--- a/cache.h
+++ b/cache.h
@@ -86,8 +86,19 @@ struct cache_entry {
 #define ce_size(ce) cache_entry_size(ce_namelen(ce))
 #define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT)
 
+#define CE_IFREG  0100000
+#define CE_IFDIR  0040000
+#define CE_IFLNK  0120000
+#define CE_IFMASK 0770000
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
-#define create_ce_mode(mode) htonl(S_IFREG | ce_permissions(mode))
+static inline unsigned int create_ce_mode(unsigned int mode)
+{
+	if (S_ISREG(mode))
+		return htonl(S_IFREG | ce_permissions(mode));
+	if (S_ISLNK(mode))
+		return htonl(CE_IFLNK);
+	return 0;
+}
 
 #define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)
 
@@ -124,6 +135,7 @@ extern int index_fd(unsigned char *sha1,
 #define MODE_CHANGED    0x0008
 #define INODE_CHANGED   0x0010
 #define DATA_CHANGED    0x0020
+#define TYPE_CHANGED    0x0040
 
 /* Return a statically allocated filename matching the sha1 signature */
 extern char *sha1_file_name(const unsigned char *sha1);
--- a/check-files.c
+++ b/check-files.c
@@ -28,8 +28,8 @@ static void check_file(const char *path)
 		die("preparing to update existing file '%s' not in cache", path);
 	ce = active_cache[pos];
 
-	if (fstat(fd, &st) < 0)
-		die("fstat(%s): %s", path, strerror(errno));
+	if (lstat(path, &st) < 0)
+		die("lstat(%s): %s", path, strerror(errno));
 
 	changed = cache_match_stat(ce, &st);
 	if (changed)
--- a/checkout-cache.c
+++ b/checkout-cache.c
@@ -72,23 +72,37 @@ static int write_entry(struct cache_entr
 	unsigned long size;
 	long wrote;
 	char type[20];
+	unsigned int mode;
 
 	new = read_sha1_file(ce->sha1, type, &size);
 	if (!new || strcmp(type, "blob")) {
 		return error("checkout-cache: unable to read sha1 file of %s (%s)",
 			path, sha1_to_hex(ce->sha1));
 	}
-	fd = create_file(path, ntohl(ce->ce_mode));
-	if (fd < 0) {
+	mode = ntohl(ce->ce_mode);
+	if (S_ISLNK(mode)) {
+		char target[1024];
+		memcpy(target, new, size);
+		target[size] = '\0';
+		if (symlink(target, path)) {
+			free(new);
+			return error("checkout-cache: unable to create link %s (%s)",
+				path, strerror(errno));
+		}
+		free(new);
+	} else {
+		fd = create_file(path, mode);
+		if (fd < 0) {
+			free(new);
+			return error("checkout-cache: unable to create file %s (%s)",
+				path, strerror(errno));
+		}
+		wrote = write(fd, new, size);
+		close(fd);
 		free(new);
-		return error("checkout-cache: unable to create %s (%s)",
-			path, strerror(errno));
+		if (wrote != size)
+			return error("checkout-cache: unable to write %s", path);
 	}
-	wrote = write(fd, new, size);
-	close(fd);
-	free(new);
-	if (wrote != size)
-		return error("checkout-cache: unable to write %s", path);
 	return 0;
 }
 
@@ -101,7 +115,7 @@ static int checkout_entry(struct cache_e
 	memcpy(path, base_dir, len);
 	strcpy(path + len, ce->name);
 
-	if (!stat(path, &st)) {
+	if (!lstat(path, &st)) {
 		unsigned changed = cache_match_stat(ce, &st);
 		if (!changed)
 			return 0;
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -24,7 +24,7 @@ static int get_stat_data(struct cache_en
 		static unsigned char no_sha1[20];
 		int changed;
 		struct stat st;
-		if (stat(ce->name, &st) < 0)
+		if (lstat(ce->name, &st) < 0)
 			return -1;
 		changed = cache_match_stat(ce, &st);
 		if (changed) {
--- a/diff.c
+++ b/diff.c
@@ -165,7 +165,7 @@ static void prepare_temp_file(const char
 		}
 		strcpy(temp->hex, sha1_to_hex(null_sha1));
 		sprintf(temp->mode, "%06o",
-			S_IFREG |ce_permissions(st.st_mode));
+			S_IFREG | ce_permissions(st.st_mode));
 	}
 	else {
 		int fd;
--- a/ls-files.c
+++ b/ls-files.c
@@ -199,7 +199,7 @@ static void show_files(void)
 			struct stat st;
 			if (excluded(ce->name) != show_ignored)
 				continue;
-			if (!stat(ce->name, &st))
+			if (!lstat(ce->name, &st))
 				continue;
 			printf("%s%c", ce->name, line_terminator);
 		}
--- a/read-cache.c
+++ b/read-cache.c
@@ -13,6 +13,16 @@ int cache_match_stat(struct cache_entry 
 {
 	unsigned int changed = 0;
 
+	switch (ntohl(ce->ce_mode) & CE_IFMASK) {
+	case CE_IFREG:
+		changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;
+		break;
+	case CE_IFLNK:
+		changed |= !S_ISLNK(st->st_mode) ? TYPE_CHANGED : 0;
+		break;
+	default:
+		die("internal error: ce_mode is %o", ntohl(ce->ce_mode));
+	}
 	if (ce->ce_mtime.sec != htonl(st->st_mtime))
 		changed |= MTIME_CHANGED;
 	if (ce->ce_ctime.sec != htonl(st->st_ctime))
--- a/tree.c
+++ b/tree.c
@@ -13,7 +13,10 @@ static int read_one_entry(unsigned char 
 
 	memset(ce, 0, size);
 
-	ce->ce_mode = create_ce_mode(mode);
+	if (mode & S_IFMT)
+		ce->ce_mode = htonl(mode);
+	else
+		ce->ce_mode = create_ce_mode(mode);
 	ce->ce_flags = create_ce_flags(baselen + len, stage);
 	memcpy(ce->name, base, baselen);
 	memcpy(ce->name + baselen, pathname, len+1);
--- a/update-cache.c
+++ b/update-cache.c
@@ -58,30 +58,37 @@ static int add_file_to_cache(char *path)
 	struct stat st;
 	int fd;
 
-	fd = open(path, O_RDONLY);
-	if (fd < 0) {
+	if (lstat(path, &st) < 0) {
 		if (errno == ENOENT || errno == ENOTDIR) {
 			if (allow_remove)
 				return remove_file_from_cache(path);
 		}
 		return -1;
 	}
-	if (fstat(fd, &st) < 0) {
-		close(fd);
-		return -1;
-	}
 	namelen = strlen(path);
 	size = cache_entry_size(namelen);
 	ce = xmalloc(size);
 	memset(ce, 0, size);
 	memcpy(ce->name, path, namelen);
 	fill_stat_cache_info(ce, &st);
-	ce->ce_mode = create_ce_mode(st.st_mode);
+	if (S_ISREG(st.st_mode)) {
+		fd = open(path, O_RDONLY);
+		if (fd < 0)
+			return -1;
+		ce->ce_mode = create_ce_mode(st.st_mode);
+		if (index_fd(ce->sha1, fd, &st) < 0)
+			return -1;
+	} else if (S_ISLNK(st.st_mode)) {
+		unsigned int len;
+		char target[1024];
+		ce->ce_mode = htonl(S_IFLNK);
+		len = readlink(path, target, sizeof(target));
+		if (len == -1 || len+1 > sizeof(target))
+			return -1;
+		if (write_sha1_file(target, len, "blob", ce->sha1))
+			return -1;
+	}
 	ce->ce_flags = htons(namelen);
-
-	if (index_fd(ce->sha1, fd, &st) < 0)
-		return -1;
-
 	return add_cache_entry(ce, allow_add);
 }
 
@@ -137,7 +144,7 @@ static struct cache_entry *refresh_entry
 	struct cache_entry *updated;
 	int changed, size;
 
-	if (stat(ce->name, &st) < 0)
+	if (lstat(ce->name, &st) < 0)
 		return ERR_PTR(-errno);
 
 	changed = cache_match_stat(ce, &st);
@@ -145,10 +152,10 @@ static struct cache_entry *refresh_entry
 		return ce;
 
 	/*
-	 * If the mode has changed, there's no point in trying
+	 * If the mode or type has changed, there's no point in trying
 	 * to refresh the entry - it's not going to match
 	 */
-	if (changed & MODE_CHANGED)
+	if (changed & (MODE_CHANGED | TYPE_CHANGED))
 		return ERR_PTR(-EINVAL);
 
 	if (compare_data(ce, st.st_size))


^ permalink raw reply

* [PATCH] shut up gcc4's signed pointer warnings
From: Kay Sievers @ 2005-05-05  1:30 UTC (permalink / raw)
  To: git

Suppress these stupid warnings if gcc supports it:

  ls-tree.c: In function ‘list_recursive’:
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
  ls-tree.c:34: warning: pointer targets in passing argument 1 of ‘__builtin_strcmp’ differ in signedness
  ls-tree.c:66: warning: pointer targets in passing argument 2 of ‘list_recursive’ differ in signedness

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
---

--- a/Makefile
+++ b/Makefile
@@ -7,7 +7,11 @@
 # BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
 # break unless your underlying filesystem supports those sub-second times
 # (my ext3 doesn't).
+
+cc-supports = ${shell if $(CC) ${1} -S -o /dev/null -xc /dev/null > /dev/null 2>&1; then echo "$(1)"; fi;}
+
 CFLAGS=-g -O2 -Wall
+CFLAGS+= $(call cc-supports,-Wno-pointer-sign)
 
 CC=gcc
 AR=ar


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox