performance problem: "git commit filename"

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* performance problem: "git commit filename"
@ 2008-01-12 22:46 Linus Torvalds
  2008-01-13  1:46 ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-01-12 22:46 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List; +Cc: Kristian Høgsberg

I thought we had fixed this long long ago, but if we did, it has 
re-surfaced.

Using an explicit filename with "git commit" is _extremely_ slow. Lookie 
here:

	[torvalds@woody linux]$ time git commit fs/exec.c
	no changes added to commit (use "git add" and/or "git commit -a")

	real    0m1.671s
	user    0m1.200s
	sys     0m0.328s

that's closer to two seconds on a fast machine, with the whole tree 
cached!

And for the uncached case, it's just unbearably slow: two and a half 
*minutes*.

In contrast, without the filename, it's much faster:

	[torvalds@woody linux]$ time git commit
	no changes added to commit (use "git add" and/or "git commit -a")

	real    0m0.387s
	user    0m0.220s
	sys     0m0.168s

with the cold-cache case now being "just" 18s (which is still long, but 
we're talking eight times faster, and certainly not unbearable!)

Doing an "strace -c" on the thing shows why. In the filename case, we 
have:

	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 32.69    0.000868           0     92299        37 lstat
	 17.40    0.000462           0     29958      3993 open
	 15.78    0.000419           0      5522           getdents
	 15.56    0.000413           0     23165           mmap
	 11.37    0.000302           0     23118           munmap
	  5.76    0.000153           0     25966         2 close
	  1.43    0.000038           0      2845           fstat
	...

and in the non-filename case we have

	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 53.67    0.000600           0     69227        31 lstat
	 23.35    0.000261           0      5522           getdents
	 11.09    0.000124           2        55           munmap
	  4.20    0.000047           0       285           write
	  3.31    0.000037           0      5537      2638 open
	  2.33    0.000026           0      2899         1 close
	  2.06    0.000023           0      2844           fstat
	...

notice how the expensive case has a lot of successful open/mmap/munmap 
calls: it is *literally* ignoring the valid entries in the old index 
entirely, and re-hashing every single file in the tree! No wonder it is 
slow!

Just counting "lstat()" calls, it's worth noticing that the non-filename 
case seems to do three lstat's for each index entry (and yes, that's two
too many), but the named file case has upped that to *four* lstats per 
entry, and then added the one open/mmap/munmap/close on top of that!

I'm pretty sure we didn't use to do things this badly. And if this is a 
regression like I think it is, it should be fixed before a real 1.5.4 
release.

I'll try to see if I can see what's up, but I thought I'd better let 
others know too, in case I don't have time. I *suspect* (but have nothing 
what-so-ever to back that up) that this happened as part of making commit 
a builtin.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-12 22:46 performance problem: "git commit filename" Linus Torvalds
@ 2008-01-13  1:46 ` Linus Torvalds
  2008-01-13  4:04   ` Linus Torvalds
  2008-01-14 23:15   ` Kristian Høgsberg
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-13  1:46 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List; +Cc: Kristian H?gsberg

On Sat, 12 Jan 2008, Linus Torvalds wrote:
> 
> I thought we had fixed this long long ago, but if we did, it has 
> re-surfaced.

It's new, and yes, it seems to be due to the new builtin-commit.c.

I think I know what is going on.

In the old git-commit.sh, this case used to be handled with

	TMP_INDEX="$GIT_DIR/tmp-index$$"

	GIT_INDEX_FILE="$THIS_INDEX" \
	git read-tree --index-output="$TMP_INDEX" -i -m HEAD

which is a one-way merge of the *old* index and HEAD, taking the index 
information from the old index, but the actual file information from HEAD 
(to then later be updated by the named files).

This logic is implemented by builtin-read-tree.c with

	struct unpack_trees_options opts;
	..
	opts.fn = oneway_merge;
	..
	unpack_trees(nr_trees, t, &opts);

where all the magic is done by that "oneway_merge()" function being called 
for each entry by unpack_trees(). This does everything right, and the 
result is that any index entry that was up-to-date in the old index and 
unchanged in the base tree will be up-to-date in the new index too

HOWEVER. When that logic was converted from that shell-script into a 
builtin-commit.c, that conversion was not done correctly. The old "git 
read-tree -i -m" was not translated as a "unpack_trees()" call, but as 
this in prepare_index():

	discard_cache()
	..
	tree = parse_tree_indirect(head_sha1);
	..
	read_tree(tree, 0, NULL)

which is very wrong, because it replaces the old index entirely, and 
doesn't do that stat information merging.

As a result, the index that is created by read-tree is totally bogus in 
the stat cache, and yes, everything will have to be re-computed.

Kristian?

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  1:46 ` Linus Torvalds
@ 2008-01-13  4:04   ` Linus Torvalds
  2008-01-13  5:38     ` Daniel Barkalow
                       ` (2 more replies)
  2008-01-14 23:15   ` Kristian Høgsberg
  1 sibling, 3 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-13  4:04 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List; +Cc: Kristian H?gsberg

On Sat, 12 Jan 2008, Linus Torvalds wrote:
> 
> HOWEVER. When that logic was converted from that shell-script into a 
> builtin-commit.c, that conversion was not done correctly. The old "git 
> read-tree -i -m" was not translated as a "unpack_trees()" call, but as 
> this in prepare_index():
> 
> 	discard_cache()
> 	..
> 	tree = parse_tree_indirect(head_sha1);
> 	..
> 	read_tree(tree, 0, NULL)
> 
> which is very wrong, because it replaces the old index entirely, and 
> doesn't do that stat information merging.

This patch may or may not fix it.

It makes builtin-commit.c use the same logic that "git read-tree -i -m" 
does (which is what the old shell script did), and it seems to pass the 
test-suite, and it looks pretty obvious.

It also brings down the number of open/mmap/munmap/close calls to where it 
should be, although it still does *way* too many "lstat()" operations (ie 
it does 4*lstat for each file in the index - one more than the 
non-filename one does).

With that fixed, performance is also roughly where it should be (ie the 
17-18s for the cold-cache case), because it no longer needs to rehash all 
the files!

HOWEVER. This was just a quick hack, and while it all looks sane, this is 
some damn core code. Somebody else should double- and triple-check this.

[ That 4x lstat thing bothers me. I think we should add a flag to the 
  index saying "we checked this once already, it's clean", so that if we 
  do multiple passes over the index, we can still do just a single lstat() 
  on just the first pass. But that's a separate issue.

  On Linux, a cached lstat() is almost free. Well, at least compared to 
  all the crap operating systems out there. And obviously, if you do 
  multiple lstat's per file, all but the first one *will* be cached.

  However, "almost free" still isn't zero, and with the kernel having 23k 
  files in it, doing almost a hundred thousand lstat's is still something 
  that only takes about half a second or so for me. We _really_ should do 
  only ~23k or so of them, and the cached cache should take on the order 
  of 0.15s, rather than half a second!

  So this is worth optimizing. With bigger repositories, it's going to be 
  more noticeable, and with other operating systems, all those lstat()'s 
  will cost much _much_ more. Of course, any IO overhead will be much 
  bigger, so this is mostly a cached-case issue, but cached-case is still 
  important.. ]

Anyway, consider this being conditionally signed-off-by: me, assuming 
a few other people spend a bit of time double-checking all my logic.

Please?

			Linus

---
 builtin-commit.c |   37 ++++++++++++++++++++++++++++---------
 1 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/builtin-commit.c b/builtin-commit.c
index 73f1e35..cc5134e 100644
--- a/builtin-commit.c
+++ b/builtin-commit.c
@@ -21,6 +21,7 @@
 #include "utf8.h"
 #include "parse-options.h"
 #include "path-list.h"
+#include "unpack-trees.h"

 static const char * const builtin_commit_usage[] = {
 	"git-commit [options] [--] <filepattern>...",
@@ -177,10 +178,34 @@ static void add_remove_files(struct path_list *list)
 	}
 }

+static void create_base_index(void)
+{
+	struct tree *tree;
+	struct unpack_trees_options opts;
+	struct tree_desc t;
+
+	if (initial_commit) {
+		discard_cache();
+		return;
+	}
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.index_only = 1;
+	opts.merge = 1;
+	
+	opts.fn = oneway_merge;
+	tree = parse_tree_indirect(head_sha1);
+	if (!tree)
+		die("failed to unpack HEAD tree object");
+	parse_tree(tree);
+	init_tree_desc(&t, tree->buffer, tree->size);
+	unpack_trees(1, &t, &opts);
+}
+
 static char *prepare_index(int argc, const char **argv, const char *prefix)
 {
 	int fd;
-	struct tree *tree;
 	struct path_list partial;
 	const char **pathspec = NULL;

@@ -278,14 +303,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix)

 	fd = hold_lock_file_for_update(&false_lock,
 				       git_path("next-index-%d", getpid()), 1);
-	discard_cache();
-	if (!initial_commit) {
-		tree = parse_tree_indirect(head_sha1);
-		if (!tree)
-			die("failed to unpack HEAD tree object");
-		if (read_tree(tree, 0, NULL))
-			die("failed to read HEAD tree object");
-	}
+
+	create_base_index();
 	add_remove_files(&partial);
 	refresh_cache(REFRESH_QUIET);

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  4:04   ` Linus Torvalds
@ 2008-01-13  5:38     ` Daniel Barkalow
  2008-01-13  8:14       ` Junio C Hamano
  2008-01-13 16:57       ` Linus Torvalds
  2008-01-13  8:12     ` Junio C Hamano
  2008-01-14 23:46     ` performance problem: "git commit filename" Kristian Høgsberg
  2 siblings, 2 replies; 33+ messages in thread
From: Daniel Barkalow @ 2008-01-13  5:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Kristian H?gsberg

On Sat, 12 Jan 2008, Linus Torvalds wrote:

> It makes builtin-commit.c use the same logic that "git read-tree -i -m" 
> does (which is what the old shell script did), and it seems to pass the 
> test-suite, and it looks pretty obvious.

The only issue I know about with using unpack_trees in C as a replacement 
for read-tree in shell is that unpack_trees leaves "deletion" index 
entries in memory which are not written to disk, but may surprise some 
code (these are used to allow -u to remove the files from the working 
tree). So you may want to make sure that you don't get any weird results 
out of a commit of particular files that involves not committing some 
newly-added files:

$ git add new-file
$ (edit old-file)
$ git commit old-file

This may cause the unpack_trees to leave a misleading entry for new-file 
that the code doesn't expect. I've got a patch to make it saner as part of 
my builtin-checkout series, but I can't say for sure that that change 
won't either confuse something else or have performance problems without a 
bunch of analysis I haven't done recently.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  4:04   ` Linus Torvalds
  2008-01-13  5:38     ` Daniel Barkalow
@ 2008-01-13  8:12     ` Junio C Hamano
  2008-01-13 10:33       ` Junio C Hamano
  2008-01-13 10:38       ` [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste Junio C Hamano
  2008-01-14 23:46     ` performance problem: "git commit filename" Kristian Høgsberg
  2 siblings, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13  8:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> HOWEVER. This was just a quick hack, and while it all looks sane, this is 
> some damn core code. Somebody else should double- and triple-check this.

Double-checked.  The patch looks sane.

> [ That 4x lstat thing bothers me. I think we should add a flag to the 
>   index saying "we checked this once already, it's clean", so that if we 
>   do multiple passes over the index, we can still do just a single lstat() 
>   on just the first pass. But that's a separate issue.

I've thought about this in a different context before, but it
seemed quite tricky, as some codepaths in more complex commands
(commit being one of them) tend to use the cache and discard to
use it for different purpose (like creating temporary index and
then reading the real index).  Besides, I had an impression that
we ran out of the bits of ce_flags in the cache entry, although
we could shorten the maximum path we support from 4k down to 2k
bytes.  I'll have to think about this a bit more.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  5:38     ` Daniel Barkalow
@ 2008-01-13  8:14       ` Junio C Hamano
  2008-01-13 16:57       ` Linus Torvalds
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13  8:14 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Linus Torvalds, Git Mailing List, Kristian H?gsberg

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Sat, 12 Jan 2008, Linus Torvalds wrote:
>
>> It makes builtin-commit.c use the same logic that "git read-tree -i -m" 
>> does (which is what the old shell script did), and it seems to pass the 
>> test-suite, and it looks pretty obvious.
>
> The only issue I know about with using unpack_trees in C as a replacement 
> for read-tree in shell is that unpack_trees leaves "deletion" index 
> entries in memory which are not written to disk,...

I do not think you have to worry about that one.  That "to be
deleted" was a Linus invention and he surely remembers it.

write_index() function of course knows about skipping them (they
are marked as !ce->ce_mode).  I think the patch is safe.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  8:12     ` Junio C Hamano
@ 2008-01-13 10:33       ` Junio C Hamano
  2008-01-13 10:54         ` [PATCH] builtin-commit.c: do not lstat(2) partially committed paths twice Junio C Hamano
                           ` (2 more replies)
  2008-01-13 10:38       ` [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste Junio C Hamano
  1 sibling, 3 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 10:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Junio C Hamano <gitster@pobox.com> writes:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> HOWEVER. This was just a quick hack, and while it all looks sane, this is 
>> some damn core code. Somebody else should double- and triple-check this.
>
> Double-checked.  The patch looks sane.
>
>> [ That 4x lstat thing bothers me. I think we should add a flag to the 
>>   index saying "we checked this once already, it's clean", so that if we 
>>   do multiple passes over the index, we can still do just a single lstat() 
>>   on just the first pass. But that's a separate issue.
>
> I've thought about this in a different context before, but it
> seemed quite tricky, as some codepaths in more complex commands
> (commit being one of them) tend to use the cache and discard to
> use it for different purpose (like creating temporary index and
> then reading the real index).  Besides, I had an impression that
> we ran out of the bits of ce_flags in the cache entry, although
> we could shorten the maximum path we support from 4k down to 2k
> bytes.  I'll have to think about this a bit more.

Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  I am
not currently looking into reducing them.

Some observations.

 * In the partial commit codepath, we run one file_exists() and
   then one add_file_to_index(), each of which has lstat(2) on
   the same pathname, for the paths to be committed.  This can
   be reduced to one trivially by changing the calling
   convention of add_file_to_index().  This is done in order to
   build the temporary index file to be written out as a tree
   for committing (and the index file needs to be written to be
   shown to the pre-commit hook).

 * Then refresh_cache_ent(), which has one lstat(2), is called
   for everybody in the temporary index, in order to refresh the
   temporary index.  Again this cannot be avoided because we
   promise to show the pre-commit hook a refreshed index.

 * Then we run file_exists() and add_file_to_index() to update
   the real index.

 * And refresh_cache_ent() is run for everybody in the real
   index.

 * The final diffstat phase runs lstat(2) from
   reuse_worktree_file() codepath to see if the work tree is up
   to date, and read from the work tree files instead of having
   to explode them from the object store.

The attached is a quick and dirty hack which may or may not
help.  It all looks sane, this also is some core code, and meant
only for discussion and not application.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.

 * I had to reduce the maximum length of allowed pathname from
   4k down to 2k for the above.  Incidentally I noticed that we
   do not check the length of pathname does not exceed what we
   can express with CE_NAMEMASK, which we may want to fix (and
   make it barf if somebody tries to add too long a path)
   independently from this issue.  A low hanging fruit for
   janitors -- hint, hint.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

The fact that we write out a temporary index and then rebuild
the real index means CE_UPTODATE flag we populate in the
temporary index is lost and we still need to lstat(2) while
building the real index, which is a bit unfortunate.  I suspect
that we can use the one-way merge to reset the index when
building the real index after we are done building the temporary
index, instead of discarding the in-core temporary index and
re-reading the real index.

---
 cache.h      |    3 ++-
 diff.c       |    4 ++++
 read-cache.c |   10 ++++++++++
 3 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/cache.h b/cache.h
index 39331c2..9b950fc 100644
--- a/cache.h
+++ b/cache.h
@@ -108,7 +108,8 @@ struct cache_entry {
 	char name[FLEX_ARRAY]; /* more */
 };

-#define CE_NAMEMASK  (0x0fff)
+#define CE_NAMEMASK  (0x07ff)
+#define CE_UPTODATE  (0x0800)
 #define CE_STAGEMASK (0x3000)
 #define CE_UPDATE    (0x4000)
 #define CE_VALID     (0x8000)
diff --git a/diff.c b/diff.c
index b18c140..41847ae 100644
--- a/diff.c
+++ b/diff.c
@@ -1510,6 +1510,10 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
 	if (pos < 0)
 		return 0;
 	ce = active_cache[pos];
+
+	if (ce->ce_flags & htons(CE_UPTODATE))
+		return 1;
+
 	if ((lstat(name, &st) < 0) ||
 	    !S_ISREG(st.st_mode) || /* careful! */
 	    ce_match_stat(ce, &st, 0) ||
diff --git a/read-cache.c b/read-cache.c
index 7db5588..3a90db1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -44,6 +44,9 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)

 	if (assume_unchanged)
 		ce->ce_flags |= htons(CE_VALID);
+
+	if (S_ISREG(st->st_mode))
+		ce->ce_flags |= htons(CE_UPTODATE);
 }

 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
@@ -794,6 +797,9 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	int changed, size;
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;

+	if (ce->ce_flags & htons(CE_UPTODATE))
+		return ce;
+
 	if (lstat(ce->name, &st) < 0) {
 		if (err)
 			*err = errno;
@@ -1170,13 +1176,17 @@ int write_index(struct index_state *istate, int newfd)

 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
+		unsigned short ce_flags;
 		if (!ce->ce_mode)
 			continue;
 		if (istate->timestamp &&
 		    istate->timestamp <= ntohl(ce->ce_mtime.sec))
 			ce_smudge_racily_clean_entry(ce);
+		ce_flags = ce->ce_flags;
+		ce->ce_flags &= htons(~(CE_UPDATE | CE_UPTODATE));
 		if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)
 			return -1;
+		ce->ce_flags = ce_flags;
 	}

 	/* Write extension data here */

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste
  2008-01-13  8:12     ` Junio C Hamano
  2008-01-13 10:33       ` Junio C Hamano
@ 2008-01-13 10:38       ` Junio C Hamano
  2008-01-14 21:23         ` しらいしななこ
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 10:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

When I did 2888605c649ccd423232161186d72c0e6c458a48
(builtin-commit: fix partial-commit support), I mindlessly cut
and pasted from builtin-ls-files.c, and included the part that
was meant to exclude redundant path after "ls-files --with-tree"
overlayed the HEAD commit on top of the index.  This logic does
not apply to what git-commit does and should not have been
copied, even though it would not hurt.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-commit.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/builtin-commit.c b/builtin-commit.c
index 6d2ca80..265ba6b 100644
--- a/builtin-commit.c
+++ b/builtin-commit.c
@@ -156,8 +156,6 @@ static int list_paths(struct path_list *list, const char *with_tree,
 
 	for (i = 0; i < active_nr; i++) {
 		struct cache_entry *ce = active_cache[i];
-		if (ce->ce_flags & htons(CE_UPDATE))
-			continue;
 		if (!pathspec_match(pattern, m, ce->name, 0))
 			continue;
 		path_list_insert(ce->name, list);

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH] builtin-commit.c: do not lstat(2) partially committed paths twice.
  2008-01-13 10:33       ` Junio C Hamano
@ 2008-01-13 10:54         ` Junio C Hamano
  2008-01-13 11:09         ` performance problem: "git commit filename" Junio C Hamano
  2008-01-13 17:24         ` Linus Torvalds
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 10:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian Høgsberg

Junio C Hamano <gitster@pobox.com> writes:

>  * In the partial commit codepath, we run one file_exists() and
>    then one add_file_to_index(), each of which has lstat(2) on
>    the same pathname, for the paths to be committed.  This can
>    be reduced to one trivially by changing the calling
>    convention of add_file_to_index()....

And this is the "trivial" lstat(2) reduction.

I do not know if it is worth it, though.  Majority of lstat(2)
must be coming from the paths in the index (and not committed),
not from the paths that are listed on the command line to be
partially committed.

---
 builtin-commit.c |    6 ++++--
 cache.h          |    2 ++
 read-cache.c     |   34 ++++++++++++++++++++++------------
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/builtin-commit.c b/builtin-commit.c
index 6d2ca80..770bd25 100644
--- a/builtin-commit.c
+++ b/builtin-commit.c
@@ -170,9 +170,11 @@ static void add_remove_files(struct path_list *list)
 {
 	int i;
 	for (i = 0; i < list->nr; i++) {
+		struct stat st;
 		struct path_list_item *p = &(list->items[i]);
-		if (file_exists(p->path))
-			add_file_to_cache(p->path, 0);
+
+		if (!lstat(p->path, &st))
+			add_file_to_cache_with_stat(p->path, &st, 0);
 		else
 			remove_file_from_cache(p->path);
 	}
diff --git a/cache.h b/cache.h
index 39331c2..73cc83b 100644
--- a/cache.h
+++ b/cache.h
@@ -174,6 +174,7 @@ extern struct index_state the_index;
 #define remove_cache_entry_at(pos) remove_index_entry_at(&the_index, (pos))
 #define remove_file_from_cache(path) remove_file_from_index(&the_index, (path))
 #define add_file_to_cache(path, verbose) add_file_to_index(&the_index, (path), (verbose))
+#define add_file_to_cache_with_stat(path, st, verbose) add_file_to_index_with_stat(&the_index, (path), (st), (verbose))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
@@ -273,6 +274,7 @@ extern struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int reall
 extern int remove_index_entry_at(struct index_state *, int pos);
 extern int remove_file_from_index(struct index_state *, const char *path);
 extern int add_file_to_index(struct index_state *, const char *path, int verbose);
+extern int add_file_to_index_with_stat(struct index_state *, const char *path, struct stat *, int verbose);
 extern struct cache_entry *make_cache_entry(unsigned int mode, const unsigned char *sha1, const char *path, int stage, int refresh);
 extern int ce_same_name(struct cache_entry *a, struct cache_entry *b);
 
diff --git a/read-cache.c b/read-cache.c
index 7db5588..928f49b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -384,21 +384,22 @@ static int index_name_pos_also_unmerged(struct index_state *istate,
 	return pos;
 }
 
-int add_file_to_index(struct index_state *istate, const char *path, int verbose)
+int add_file_to_index_with_stat(struct index_state *istate,
+				const char *path,
+				struct stat *st,
+				int verbose)
 {
 	int size, namelen, pos;
-	struct stat st;
 	struct cache_entry *ce;
 	unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_RACY_IS_DIRTY;
 
-	if (lstat(path, &st))
-		die("%s: unable to stat (%s)", path, strerror(errno));
-
-	if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode) && !S_ISDIR(st.st_mode))
+	if (!S_ISREG(st->st_mode) &&
+	    !S_ISLNK(st->st_mode) &&
+	    !S_ISDIR(st->st_mode))
 		die("%s: can only add regular files, symbolic links or git-directories", path);
 
 	namelen = strlen(path);
-	if (S_ISDIR(st.st_mode)) {
+	if (S_ISDIR(st->st_mode)) {
 		while (namelen && path[namelen-1] == '/')
 			namelen--;
 	}
@@ -406,10 +407,10 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
 	ce = xcalloc(1, size);
 	memcpy(ce->name, path, namelen);
 	ce->ce_flags = htons(namelen);
-	fill_stat_cache_info(ce, &st);
+	fill_stat_cache_info(ce, st);
 
 	if (trust_executable_bit && has_symlinks)
-		ce->ce_mode = create_ce_mode(st.st_mode);
+		ce->ce_mode = create_ce_mode(st->st_mode);
 	else {
 		/* If there is an existing entry, pick the mode bits and type
 		 * from it, otherwise assume unexecutable regular file.
@@ -418,19 +419,19 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
 		int pos = index_name_pos_also_unmerged(istate, path, namelen);
 
 		ent = (0 <= pos) ? istate->cache[pos] : NULL;
-		ce->ce_mode = ce_mode_from_stat(ent, st.st_mode);
+		ce->ce_mode = ce_mode_from_stat(ent, st->st_mode);
 	}
 
 	pos = index_name_pos(istate, ce->name, namelen);
 	if (0 <= pos &&
 	    !ce_stage(istate->cache[pos]) &&
-	    !ie_match_stat(istate, istate->cache[pos], &st, ce_option)) {
+	    !ie_match_stat(istate, istate->cache[pos], st, ce_option)) {
 		/* Nothing changed, really */
 		free(ce);
 		return 0;
 	}
 
-	if (index_path(ce->sha1, path, &st, 1))
+	if (index_path(ce->sha1, path, st, 1))
 		die("unable to index file %s", path);
 	if (add_index_entry(istate, ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
 		die("unable to add %s to index",path);
@@ -439,6 +440,15 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
 	return 0;
 }
 
+int add_file_to_index(struct index_state *istate, const char *path, int verbose)
+{
+	struct stat st;
+	if (lstat(path, &st))
+		die("%s: unable to stat (%s)", path, strerror(errno));
+
+	return add_file_to_index_with_stat(istate, path, &st, verbose);
+}
+
 struct cache_entry *make_cache_entry(unsigned int mode,
 		const unsigned char *sha1, const char *path, int stage,
 		int refresh)

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13 10:33       ` Junio C Hamano
  2008-01-13 10:54         ` [PATCH] builtin-commit.c: do not lstat(2) partially committed paths twice Junio C Hamano
@ 2008-01-13 11:09         ` Junio C Hamano
  2008-01-13 17:24         ` Linus Torvalds
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 11:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Junio C Hamano <gitster@pobox.com> writes:

> The fact that we write out a temporary index and then rebuild
> the real index means CE_UPTODATE flag we populate in the
> temporary index is lost and we still need to lstat(2) while
> building the real index, which is a bit unfortunate.  I suspect
> that we can use the one-way merge to reset the index when
> building the real index after we are done building the temporary
> index, instead of discarding the in-core temporary index and
> re-reading the real index.

This comment is completely bogus.  With your earlier one-way
merge fix, as the way CE_UPTODATE patch was written we preserve
in-core CE_UPTODATE bit across write_index(), the code already
should be taking advantage of an earlier lstat(2).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  5:38     ` Daniel Barkalow
  2008-01-13  8:14       ` Junio C Hamano
@ 2008-01-13 16:57       ` Linus Torvalds
  2008-01-13 19:31         ` Daniel Barkalow
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-01-13 16:57 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Git Mailing List, Kristian H?gsberg



On Sun, 13 Jan 2008, Daniel Barkalow wrote:
> 
> The only issue I know about with using unpack_trees in C as a replacement 
> for read-tree in shell is that unpack_trees leaves "deletion" index 
> entries in memory which are not written to disk, but may surprise some 
> code (these are used to allow -u to remove the files from the working 
> tree).

I certainly agree that this patch should be double-checked. I'm pretty 
sure the issue you mention wouldn't be an issue, since the end result is 
only used for actually updating the index and writing it out as a tree 
(both of which should handle the magic zero ce_mode case ok), but it would 
certainly be good to walk through all cases.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13 10:33       ` Junio C Hamano
  2008-01-13 10:54         ` [PATCH] builtin-commit.c: do not lstat(2) partially committed paths twice Junio C Hamano
  2008-01-13 11:09         ` performance problem: "git commit filename" Junio C Hamano
@ 2008-01-13 17:24         ` Linus Torvalds
  2008-01-13 19:39           ` Junio C Hamano
  2008-01-14  1:00           ` performance problem: "git commit filename" Junio C Hamano
  2 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-13 17:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Kristian H?gsberg



On Sun, 13 Jan 2008, Junio C Hamano wrote:
> 
> The attached is a quick and dirty hack which may or may not
> help.  It all looks sane, this also is some core code, and meant
> only for discussion and not application.

I don't think this will help.

You never set CE_UPTODATE, except in the "fill_stat_cache_info()" 
function, but that one will never be called for an old file that already 
matched the stat.

So at a minimum, you should also make ie_match_stat() set CE_UPTODATE if 
it matches. Or something.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13 16:57       ` Linus Torvalds
@ 2008-01-13 19:31         ` Daniel Barkalow
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Barkalow @ 2008-01-13 19:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Kristian H?gsberg

On Sun, 13 Jan 2008, Linus Torvalds wrote:

> On Sun, 13 Jan 2008, Daniel Barkalow wrote:
> > 
> > The only issue I know about with using unpack_trees in C as a replacement 
> > for read-tree in shell is that unpack_trees leaves "deletion" index 
> > entries in memory which are not written to disk, but may surprise some 
> > code (these are used to allow -u to remove the files from the working 
> > tree).
> 
> I certainly agree that this patch should be double-checked. I'm pretty 
> sure the issue you mention wouldn't be an issue, since the end result is 
> only used for actually updating the index and writing it out as a tree 
> (both of which should handle the magic zero ce_mode case ok), but it would 
> certainly be good to walk through all cases.

Yeah, I didn't think it would be an actual problem, but verifying that 
requires looking outside of the context of the patch. It may even be worth 
putting in a comment for now, since I bet wt_status_print and run_status 
could be optimized in a way that would look perfectly reasonable (use 
the in-memory index, instead of reading a file) but would expose the 
magic case to the diff machinary, which (IIRC) doesn't handle it. But I 
agree (having now looked at the rest of builtin-commit) that the odd index 
entries can't escape, and this should be fine for 1.5.4.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13 17:24         ` Linus Torvalds
@ 2008-01-13 19:39           ` Junio C Hamano
  2008-01-13 22:36             ` [PATCH] index: be careful when handling long names Junio C Hamano
  2008-01-14  1:00           ` performance problem: "git commit filename" Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 19:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, 13 Jan 2008, Junio C Hamano wrote:
>> 
>> The attached is a quick and dirty hack which may or may not
>> help.  It all looks sane, this also is some core code, and meant
>> only for discussion and not application.
>
> I don't think this will help.
>
> You never set CE_UPTODATE, except in the "fill_stat_cache_info()" 
> function, but that one will never be called for an old file that already 
> matched the stat.
>
> So at a minimum, you should also make ie_match_stat() set CE_UPTODATE if 
> it matches. Or something.

Unfortunately ie_match_stat() is too late.  The caller is
supposed to have already called lstat(2) and give the result to
that function.

When refresh_cache_ent() finds the entry actually matched, we
could mark the path with CE_UPTODATE.  That would be a
relatively contained and safe optimization that might help
git-commit.

About the CE_NAMEMASK limitation (and currently we do not check
it, so I think we would be screwed when a pathname that is
longer than (CE_NAMEMASK+1) and still fits under PATH_MAX is
given), I think we do not have to limit the maximum pathname
length.  Instead we can teach create_ce_flags() and ce_namelen()
that a name longer than 2k (or 4k) has the NAMEMASK bits that
are all 1 and ce->name[] must be counted if so (with an obvious
optimization to start counting at byte position 2k or 4k in
ce_namelen()).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] index: be careful when handling long names
  2008-01-13 19:39           ` Junio C Hamano
@ 2008-01-13 22:36             ` Junio C Hamano
  2008-01-13 22:53               ` Alex Riesen
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 22:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 Junio C Hamano <gitster@pobox.com> writes:

 > About the CE_NAMEMASK limitation (and currently we do not check
 > it, so I think we would be screwed when a pathname that is
 > longer than (CE_NAMEMASK+1) and still fits under PATH_MAX is
 > given), I think we do not have to limit the maximum pathname
 > length.  Instead we can teach create_ce_flags() and ce_namelen()
 > that a name longer than 2k (or 4k) has the NAMEMASK bits that
 > are all 1 and ce->name[] must be counted if so (with an obvious
 > optimization to start counting at byte position 2k or 4k in
 > ce_namelen()).

 This should fix it.  Passes tests including the new test that
 the existing code fails.

 cache.h          |   17 +++++++++++++++--
 t/t0000-basic.sh |   18 ++++++++++++++++++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 39331c2..ad53acc 100644
--- a/cache.h
+++ b/cache.h
@@ -114,8 +114,21 @@ struct cache_entry {
 #define CE_VALID     (0x8000)
 #define CE_STAGESHIFT 12
 
-#define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT))
-#define ce_namelen(ce) (CE_NAMEMASK & ntohs((ce)->ce_flags))
+static inline unsigned create_ce_flags(size_t len, unsigned stage)
+{
+	if (len >= CE_NAMEMASK)
+		len = CE_NAMEMASK;
+	return htons(len | (stage << CE_STAGESHIFT));
+}
+
+static inline size_t ce_namelen(const struct cache_entry *ce)
+{
+	size_t len = ntohs((ce)->ce_flags) & CE_NAMEMASK;
+	if (len < CE_NAMEMASK) /* likely */
+		return len;
+	return strlen(ce->name + CE_NAMEMASK) + CE_NAMEMASK;
+}
+
 #define ce_size(ce) cache_entry_size(ce_namelen(ce))
 #define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT)
 
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4e49d59..40551a3 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -297,4 +297,22 @@ test_expect_success 'absolute path works as expected' '
 	test "$sym" = "$(test-absolute-path $dir2/syml)"
 '
 
+test_expect_success 'very long name in the index handled sanely' '
+
+	a=a && # 1
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 16
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 256
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 4096
+	a=${a}q
+
+	>path4 &&
+	git update-index --add path4 &&
+	(
+		git ls-files -s path4 |
+		sed -e "s/	.*/	/" |
+		tr -d "\012"
+		echo "$a"
+	) | git update-index --index-info
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] index: be careful when handling long names
  2008-01-13 22:36             ` [PATCH] index: be careful when handling long names Junio C Hamano
@ 2008-01-13 22:53               ` Alex Riesen
  2008-01-13 23:08                 ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2008-01-13 22:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

Junio C Hamano, Sun, Jan 13, 2008 23:36:34 +0100:
> +test_expect_success 'very long name in the index handled sanely' '
> +
> +	a=a && # 1
> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 16
> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 256
> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 4096

I'd expect it to fail on some systems (everywindowsthing up to w2k,
maybe some commercial unices).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] index: be careful when handling long names
  2008-01-13 22:53               ` Alex Riesen
@ 2008-01-13 23:08                 ` Junio C Hamano
  2008-01-13 23:33                   ` Alex Riesen
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-13 23:08 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Linus Torvalds, Git Mailing List

Alex Riesen <raa.lkml@gmail.com> writes:

> Junio C Hamano, Sun, Jan 13, 2008 23:36:34 +0100:
>> +test_expect_success 'very long name in the index handled sanely' '
>> +
>> +	a=a && # 1
>> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 16
>> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 256
>> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 4096
>
> I'd expect it to fail on some systems (everywindowsthing up to w2k,
> maybe some commercial unices).

My understanding is that Everywindowsthing do not come with any
(POSIX compliant) shell that we support by default, so if you
are talking about a limit of shell variable value, I do not
think it is an issue to begin with.  It is just the matter of
picking a sensible shell (I understand both Cygwin and msys
ports use a shell that supports more than 4k bytes in value
given to a variable).

I would agree that it might overflow the argument limit when
this is given to "echo", though.  We cannot do much about it,
but you may have cleverer ideas.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] index: be careful when handling long names
  2008-01-13 23:08                 ` Junio C Hamano
@ 2008-01-13 23:33                   ` Alex Riesen
  2008-01-14 21:03                     ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Alex Riesen @ 2008-01-13 23:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

Junio C Hamano, Mon, Jan 14, 2008 00:08:07 +0100:
> Alex Riesen <raa.lkml@gmail.com> writes:
> 
> > Junio C Hamano, Sun, Jan 13, 2008 23:36:34 +0100:
> >> +test_expect_success 'very long name in the index handled sanely' '
> >> +
> >> +	a=a && # 1
> >> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 16
> >> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 256
> >> +	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 4096
> >
> > I'd expect it to fail on some systems (everywindowsthing up to w2k,
> > maybe some commercial unices).
> 
> My understanding is that Everywindowsthing do not come with any
> (POSIX compliant) shell that we support by default, so if you
> are talking about a limit of shell variable value, I do not
> think it is an issue to begin with.  It is just the matter of

Oh, right. The file system wont even see it, it is passed directly to
update-index.

> picking a sensible shell (I understand both Cygwin and msys
> ports use a shell that supports more than 4k bytes in value
> given to a variable).

can't check right now, but I believe it is so

> I would agree that it might overflow the argument limit when
> this is given to "echo", though.  We cannot do much about it,
> but you may have cleverer ideas.

I thought about conditionally disabling the test, like it was done
when the tabs in filenames had to be tested. Wont be needed for this
particular case.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13 17:24         ` Linus Torvalds
  2008-01-13 19:39           ` Junio C Hamano
@ 2008-01-14  1:00           ` Junio C Hamano
  2008-01-14 17:07             ` Linus Torvalds
  2008-01-15  0:18             ` Linus Torvalds
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14  1:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> So at a minimum, you should also make ie_match_stat() set CE_UPTODATE if 
> it matches. Or something.

I've reworked the patch, and in the kernel repository, a
single-path commit after touching that path now calls 23k
lstat(2).  It used to call 46k lstat(2) after your fix.

This depends on the "index: be careful when handling long names"
patch I sent earlier.

-- >8 --
[PATCH] Avoid running lstat(2) on the same cache entry.

Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * This reduces the maximum length cached from 4k down to 2k to
   introduce the new flag.  People with an index that records
   paths longer than that needs to first commit their changes
   with the old git, discard the cache with "rm -f .git/index"
   and re-read with updated git by running "git-read-tree HEAD"
   followed by "git update-index --refresh".

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entry that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h      |    4 +++-
 diff.c       |    4 ++++
 read-cache.c |   20 +++++++++++++++++++-
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 7f1457a..3e1cdf9 100644
--- a/cache.h
+++ b/cache.h
@@ -108,7 +108,8 @@ struct cache_entry {
 	char name[FLEX_ARRAY]; /* more */
 };
 
-#define CE_NAMEMASK  (0x0fff)
+#define CE_NAMEMASK  (0x07ff)
+#define CE_UPTODATE  (0x0800)
 #define CE_STAGEMASK (0x3000)
 #define CE_UPDATE    (0x4000)
 #define CE_VALID     (0x8000)
@@ -129,6 +130,7 @@ static inline size_t ce_namelen(const struct cache_entry *ce)
 	return strlen(ce->name + CE_NAMEMASK) + CE_NAMEMASK;
 }
 
+#define ce_uptodate(ce) (!!((ce)->ce_flags & htons(CE_UPTODATE)))
 #define ce_size(ce) cache_entry_size(ce_namelen(ce))
 #define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT)
 
diff --git a/diff.c b/diff.c
index b18c140..62d0c06 100644
--- a/diff.c
+++ b/diff.c
@@ -1510,6 +1510,10 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
 	if (pos < 0)
 		return 0;
 	ce = active_cache[pos];
+
+	if (ce_uptodate(ce))
+		return 1;
+
 	if ((lstat(name, &st) < 0) ||
 	    !S_ISREG(st.st_mode) || /* careful! */
 	    ce_match_stat(ce, &st, 0) ||
diff --git a/read-cache.c b/read-cache.c
index 928f49b..6e40e37 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -44,6 +44,9 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 
 	if (assume_unchanged)
 		ce->ce_flags |= htons(CE_VALID);
+
+	if (S_ISREG(st->st_mode))
+		ce->ce_flags |= htons(CE_UPTODATE);
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
@@ -428,6 +431,7 @@ int add_file_to_index_with_stat(struct index_state *istate,
 	    !ie_match_stat(istate, istate->cache[pos], st, ce_option)) {
 		/* Nothing changed, really */
 		free(ce);
+		istate->cache[pos]->ce_flags |= htons(CE_UPTODATE);
 		return 0;
 	}
 
@@ -804,6 +808,9 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	int changed, size;
 	int ignore_valid = options & CE_MATCH_IGNORE_VALID;
 
+	if (ce_uptodate(ce))
+		return ce;
+
 	if (lstat(ce->name, &st) < 0) {
 		if (err)
 			*err = errno;
@@ -822,8 +829,15 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 		if (ignore_valid && assume_unchanged &&
 		    !(ce->ce_flags & htons(CE_VALID)))
 			; /* mark this one VALID again */
-		else
+		else {
+			/*
+			 * We do not mark the index itself "modified"
+			 * because CE_UPTODATE flag is in-core only;
+			 * we are not going to write this change out.
+			 */
+			ce->ce_flags |= htons(CE_UPTODATE);
 			return ce;
+		}
 	}
 
 	if (ie_modified(istate, ce, &st, options)) {
@@ -1180,13 +1194,17 @@ int write_index(struct index_state *istate, int newfd)
 
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
+		unsigned short ce_flags;
 		if (!ce->ce_mode)
 			continue;
 		if (istate->timestamp &&
 		    istate->timestamp <= ntohl(ce->ce_mtime.sec))
 			ce_smudge_racily_clean_entry(ce);
+		ce_flags = ce->ce_flags;
+		ce->ce_flags &= htons(~(CE_UPDATE | CE_UPTODATE));
 		if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)
 			return -1;
+		ce->ce_flags = ce_flags;
 	}
 
 	/* Write extension data here */
-- 
1.5.4.rc3.4.g16335

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14  1:00           ` performance problem: "git commit filename" Junio C Hamano
@ 2008-01-14 17:07             ` Linus Torvalds
  2008-01-14 18:38               ` Junio C Hamano
  2008-01-14 19:39               ` Linus Torvalds
  2008-01-15  0:18             ` Linus Torvalds
  1 sibling, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-14 17:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Kristian H?gsberg

On Sun, 13 Jan 2008, Junio C Hamano wrote:
> 
> I've reworked the patch, and in the kernel repository, a
> single-path commit after touching that path now calls 23k
> lstat(2).  It used to call 46k lstat(2) after your fix.

Ok, I really like what the patch does, and how it looks.

At the same time, I *really* hate how we now edit the cache entries in 
place for these kinds of things that really have nothing to do with the 
on-disk format. That's not a new thing (CE_UPDATE is the same), but it 
definitely got uglier.

So I think this patch is good, but I think it would be even better if we 
just bit the bullet and started looking at having a different in-memory 
representation from the on-disk one. Possibly not *that* much different: 
perhaps just keeping a pointer to the on-disk one along with a flags 
value.

That would be a fairly painful change, though (and quite independent from 
this particular one - apart from the fact that CE_UPTODATE is one of the 
users that could be cleaned up if we did that).

Comments?

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 17:07             ` Linus Torvalds
@ 2008-01-14 18:38               ` Junio C Hamano
  2008-01-14 19:39               ` Linus Torvalds
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14 18:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> ... I think it would be even better if we 
> just bit the bullet and started looking at having a different in-memory 
> representation from the on-disk one. Possibly not *that* much different: 
> perhaps just keeping a pointer to the on-disk one along with a flags 
> value.

We have two things we currently do that are not about on-disk
index file ('-'), and this patch adds another ('+'):

 - Update the work tree file that corresponds to this entry
   (CE_UPDATE);

 - This entry is to be removed (ce_mode == 0);

 + The work tree file that corresponds to this entry is known to
   be unchanged (CE_UPTODATE).

We could introduce "struct in_core_cache_entry" that has these
information, indexed and sorted by name, and has a pointer to
what we read from the on-disk index.

	struct in_core_cache_entry {
		struct cache_entry *e;
                unsigned is_up_to_date : 1,
                	 to_be_updated : 1,
                         to_be_removed : 1;
	};

The code that iterate over active_cache[] will instead iterate
over this.  The number of the entries in this array will be the
new active_nr.

In the existing code, we reference "ce->name" and "ce->sha1"
everywhere. When we check and update flags we do bitops between
"ce->ce_flags" and "htons(CE_BLAH)" in many places.  Converting
them would adds another indirection and be quite painful.  But
the compiler can reliably spot the places we fail to find, so it
at least is not so risky.  It will just be a lot of work.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 17:07             ` Linus Torvalds
  2008-01-14 18:38               ` Junio C Hamano
@ 2008-01-14 19:39               ` Linus Torvalds
  2008-01-14 20:08                 ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-01-14 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Kristian H?gsberg



On Mon, 14 Jan 2008, Linus Torvalds wrote:
> 
> So I think this patch is good, but I think it would be even better if we 
> just bit the bullet and started looking at having a different in-memory 
> representation from the on-disk one.

Ok, so here's a possible patch.

It passes all the tests for me, and looks fairly ok, but it's also a bit 
big.

What makes it big is that I made the in-memory format be in host order, so 
that we can remove a *lot* of the "htonl/ntohl" switcheroo, and do it just 
on index file read/write.

The nice thing about this patch is that it would make it a lot easier to 
do any index handling changes,  because it makes a clear difference 
between the on-disk and the in-memory formats.

I realize that the patch looks big (195 lines inserted and 148 lines 
removed), but *most* of the lines are literally those ntohl() 
simplifications, ie stuff like

	-       if (S_ISGITLINK(ntohl(ce->ce_mode))) {
	+       if (S_ISGITLINK(ce->ce_mode)) {

so while it adds lines (for the "convert from disk" and "convert to disk" 
format conversions), in  many ways it really simplifies the source code 
too.

Comments?

This is on top of current master, so it's *before* junios thing that adds 
a CE_UPTODATE.

With this, the high 16 bits of "ce_flags" are in-memory only, so you could 
just make CE_UPTODATE be 0x10000, and it automatically ends up never being 
written to disk (and always "reads" as zero).

		Linus

---
 builtin-apply.c        |   10 ++--
 builtin-blame.c        |    2 +-
 builtin-fsck.c         |    2 +-
 builtin-grep.c         |    4 +-
 builtin-ls-files.c     |   10 ++--
 builtin-read-tree.c    |    2 +-
 builtin-rerere.c       |    4 +-
 builtin-update-index.c |   18 +++---
 cache-tree.c           |    2 +-
 cache.h                |   41 +++++++----
 diff-lib.c             |   31 ++++-----
 dir.c                  |    2 +-
 entry.c                |    6 +-
 merge-index.c          |    2 +-
 merge-recursive.c      |    2 +-
 reachable.c            |    2 +-
 read-cache.c           |  183 ++++++++++++++++++++++++++++-------------------
 sha1_name.c            |    2 +-
 tree.c                 |    4 +-
 unpack-trees.c         |   14 ++--
 20 files changed, 195 insertions(+), 148 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index d57bb6e..bd7cc37 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1946,7 +1946,7 @@ static int read_file_or_gitlink(struct cache_entry *ce, struct strbuf *buf)
 	if (!ce)
 		return 0;
 
-	if (S_ISGITLINK(ntohl(ce->ce_mode))) {
+	if (S_ISGITLINK(ce->ce_mode)) {
 		strbuf_grow(buf, 100);
 		strbuf_addf(buf, "Subproject commit %s\n", sha1_to_hex(ce->sha1));
 	} else {
@@ -2023,7 +2023,7 @@ static int check_to_create_blob(const char *new_name, int ok_if_exists)
 
 static int verify_index_match(struct cache_entry *ce, struct stat *st)
 {
-	if (S_ISGITLINK(ntohl(ce->ce_mode))) {
+	if (S_ISGITLINK(ce->ce_mode)) {
 		if (!S_ISDIR(st->st_mode))
 			return -1;
 		return 0;
@@ -2082,12 +2082,12 @@ static int check_patch(struct patch *patch, struct patch *prev_patch)
 				return error("%s: does not match index",
 					     old_name);
 			if (cached)
-				st_mode = ntohl(ce->ce_mode);
+				st_mode = ce->ce_mode;
 		} else if (stat_ret < 0)
 			return error("%s: %s", old_name, strerror(errno));
 
 		if (!cached)
-			st_mode = ntohl(ce_mode_from_stat(ce, st.st_mode));
+			st_mode = ce_mode_from_stat(ce, st.st_mode);
 
 		if (patch->is_new < 0)
 			patch->is_new = 0;
@@ -2388,7 +2388,7 @@ static void add_index_file(const char *path, unsigned mode, void *buf, unsigned
 	ce = xcalloc(1, ce_size);
 	memcpy(ce->name, path, namelen);
 	ce->ce_mode = create_ce_mode(mode);
-	ce->ce_flags = htons(namelen);
+	ce->ce_flags = namelen;
 	if (S_ISGITLINK(mode)) {
 		const char *s = buf;
 
diff --git a/builtin-blame.c b/builtin-blame.c
index 9b4c02e..c7e6887 100644
--- a/builtin-blame.c
+++ b/builtin-blame.c
@@ -2092,7 +2092,7 @@ static struct commit *fake_working_tree_commit(const char *path, const char *con
 	if (!mode) {
 		int pos = cache_name_pos(path, len);
 		if (0 <= pos)
-			mode = ntohl(active_cache[pos]->ce_mode);
+			mode = active_cache[pos]->ce_mode;
 		else
 			/* Let's not bother reading from HEAD tree */
 			mode = S_IFREG | 0644;
diff --git a/builtin-fsck.c b/builtin-fsck.c
index e4874f6..8876d34 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -762,7 +762,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 			struct blob *blob;
 			struct object *obj;
 
-			mode = ntohl(active_cache[i]->ce_mode);
+			mode = active_cache[i]->ce_mode;
 			if (S_ISGITLINK(mode))
 				continue;
 			blob = lookup_blob(active_cache[i]->sha1);
diff --git a/builtin-grep.c b/builtin-grep.c
index 0d6cc73..9180b39 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -331,7 +331,7 @@ static int external_grep(struct grep_opt *opt, const char **paths, int cached)
 		struct cache_entry *ce = active_cache[i];
 		char *name;
 		int kept;
-		if (!S_ISREG(ntohl(ce->ce_mode)))
+		if (!S_ISREG(ce->ce_mode))
 			continue;
 		if (!pathspec_matches(paths, ce->name))
 			continue;
@@ -387,7 +387,7 @@ static int grep_cache(struct grep_opt *opt, const char **paths, int cached)
 
 	for (nr = 0; nr < active_nr; nr++) {
 		struct cache_entry *ce = active_cache[nr];
-		if (!S_ISREG(ntohl(ce->ce_mode)))
+		if (!S_ISREG(ce->ce_mode))
 			continue;
 		if (!pathspec_matches(paths, ce->name))
 			continue;
diff --git a/builtin-ls-files.c b/builtin-ls-files.c
index 0f0ab2d..d56e33e 100644
--- a/builtin-ls-files.c
+++ b/builtin-ls-files.c
@@ -189,7 +189,7 @@ static void show_ce_entry(const char *tag, struct cache_entry *ce)
 		return;
 
 	if (tag && *tag && show_valid_bit &&
-	    (ce->ce_flags & htons(CE_VALID))) {
+	    (ce->ce_flags & CE_VALID)) {
 		static char alttag[4];
 		memcpy(alttag, tag, 3);
 		if (isalpha(tag[0]))
@@ -210,7 +210,7 @@ static void show_ce_entry(const char *tag, struct cache_entry *ce)
 	} else {
 		printf("%s%06o %s %d\t",
 		       tag,
-		       ntohl(ce->ce_mode),
+		       ce->ce_mode,
 		       abbrev ? find_unique_abbrev(ce->sha1,abbrev)
 				: sha1_to_hex(ce->sha1),
 		       ce_stage(ce));
@@ -242,7 +242,7 @@ static void show_files(struct dir_struct *dir, const char *prefix)
 				continue;
 			if (show_unmerged && !ce_stage(ce))
 				continue;
-			if (ce->ce_flags & htons(CE_UPDATE))
+			if (ce->ce_flags & CE_UPDATE)
 				continue;
 			show_ce_entry(ce_stage(ce) ? tag_unmerged : tag_cached, ce);
 		}
@@ -350,7 +350,7 @@ void overlay_tree_on_cache(const char *tree_name, const char *prefix)
 		struct cache_entry *ce = active_cache[i];
 		if (!ce_stage(ce))
 			continue;
-		ce->ce_flags |= htons(CE_STAGEMASK);
+		ce->ce_flags |= CE_STAGEMASK;
 	}
 
 	if (prefix) {
@@ -379,7 +379,7 @@ void overlay_tree_on_cache(const char *tree_name, const char *prefix)
 			 */
 			if (last_stage0 &&
 			    !strcmp(last_stage0->name, ce->name))
-				ce->ce_flags |= htons(CE_UPDATE);
+				ce->ce_flags |= CE_UPDATE;
 		}
 	}
 }
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index 43cd56a..eb879e1 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -46,7 +46,7 @@ static int read_cache_unmerged(void)
 			cache_tree_invalidate_path(active_cache_tree, ce->name);
 			last = ce;
 			ce->ce_mode = 0;
-			ce->ce_flags &= ~htons(CE_STAGEMASK);
+			ce->ce_flags &= ~CE_STAGEMASK;
 		}
 		*dst++ = ce;
 	}
diff --git a/builtin-rerere.c b/builtin-rerere.c
index 37e6248..df1a55d 100644
--- a/builtin-rerere.c
+++ b/builtin-rerere.c
@@ -149,8 +149,8 @@ static int find_conflict(struct path_list *conflict)
 		if (ce_stage(e2) == 2 &&
 		    ce_stage(e3) == 3 &&
 		    ce_same_name(e2, e3) &&
-		    S_ISREG(ntohl(e2->ce_mode)) &&
-		    S_ISREG(ntohl(e3->ce_mode))) {
+		    S_ISREG(e2->ce_mode) &&
+		    S_ISREG(e3->ce_mode)) {
 			path_list_insert((const char *)e2->name, conflict);
 			i++; /* skip over both #2 and #3 */
 		}
diff --git a/builtin-update-index.c b/builtin-update-index.c
index e1a938d..b5a34ed 100644
--- a/builtin-update-index.c
+++ b/builtin-update-index.c
@@ -47,10 +47,10 @@ static int mark_valid(const char *path)
 	if (0 <= pos) {
 		switch (mark_valid_only) {
 		case MARK_VALID:
-			active_cache[pos]->ce_flags |= htons(CE_VALID);
+			active_cache[pos]->ce_flags |= CE_VALID;
 			break;
 		case UNMARK_VALID:
-			active_cache[pos]->ce_flags &= ~htons(CE_VALID);
+			active_cache[pos]->ce_flags &= ~CE_VALID;
 			break;
 		}
 		cache_tree_invalidate_path(active_cache_tree, path);
@@ -95,7 +95,7 @@ static int add_one_path(struct cache_entry *old, const char *path, int len, stru
 	size = cache_entry_size(len);
 	ce = xcalloc(1, size);
 	memcpy(ce->name, path, len);
-	ce->ce_flags = htons(len);
+	ce->ce_flags = len;
 	fill_stat_cache_info(ce, st);
 	ce->ce_mode = ce_mode_from_stat(old, st->st_mode);
 
@@ -139,7 +139,7 @@ static int process_directory(const char *path, int len, struct stat *st)
 	/* Exact match: file or existing gitlink */
 	if (pos >= 0) {
 		struct cache_entry *ce = active_cache[pos];
-		if (S_ISGITLINK(ntohl(ce->ce_mode))) {
+		if (S_ISGITLINK(ce->ce_mode)) {
 
 			/* Do nothing to the index if there is no HEAD! */
 			if (resolve_gitlink_ref(path, "HEAD", sha1) < 0)
@@ -183,7 +183,7 @@ static int process_file(const char *path, int len, struct stat *st)
 	int pos = cache_name_pos(path, len);
 	struct cache_entry *ce = pos < 0 ? NULL : active_cache[pos];
 
-	if (ce && S_ISGITLINK(ntohl(ce->ce_mode)))
+	if (ce && S_ISGITLINK(ce->ce_mode))
 		return error("%s is already a gitlink, not replacing", path);
 
 	return add_one_path(ce, path, len, st);
@@ -226,7 +226,7 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
 	ce->ce_flags = create_ce_flags(len, stage);
 	ce->ce_mode = create_ce_mode(mode);
 	if (assume_unchanged)
-		ce->ce_flags |= htons(CE_VALID);
+		ce->ce_flags |= CE_VALID;
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
 	if (add_cache_entry(ce, option))
@@ -246,14 +246,14 @@ static void chmod_path(int flip, const char *path)
 	if (pos < 0)
 		goto fail;
 	ce = active_cache[pos];
-	mode = ntohl(ce->ce_mode);
+	mode = ce->ce_mode;
 	if (!S_ISREG(mode))
 		goto fail;
 	switch (flip) {
 	case '+':
-		ce->ce_mode |= htonl(0111); break;
+		ce->ce_mode |= 0111; break;
 	case '-':
-		ce->ce_mode &= htonl(~0111); break;
+		ce->ce_mode &= ~0111; break;
 	default:
 		goto fail;
 	}
diff --git a/cache-tree.c b/cache-tree.c
index 50b3526..3ef5f87 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -320,7 +320,7 @@ static int update_one(struct cache_tree *it,
 		}
 		else {
 			sha1 = ce->sha1;
-			mode = ntohl(ce->ce_mode);
+			mode = ce->ce_mode;
 			entlen = pathlen - baselen;
 		}
 		if (mode != S_IFGITLINK && !missing_ok && !has_sha1_file(sha1))
diff --git a/cache.h b/cache.h
index 39331c2..0aed11e 100644
--- a/cache.h
+++ b/cache.h
@@ -94,17 +94,31 @@ struct cache_time {
  * We save the fields in big-endian order to allow using the
  * index file over NFS transparently.
  */
+struct ondisk_cache_entry {
+	struct cache_time ctime;
+	struct cache_time mtime;
+	unsigned int dev;
+	unsigned int ino;
+	unsigned int mode;
+	unsigned int uid;
+	unsigned int gid;
+	unsigned int size;
+	unsigned char sha1[20];
+	unsigned short flags;
+	char name[FLEX_ARRAY]; /* more */
+};
+
 struct cache_entry {
-	struct cache_time ce_ctime;
-	struct cache_time ce_mtime;
+	unsigned int ce_ctime;
+	unsigned int ce_mtime;
 	unsigned int ce_dev;
 	unsigned int ce_ino;
 	unsigned int ce_mode;
 	unsigned int ce_uid;
 	unsigned int ce_gid;
 	unsigned int ce_size;
+	unsigned int ce_flags;
 	unsigned char sha1[20];
-	unsigned short ce_flags;
 	char name[FLEX_ARRAY]; /* more */
 };
 
@@ -114,28 +128,29 @@ struct cache_entry {
 #define CE_VALID     (0x8000)
 #define CE_STAGESHIFT 12
 
-#define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT))
-#define ce_namelen(ce) (CE_NAMEMASK & ntohs((ce)->ce_flags))
+#define create_ce_flags(len, stage) ((len) | ((stage) << CE_STAGESHIFT))
+#define ce_namelen(ce) (CE_NAMEMASK & (ce)->ce_flags)
 #define ce_size(ce) cache_entry_size(ce_namelen(ce))
-#define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT)
+#define ondisk_ce_size(ce) ondisk_cache_entry_size(ce_namelen(ce))
+#define ce_stage(ce) ((CE_STAGEMASK & (ce)->ce_flags) >> CE_STAGESHIFT)
 
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
 static inline unsigned int create_ce_mode(unsigned int mode)
 {
 	if (S_ISLNK(mode))
-		return htonl(S_IFLNK);
+		return S_IFLNK;
 	if (S_ISDIR(mode) || S_ISGITLINK(mode))
-		return htonl(S_IFGITLINK);
-	return htonl(S_IFREG | ce_permissions(mode));
+		return S_IFGITLINK;
+	return S_IFREG | ce_permissions(mode);
 }
 static inline unsigned int ce_mode_from_stat(struct cache_entry *ce, unsigned int mode)
 {
 	extern int trust_executable_bit, has_symlinks;
 	if (!has_symlinks && S_ISREG(mode) &&
-	    ce && S_ISLNK(ntohl(ce->ce_mode)))
+	    ce && S_ISLNK(ce->ce_mode))
 		return ce->ce_mode;
 	if (!trust_executable_bit && S_ISREG(mode)) {
-		if (ce && S_ISREG(ntohl(ce->ce_mode)))
+		if (ce && S_ISREG(ce->ce_mode))
 			return ce->ce_mode;
 		return create_ce_mode(0666);
 	}
@@ -146,14 +161,14 @@ static inline unsigned int ce_mode_from_stat(struct cache_entry *ce, unsigned in
 	S_ISLNK(mode) ? S_IFLNK : S_ISDIR(mode) ? S_IFDIR : S_IFGITLINK)
 
 #define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)
+#define ondisk_cache_entry_size(len) ((offsetof(struct ondisk_cache_entry,name) + (len) + 8) & ~7)
 
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int cache_nr, cache_alloc, cache_changed;
 	struct cache_tree *cache_tree;
 	time_t timestamp;
-	void *mmap;
-	size_t mmap_size;
+	void *alloc;
 };
 
 extern struct index_state the_index;
diff --git a/diff-lib.c b/diff-lib.c
index d85d8f3..c20adaa 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -37,7 +37,7 @@ static int get_mode(const char *path, int *mode)
 	if (!path || !strcmp(path, "/dev/null"))
 		*mode = 0;
 	else if (!strcmp(path, "-"))
-		*mode = ntohl(create_ce_mode(0666));
+		*mode = create_ce_mode(0666);
 	else if (stat(path, &st))
 		return error("Could not access '%s'", path);
 	else
@@ -384,7 +384,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 					continue;
 			}
 			else
-				dpath->mode = ntohl(ce_mode_from_stat(ce, st.st_mode));
+				dpath->mode = ce_mode_from_stat(ce, st.st_mode);
 
 			while (i < entries) {
 				struct cache_entry *nce = active_cache[i];
@@ -398,10 +398,10 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 				 */
 				stage = ce_stage(nce);
 				if (2 <= stage) {
-					int mode = ntohl(nce->ce_mode);
+					int mode = nce->ce_mode;
 					num_compare_stages++;
 					hashcpy(dpath->parent[stage-2].sha1, nce->sha1);
-					dpath->parent[stage-2].mode = ntohl(ce_mode_from_stat(nce, mode));
+					dpath->parent[stage-2].mode = ce_mode_from_stat(nce, mode);
 					dpath->parent[stage-2].status =
 						DIFF_STATUS_MODIFIED;
 				}
@@ -442,15 +442,15 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 			}
 			if (silent_on_removed)
 				continue;
-			diff_addremove(&revs->diffopt, '-', ntohl(ce->ce_mode),
+			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
 				       ce->sha1, ce->name, NULL);
 			continue;
 		}
 		changed = ce_match_stat(ce, &st, ce_option);
 		if (!changed && !DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
 			continue;
-		oldmode = ntohl(ce->ce_mode);
-		newmode = ntohl(ce_mode_from_stat(ce, st.st_mode));
+		oldmode = ce->ce_mode;
+		newmode = ce_mode_from_stat(ce, st.st_mode);
 		diff_change(&revs->diffopt, oldmode, newmode,
 			    ce->sha1, (changed ? null_sha1 : ce->sha1),
 			    ce->name, NULL);
@@ -471,7 +471,7 @@ static void diff_index_show_file(struct rev_info *revs,
 				 struct cache_entry *ce,
 				 unsigned char *sha1, unsigned int mode)
 {
-	diff_addremove(&revs->diffopt, prefix[0], ntohl(mode),
+	diff_addremove(&revs->diffopt, prefix[0], mode,
 		       sha1, ce->name, NULL);
 }
 
@@ -550,14 +550,14 @@ static int show_modified(struct rev_info *revs,
 		p->len = pathlen;
 		memcpy(p->path, new->name, pathlen);
 		p->path[pathlen] = 0;
-		p->mode = ntohl(mode);
+		p->mode = mode;
 		hashclr(p->sha1);
 		memset(p->parent, 0, 2 * sizeof(struct combine_diff_parent));
 		p->parent[0].status = DIFF_STATUS_MODIFIED;
-		p->parent[0].mode = ntohl(new->ce_mode);
+		p->parent[0].mode = new->ce_mode;
 		hashcpy(p->parent[0].sha1, new->sha1);
 		p->parent[1].status = DIFF_STATUS_MODIFIED;
-		p->parent[1].mode = ntohl(old->ce_mode);
+		p->parent[1].mode = old->ce_mode;
 		hashcpy(p->parent[1].sha1, old->sha1);
 		show_combined_diff(p, 2, revs->dense_combined_merges, revs);
 		free(p);
@@ -569,9 +569,6 @@ static int show_modified(struct rev_info *revs,
 	    !DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
 		return 0;
 
-	mode = ntohl(mode);
-	oldmode = ntohl(oldmode);
-
 	diff_change(&revs->diffopt, oldmode, mode,
 		    old->sha1, sha1, old->name, NULL);
 	return 0;
@@ -628,7 +625,7 @@ static int diff_cache(struct rev_info *revs,
 					   cached, match_missing))
 				break;
 			diff_unmerge(&revs->diffopt, ce->name,
-				     ntohl(ce->ce_mode), ce->sha1);
+				     ce->ce_mode, ce->sha1);
 			break;
 		case 3:
 			diff_unmerge(&revs->diffopt, ce->name,
@@ -664,7 +661,7 @@ static void mark_merge_entries(void)
 		struct cache_entry *ce = active_cache[i];
 		if (!ce_stage(ce))
 			continue;
-		ce->ce_flags |= htons(CE_STAGEMASK);
+		ce->ce_flags |= CE_STAGEMASK;
 	}
 }
 
@@ -723,7 +720,7 @@ int do_diff_cache(const unsigned char *tree_sha1, struct diff_options *opt)
 						   ce->name);
 			last = ce;
 			ce->ce_mode = 0;
-			ce->ce_flags &= ~htons(CE_STAGEMASK);
+			ce->ce_flags &= ~CE_STAGEMASK;
 		}
 		*dst++ = ce;
 	}
diff --git a/dir.c b/dir.c
index 3e345c2..1b9cc7a 100644
--- a/dir.c
+++ b/dir.c
@@ -391,7 +391,7 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len)
 			break;
 		if (endchar == '/')
 			return index_directory;
-		if (!endchar && S_ISGITLINK(ntohl(ce->ce_mode)))
+		if (!endchar && S_ISGITLINK(ce->ce_mode))
 			return index_gitdir;
 	}
 	return index_nonexistent;
diff --git a/entry.c b/entry.c
index 257ab46..44f4b89 100644
--- a/entry.c
+++ b/entry.c
@@ -103,7 +103,7 @@ static int write_entry(struct cache_entry *ce, char *path, const struct checkout
 	int fd;
 	long wrote;
 
-	switch (ntohl(ce->ce_mode) & S_IFMT) {
+	switch (ce->ce_mode & S_IFMT) {
 		char *new;
 		struct strbuf buf;
 		unsigned long size;
@@ -129,7 +129,7 @@ static int write_entry(struct cache_entry *ce, char *path, const struct checkout
 			strcpy(path, ".merge_file_XXXXXX");
 			fd = mkstemp(path);
 		} else
-			fd = create_file(path, ntohl(ce->ce_mode));
+			fd = create_file(path, ce->ce_mode);
 		if (fd < 0) {
 			free(new);
 			return error("git-checkout-index: unable to create file %s (%s)",
@@ -221,7 +221,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *t
 		unlink(path);
 		if (S_ISDIR(st.st_mode)) {
 			/* If it is a gitlink, leave it alone! */
-			if (S_ISGITLINK(ntohl(ce->ce_mode)))
+			if (S_ISGITLINK(ce->ce_mode))
 				return 0;
 			if (!state->force)
 				return error("%s is a directory", path);
diff --git a/merge-index.c b/merge-index.c
index fa719cb..bbb700b 100644
--- a/merge-index.c
+++ b/merge-index.c
@@ -48,7 +48,7 @@ static int merge_entry(int pos, const char *path)
 			break;
 		found++;
 		strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
-		sprintf(ownbuf[stage], "%o", ntohl(ce->ce_mode));
+		sprintf(ownbuf[stage], "%o", ce->ce_mode);
 		arguments[stage] = hexbuf[stage];
 		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < active_nr);
diff --git a/merge-recursive.c b/merge-recursive.c
index b34177d..0db0b3a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -333,7 +333,7 @@ static struct path_list *get_unmerged(void)
 			item->util = xcalloc(1, sizeof(struct stage_data));
 		}
 		e = item->util;
-		e->stages[ce_stage(ce)].mode = ntohl(ce->ce_mode);
+		e->stages[ce_stage(ce)].mode = ce->ce_mode;
 		hashcpy(e->stages[ce_stage(ce)].sha, ce->sha1);
 	}
 
diff --git a/reachable.c b/reachable.c
index 6383401..00f289f 100644
--- a/reachable.c
+++ b/reachable.c
@@ -176,7 +176,7 @@ static void add_cache_refs(struct rev_info *revs)
 		 * lookup_blob() on them, to avoid populating the hash table
 		 * with invalid information
 		 */
-		if (S_ISGITLINK(ntohl(active_cache[i]->ce_mode)))
+		if (S_ISGITLINK(active_cache[i]->ce_mode))
 			continue;
 
 		lookup_blob(active_cache[i]->sha1);
diff --git a/read-cache.c b/read-cache.c
index 7db5588..4414a40 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -30,20 +30,16 @@ struct index_state the_index;
  */
 void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 {
-	ce->ce_ctime.sec = htonl(st->st_ctime);
-	ce->ce_mtime.sec = htonl(st->st_mtime);
-#ifdef USE_NSEC
-	ce->ce_ctime.nsec = htonl(st->st_ctim.tv_nsec);
-	ce->ce_mtime.nsec = htonl(st->st_mtim.tv_nsec);
-#endif
-	ce->ce_dev = htonl(st->st_dev);
-	ce->ce_ino = htonl(st->st_ino);
-	ce->ce_uid = htonl(st->st_uid);
-	ce->ce_gid = htonl(st->st_gid);
-	ce->ce_size = htonl(st->st_size);
+	ce->ce_ctime = st->st_ctime;
+	ce->ce_mtime = st->st_mtime;
+	ce->ce_dev = st->st_dev;
+	ce->ce_ino = st->st_ino;
+	ce->ce_uid = st->st_uid;
+	ce->ce_gid = st->st_gid;
+	ce->ce_size = st->st_size;
 
 	if (assume_unchanged)
-		ce->ce_flags |= htons(CE_VALID);
+		ce->ce_flags |= CE_VALID;
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
@@ -116,7 +112,7 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)
 			return DATA_CHANGED;
 		break;
 	case S_IFDIR:
-		if (S_ISGITLINK(ntohl(ce->ce_mode)))
+		if (S_ISGITLINK(ce->ce_mode))
 			return 0;
 	default:
 		return TYPE_CHANGED;
@@ -128,14 +124,14 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 {
 	unsigned int changed = 0;
 
-	switch (ntohl(ce->ce_mode) & S_IFMT) {
+	switch (ce->ce_mode & S_IFMT) {
 	case S_IFREG:
 		changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;
 		/* We consider only the owner x bit to be relevant for
 		 * "mode changes"
 		 */
 		if (trust_executable_bit &&
-		    (0100 & (ntohl(ce->ce_mode) ^ st->st_mode)))
+		    (0100 & (ce->ce_mode ^ st->st_mode)))
 			changed |= MODE_CHANGED;
 		break;
 	case S_IFLNK:
@@ -152,29 +148,17 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	case 0: /* Special case: unmerged file in index */
 		return MODE_CHANGED | DATA_CHANGED | TYPE_CHANGED;
 	default:
-		die("internal error: ce_mode is %o", ntohl(ce->ce_mode));
+		die("internal error: ce_mode is %o", ce->ce_mode);
 	}
-	if (ce->ce_mtime.sec != htonl(st->st_mtime))
+	if (ce->ce_mtime != (unsigned int) st->st_mtime)
 		changed |= MTIME_CHANGED;
-	if (ce->ce_ctime.sec != htonl(st->st_ctime))
+	if (ce->ce_ctime != (unsigned int) st->st_ctime)
 		changed |= CTIME_CHANGED;
 
-#ifdef USE_NSEC
-	/*
-	 * nsec seems unreliable - not all filesystems support it, so
-	 * as long as it is in the inode cache you get right nsec
-	 * but after it gets flushed, you get zero nsec.
-	 */
-	if (ce->ce_mtime.nsec != htonl(st->st_mtim.tv_nsec))
-		changed |= MTIME_CHANGED;
-	if (ce->ce_ctime.nsec != htonl(st->st_ctim.tv_nsec))
-		changed |= CTIME_CHANGED;
-#endif
-
-	if (ce->ce_uid != htonl(st->st_uid) ||
-	    ce->ce_gid != htonl(st->st_gid))
+	if (ce->ce_uid != (unsigned int) st->st_uid ||
+	    ce->ce_gid != (unsigned int) st->st_gid)
 		changed |= OWNER_CHANGED;
-	if (ce->ce_ino != htonl(st->st_ino))
+	if (ce->ce_ino != (unsigned int) st->st_ino)
 		changed |= INODE_CHANGED;
 
 #ifdef USE_STDEV
@@ -183,11 +167,11 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	 * clients will have different views of what "device"
 	 * the filesystem is on
 	 */
-	if (ce->ce_dev != htonl(st->st_dev))
+	if (ce->ce_dev != (unsigned int) st->st_dev)
 		changed |= INODE_CHANGED;
 #endif
 
-	if (ce->ce_size != htonl(st->st_size))
+	if (ce->ce_size != (unsigned int) st->st_size)
 		changed |= DATA_CHANGED;
 
 	return changed;
@@ -205,7 +189,7 @@ int ie_match_stat(struct index_state *istate,
 	 * If it's marked as always valid in the index, it's
 	 * valid whatever the checked-out copy says.
 	 */
-	if (!ignore_valid && (ce->ce_flags & htons(CE_VALID)))
+	if (!ignore_valid && (ce->ce_flags & CE_VALID))
 		return 0;
 
 	changed = ce_match_stat_basic(ce, st);
@@ -228,7 +212,7 @@ int ie_match_stat(struct index_state *istate,
 	 */
 	if (!changed &&
 	    istate->timestamp &&
-	    istate->timestamp <= ntohl(ce->ce_mtime.sec)) {
+	    istate->timestamp <= ce->ce_mtime) {
 		if (assume_racy_is_modified)
 			changed |= DATA_CHANGED;
 		else
@@ -320,7 +304,7 @@ int index_name_pos(struct index_state *istate, const char *name, int namelen)
 	while (last > first) {
 		int next = (last + first) >> 1;
 		struct cache_entry *ce = istate->cache[next];
-		int cmp = cache_name_compare(name, namelen, ce->name, ntohs(ce->ce_flags));
+		int cmp = cache_name_compare(name, namelen, ce->name, ce->ce_flags);
 		if (!cmp)
 			return next;
 		if (cmp < 0) {
@@ -405,7 +389,7 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
 	size = cache_entry_size(namelen);
 	ce = xcalloc(1, size);
 	memcpy(ce->name, path, namelen);
-	ce->ce_flags = htons(namelen);
+	ce->ce_flags = namelen;
 	fill_stat_cache_info(ce, &st);
 
 	if (trust_executable_bit && has_symlinks)
@@ -616,7 +600,7 @@ static int has_dir_name(struct index_state *istate,
 		}
 		len = slash - name;
 
-		pos = index_name_pos(istate, name, ntohs(create_ce_flags(len, stage)));
+		pos = index_name_pos(istate, name, create_ce_flags(len, stage));
 		if (pos >= 0) {
 			/*
 			 * Found one, but not so fast.  This could
@@ -704,7 +688,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 
 	cache_tree_invalidate_path(istate->cache_tree, ce->name);
-	pos = index_name_pos(istate, ce->name, ntohs(ce->ce_flags));
+	pos = index_name_pos(istate, ce->name, ce->ce_flags);
 
 	/* existing match? Just replace it. */
 	if (pos >= 0) {
@@ -736,7 +720,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 		if (!ok_to_replace)
 			return error("'%s' appears as both a file and as a directory",
 				     ce->name);
-		pos = index_name_pos(istate, ce->name, ntohs(ce->ce_flags));
+		pos = index_name_pos(istate, ce->name, ce->ce_flags);
 		pos = -pos-1;
 	}
 	return pos + 1;
@@ -810,7 +794,7 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 		 * valid again, under "assume unchanged" mode.
 		 */
 		if (ignore_valid && assume_unchanged &&
-		    !(ce->ce_flags & htons(CE_VALID)))
+		    !(ce->ce_flags & CE_VALID))
 			; /* mark this one VALID again */
 		else
 			return ce;
@@ -826,7 +810,6 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	updated = xmalloc(size);
 	memcpy(updated, ce, size);
 	fill_stat_cache_info(updated, &st);
-
 	/*
 	 * If ignore_valid is not set, we should leave CE_VALID bit
 	 * alone.  Otherwise, paths marked with --no-assume-unchanged
@@ -834,8 +817,8 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
 	 * automatically, which is not really what we want.
 	 */
 	if (!ignore_valid && assume_unchanged &&
-	    !(ce->ce_flags & htons(CE_VALID)))
-		updated->ce_flags &= ~htons(CE_VALID);
+	    !(ce->ce_flags & CE_VALID))
+		updated->ce_flags &= ~CE_VALID;
 
 	return updated;
 }
@@ -880,7 +863,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, const char **p
 				/* If we are doing --really-refresh that
 				 * means the index is not valid anymore.
 				 */
-				ce->ce_flags &= ~htons(CE_VALID);
+				ce->ce_flags &= ~CE_VALID;
 				istate->cache_changed = 1;
 			}
 			if (quiet)
@@ -942,16 +925,34 @@ int read_index(struct index_state *istate)
 	return read_index_from(istate, get_index_file());
 }
 
+static void convert_from_disk(struct ondisk_cache_entry *ondisk, struct cache_entry *ce)
+{
+	ce->ce_ctime = ntohl(ondisk->ctime.sec);
+	ce->ce_mtime = ntohl(ondisk->mtime.sec);
+	ce->ce_dev   = ntohl(ondisk->dev);
+	ce->ce_ino   = ntohl(ondisk->ino);
+	ce->ce_mode  = ntohl(ondisk->mode);
+	ce->ce_uid   = ntohl(ondisk->uid);
+	ce->ce_gid   = ntohl(ondisk->gid);
+	ce->ce_size  = ntohl(ondisk->size);
+	/* On-disk flags are just 16 bits */
+	ce->ce_flags = ntohs(ondisk->flags);
+	hashcpy(ce->sha1, ondisk->sha1);
+	memcpy(ce->name, ondisk->name, ce_namelen(ce)+1);
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
 	int fd, i;
 	struct stat st;
-	unsigned long offset;
+	unsigned long src_offset, dst_offset;
 	struct cache_header *hdr;
+	void *mmap;
+	size_t mmap_size;
 
 	errno = EBUSY;
-	if (istate->mmap)
+	if (istate->alloc)
 		return istate->cache_nr;
 
 	errno = ENOENT;
@@ -967,31 +968,47 @@ int read_index_from(struct index_state *istate, const char *path)
 		die("cannot stat the open index (%s)", strerror(errno));
 
 	errno = EINVAL;
-	istate->mmap_size = xsize_t(st.st_size);
-	if (istate->mmap_size < sizeof(struct cache_header) + 20)
+	mmap_size = xsize_t(st.st_size);
+	if (mmap_size < sizeof(struct cache_header) + 20)
 		die("index file smaller than expected");
 
-	istate->mmap = xmmap(NULL, istate->mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
 	close(fd);
+	if (mmap == MAP_FAILED)
+		die("unable to map index file");
 
-	hdr = istate->mmap;
-	if (verify_hdr(hdr, istate->mmap_size) < 0)
+	hdr = mmap;
+	if (verify_hdr(hdr, mmap_size) < 0)
 		goto unmap;
 
 	istate->cache_nr = ntohl(hdr->hdr_entries);
 	istate->cache_alloc = alloc_nr(istate->cache_nr);
 	istate->cache = xcalloc(istate->cache_alloc, sizeof(struct cache_entry *));
 
-	offset = sizeof(*hdr);
+	/*
+	 * The disk format is actually larger than the in-memory format,
+	 * due to space for nsec etc, so even though the in-memory one
+	 * has room for a few  more flags, we can allocate using the same
+	 * index size
+	 */
+	istate->alloc = xmalloc(mmap_size);
+
+	src_offset = sizeof(*hdr);
+	dst_offset = 0;
 	for (i = 0; i < istate->cache_nr; i++) {
+		struct ondisk_cache_entry *disk_ce;
 		struct cache_entry *ce;
 
-		ce = (struct cache_entry *)((char *)(istate->mmap) + offset);
-		offset = offset + ce_size(ce);
+		disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset);
+		ce = (struct cache_entry *)((char *)istate->alloc + dst_offset);
+		convert_from_disk(disk_ce, ce);
 		istate->cache[i] = ce;
+
+		src_offset += ondisk_ce_size(ce);
+		dst_offset += ce_size(ce);
 	}
 	istate->timestamp = st.st_mtime;
-	while (offset <= istate->mmap_size - 20 - 8) {
+	while (src_offset <= mmap_size - 20 - 8) {
 		/* After an array of active_nr index entries,
 		 * there can be arbitrary number of extended
 		 * sections, each of which is prefixed with
@@ -999,40 +1016,36 @@ int read_index_from(struct index_state *istate, const char *path)
 		 * in 4-byte network byte order.
 		 */
 		unsigned long extsize;
-		memcpy(&extsize, (char *)(istate->mmap) + offset + 4, 4);
+		memcpy(&extsize, (char *)mmap + src_offset + 4, 4);
 		extsize = ntohl(extsize);
 		if (read_index_extension(istate,
-					 ((const char *) (istate->mmap)) + offset,
-					 (char *) (istate->mmap) + offset + 8,
+					 (const char *) mmap + src_offset,
+					 (char *) mmap + src_offset + 8,
 					 extsize) < 0)
 			goto unmap;
-		offset += 8;
-		offset += extsize;
+		src_offset += 8;
+		src_offset += extsize;
 	}
+	munmap(mmap, mmap_size);
 	return istate->cache_nr;
 
 unmap:
-	munmap(istate->mmap, istate->mmap_size);
+	munmap(mmap, mmap_size);
 	errno = EINVAL;
 	die("index file corrupt");
 }
 
 int discard_index(struct index_state *istate)
 {
-	int ret;
-
 	istate->cache_nr = 0;
 	istate->cache_changed = 0;
 	istate->timestamp = 0;
 	cache_tree_free(&(istate->cache_tree));
-	if (istate->mmap == NULL)
-		return 0;
-	ret = munmap(istate->mmap, istate->mmap_size);
-	istate->mmap = NULL;
-	istate->mmap_size = 0;
+	free(istate->alloc);
+	istate->alloc = NULL;
 
 	/* no need to throw away allocated active_cache */
-	return ret;
+	return 0;
 }
 
 #define WRITE_BUFFER_SIZE 8192
@@ -1148,6 +1161,28 @@ static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
 	}
 }
 
+static int ce_write_entry(SHA_CTX *c, int fd, struct cache_entry *ce)
+{
+	int size = ondisk_ce_size(ce);
+	struct ondisk_cache_entry *ondisk = xcalloc(1, size);
+
+	ondisk->ctime.sec = htonl(ce->ce_ctime);
+	ondisk->ctime.nsec = 0;
+	ondisk->mtime.sec = htonl(ce->ce_mtime);
+	ondisk->mtime.nsec = 0;
+	ondisk->dev  = htonl(ce->ce_dev);
+	ondisk->ino  = htonl(ce->ce_ino);
+	ondisk->mode = htonl(ce->ce_mode);
+	ondisk->uid  = htonl(ce->ce_uid);
+	ondisk->gid  = htonl(ce->ce_gid);
+	ondisk->size = htonl(ce->ce_size);
+	hashcpy(ondisk->sha1, ce->sha1);
+	ondisk->flags = htons(ce->ce_flags);
+	memcpy(ondisk->name, ce->name, ce_namelen(ce));
+
+	return ce_write(c, fd, ondisk, size);
+}
+
 int write_index(struct index_state *istate, int newfd)
 {
 	SHA_CTX c;
@@ -1173,9 +1208,9 @@ int write_index(struct index_state *istate, int newfd)
 		if (!ce->ce_mode)
 			continue;
 		if (istate->timestamp &&
-		    istate->timestamp <= ntohl(ce->ce_mtime.sec))
+		    istate->timestamp <= ce->ce_mtime)
 			ce_smudge_racily_clean_entry(ce);
-		if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)
+		if (ce_write_entry(&c, newfd, ce) < 0)
 			return -1;
 	}
 
diff --git a/sha1_name.c b/sha1_name.c
index 13e1164..be8489e 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -695,7 +695,7 @@ int get_sha1_with_mode(const char *name, unsigned char *sha1, unsigned *mode)
 				break;
 			if (ce_stage(ce) == stage) {
 				hashcpy(sha1, ce->sha1);
-				*mode = ntohl(ce->ce_mode);
+				*mode = ce->ce_mode;
 				return 0;
 			}
 			pos++;
diff --git a/tree.c b/tree.c
index 8c0819f..87708ef 100644
--- a/tree.c
+++ b/tree.c
@@ -142,8 +142,8 @@ static int cmp_cache_name_compare(const void *a_, const void *b_)
 
 	ce1 = *((const struct cache_entry **)a_);
 	ce2 = *((const struct cache_entry **)b_);
-	return cache_name_compare(ce1->name, ntohs(ce1->ce_flags),
-				  ce2->name, ntohs(ce2->ce_flags));
+	return cache_name_compare(ce1->name, ce1->ce_flags,
+				  ce2->name, ce2->ce_flags);
 }
 
 int read_tree(struct tree *tree, int stage, const char **match)
diff --git a/unpack-trees.c b/unpack-trees.c
index aa2513e..9205320 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -289,7 +289,7 @@ static struct checkout state;
 static void check_updates(struct cache_entry **src, int nr,
 			struct unpack_trees_options *o)
 {
-	unsigned short mask = htons(CE_UPDATE);
+	unsigned short mask = CE_UPDATE;
 	unsigned cnt = 0, total = 0;
 	struct progress *progress = NULL;
 	char last_symlink[PATH_MAX];
@@ -408,7 +408,7 @@ static void verify_uptodate(struct cache_entry *ce,
 		 * submodules that are marked to be automatically
 		 * checked out.
 		 */
-		if (S_ISGITLINK(ntohl(ce->ce_mode)))
+		if (S_ISGITLINK(ce->ce_mode))
 			return;
 		errno = 0;
 	}
@@ -450,7 +450,7 @@ static int verify_clean_subdirectory(struct cache_entry *ce, const char *action,
 	int cnt = 0;
 	unsigned char sha1[20];
 
-	if (S_ISGITLINK(ntohl(ce->ce_mode)) &&
+	if (S_ISGITLINK(ce->ce_mode) &&
 	    resolve_gitlink_ref(ce->name, "HEAD", sha1) == 0) {
 		/* If we are not going to update the submodule, then
 		 * we don't care.
@@ -580,7 +580,7 @@ static void verify_absent(struct cache_entry *ce, const char *action,
 static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
 		struct unpack_trees_options *o)
 {
-	merge->ce_flags |= htons(CE_UPDATE);
+	merge->ce_flags |= CE_UPDATE;
 	if (old) {
 		/*
 		 * See if we can re-use the old CE directly?
@@ -601,7 +601,7 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
 		invalidate_ce_path(merge);
 	}
 
-	merge->ce_flags &= ~htons(CE_STAGEMASK);
+	merge->ce_flags &= ~CE_STAGEMASK;
 	add_cache_entry(merge, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 	return 1;
 }
@@ -634,7 +634,7 @@ static void show_stage_entry(FILE *o,
 	else
 		fprintf(o, "%s%06o %s %d\t%s\n",
 			label,
-			ntohl(ce->ce_mode),
+			ce->ce_mode,
 			sha1_to_hex(ce->sha1),
 			ce_stage(ce),
 			ce->name);
@@ -920,7 +920,7 @@ int oneway_merge(struct cache_entry **src,
 			struct stat st;
 			if (lstat(old->name, &st) ||
 			    ce_match_stat(old, &st, CE_MATCH_IGNORE_VALID))
-				old->ce_flags |= htons(CE_UPDATE);
+				old->ce_flags |= CE_UPDATE;
 		}
 		return keep_entry(old, o);
 	}

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 19:39               ` Linus Torvalds
@ 2008-01-14 20:08                 ` Junio C Hamano
  2008-01-14 21:00                   ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14 20:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, 14 Jan 2008, Linus Torvalds wrote:
>> 
>> So I think this patch is good, but I think it would be even better if we 
>> just bit the bullet and started looking at having a different in-memory 
>> representation from the on-disk one.
>
> Ok, so here's a possible patch.
>
> It passes all the tests for me, and looks fairly ok, but it's also a bit 
> big.
>
> What makes it big is that I made the in-memory format be in host order, so 
> that we can remove a *lot* of the "htonl/ntohl" switcheroo, and do it just 
> on index file read/write.
>
> The nice thing about this patch is that it would make it a lot easier to 
> do any index handling changes,  because it makes a clear difference 
> between the on-disk and the in-memory formats.
>
> I realize that the patch looks big (195 lines inserted and 148 lines 
> removed), but *most* of the lines are literally those ntohl() 
> simplifications, ie stuff like
>
> 	-       if (S_ISGITLINK(ntohl(ce->ce_mode))) {
> 	+       if (S_ISGITLINK(ce->ce_mode)) {
>
> so while it adds lines (for the "convert from disk" and "convert to disk" 
> format conversions), in  many ways it really simplifies the source code 
> too.
>
> Comments?
>
> This is on top of current master, so it's *before* junios thing that adds 
> a CE_UPTODATE.
>
> With this, the high 16 bits of "ce_flags" are in-memory only, so you could 
> just make CE_UPTODATE be 0x10000, and it automatically ends up never being 
> written to disk (and always "reads" as zero).
>
> 		Linus

> diff --git a/cache.h b/cache.h
> index 39331c2..0aed11e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -94,17 +94,31 @@ struct cache_time {
>   * We save the fields in big-endian order to allow using the
>   * index file over NFS transparently.
>   */
> +struct ondisk_cache_entry {
> +	struct cache_time ctime;
> +	struct cache_time mtime;
> +	unsigned int dev;
> +	unsigned int ino;
> +	unsigned int mode;
> +	unsigned int uid;
> +	unsigned int gid;
> +	unsigned int size;
> +	unsigned char sha1[20];
> +	unsigned short flags;
> +	char name[FLEX_ARRAY]; /* more */
> +};
> +
>  struct cache_entry {
> -	struct cache_time ce_ctime;
> -	struct cache_time ce_mtime;
> +	unsigned int ce_ctime;
> +	unsigned int ce_mtime;
>  	unsigned int ce_dev;
>  	unsigned int ce_ino;
>  	unsigned int ce_mode;
>  	unsigned int ce_uid;
>  	unsigned int ce_gid;
>  	unsigned int ce_size;
> +	unsigned int ce_flags;
>  	unsigned char sha1[20];
> -	unsigned short ce_flags;
>  	char name[FLEX_ARRAY]; /* more */
>  };

If we are using different types anyway, we might want to start
using time_t (a worse alternative is ulong which we use for
timestamps everywhere else, which we probably want to convert to
time_t as well).

Is there still a reason to insist that ce_flags should be a
single field that is multi-purposed for storing stage, namelen
and other flags?  Wouldn't the code become even simpler and
safer if we separated them into individual fields?  For example,
a piece like this:

@@ -2388,7 +2388,7 @@ static void add_index_file(const char *path, unsigned mode, void *buf, unsigned
 	ce = xcalloc(1, ce_size);
 	memcpy(ce->name, path, namelen);
 	ce->ce_mode = create_ce_mode(mode);
-	ce->ce_flags = htons(namelen);
+	ce->ce_flags = namelen;
 	if (S_ISGITLINK(mode)) {
 		const char *s = buf;
 
still has that "names longer than 4096 bytes go unchecked,
corrupting stage information" issue.

+static void convert_from_disk(struct ondisk_cache_entry *ondisk, struct cache_entry *ce)
+{
+	ce->ce_ctime = ntohl(ondisk->ctime.sec);
+	ce->ce_mtime = ntohl(ondisk->mtime.sec);
+	ce->ce_dev   = ntohl(ondisk->dev);
+	ce->ce_ino   = ntohl(ondisk->ino);
+	ce->ce_mode  = ntohl(ondisk->mode);
+	ce->ce_uid   = ntohl(ondisk->uid);
+	ce->ce_gid   = ntohl(ondisk->gid);
+	ce->ce_size  = ntohl(ondisk->size);
+	/* On-disk flags are just 16 bits */
+	ce->ce_flags = ntohs(ondisk->flags);
+	hashcpy(ce->sha1, ondisk->sha1);
+	memcpy(ce->name, ondisk->name, ce_namelen(ce)+1);
+}

I presume that the fix to handle names that are longer than 4096
bytes naturally fits here.  We can make the low 12-bits of
ondisk->ce_flags all 1 for such names and we actually
count the strlen to populate ce->ce_namelen.

> +	/*
> +	 * The disk format is actually larger than the in-memory format,
> +	 * due to space for nsec etc, so even though the in-memory one
> +	 * has room for a few  more flags, we can allocate using the same
> +	 * index size
> +	 */
> +	istate->alloc = xmalloc(mmap_size);
> +
> +	src_offset = sizeof(*hdr);
> +	dst_offset = 0;
>  	for (i = 0; i < istate->cache_nr; i++) {
> +		struct ondisk_cache_entry *disk_ce;
>  		struct cache_entry *ce;
>  
> -		ce = (struct cache_entry *)((char *)(istate->mmap) + offset);
> -		offset = offset + ce_size(ce);
> +		disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset);
> +		ce = (struct cache_entry *)((char *)istate->alloc + dst_offset);
> +		convert_from_disk(disk_ce, ce);
>  		istate->cache[i] = ce;
> +
> +		src_offset += ondisk_ce_size(ce);
> +		dst_offset += ce_size(ce);
>  	}
>  	istate->timestamp = st.st_mtime;

I somehow had this impression that it was a huge deal to you
that we do not have to read and populate each cache entry when
reading from the existing index file, and thought that was the
reason why we mmap and access the fields in network byte order.
If that was my misconception, then I agree this is a good change
to make everything else easier to write and much less error
prone.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 20:08                 ` Junio C Hamano
@ 2008-01-14 21:00                   ` Linus Torvalds
  0 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-14 21:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Kristian H?gsberg



On Mon, 14 Jan 2008, Junio C Hamano wrote:
> 
> If we are using different types anyway, we might want to start
> using time_t (a worse alternative is ulong which we use for
> timestamps everywhere else, which we probably want to convert to
> time_t as well).

Careful.

There are two issues, one trivial one and one important one:
 (a) trivially, right now, the code depends on the fact that the in-memory 
     structure is actually smaller than the on-disk one, to avoid having 
     to estimate the size of the allocation for the in-memory array. That 
     was a matter of gettign a quickly working and efficient patch (we do 
     *not* want to allocate those initial "struct cache_entry" entries one 
     by one, we want to allocate one big block!)

     This should be pretty easy to fix up, by just taking the sizes and 
     number of entries (which we do know) into account of the initial 
     allocation. However, it's made a bit more interesting by the 
     differing alignment of the "name" part (and the fact that we align 
     each individual on-disk and in-memory structure).

 (b) More importantly, the on-disk structures DO NOT CONTAIN the whole 
     stat information! The classic example of this is "ce_size": it's 
     32-bit, but it works even if you have a file that is larger than 32 
     bits in size! It just means that from a stat comparison standpoint, 
     we only compare the low 32 bits!

     This means that if you make "ce_size" be a "loff_t", for example, you 
     still need to then *compare* it in just an "unsigned int",  because 
     the upper bits aren't zero - they are "nonexistent".

that (b) is important, and is why some of the code changed from

	-       if (ce->ce_ino != htonl(st->st_ino))
	+       if (ce->ce_ino != (unsigned int) st->st_ino)

ie note how this didn't just remove the "htonl()", it replaced it by a 
"truncate to 'unsigned int'"!

So the fact that the types aren't necessarily the "native" types is 
actually *important*.

> Is there still a reason to insist that ce_flags should be a
> single field that is multi-purposed for storing stage, namelen
> and other flags?  Wouldn't the code become even simpler and
> safer if we separated them into individual fields?  For example,
> a piece like this:

No reason for that part, except I wanted to make this particular initial 
patch be as minimal as possible.

> I somehow had this impression that it was a huge deal to you
> that we do not have to read and populate each cache entry when
> reading from the existing index file, and thought that was the
> reason why we mmap and access the fields in network byte order.
> If that was my misconception, then I agree this is a good change
> to make everything else easier to write and much less error
> prone.

I was a bit worried about it, but I did make sure that the allocation is 
done as one single allocation, and I did time it. Doing a 

	git update-index --refresh

seems to be identical before and after, so the costs of conversion are 
either very small or are possibly counteracted by the fact that we then 
can avoid the byte-order conversion of individual words less at run-time.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] index: be careful when handling long names
  2008-01-13 23:33                   ` Alex Riesen
@ 2008-01-14 21:03                     ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14 21:03 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Linus Torvalds, Git Mailing List

Alex Riesen <raa.lkml@gmail.com> writes:

> Junio C Hamano, Mon, Jan 14, 2008 00:08:07 +0100:
> ...
>> I would agree that it might overflow the argument limit when
>> this is given to "echo", though.  We cannot do much about it,
>> but you may have cleverer ideas.
>
> I thought about conditionally disabling the test, like it was done
> when the tabs in filenames had to be tested.

Yes, we can and should do that when somebody reports this (or
any other tests) a real issue, as we have done for other tests.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste
  2008-01-13 10:38       ` [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste Junio C Hamano
@ 2008-01-14 21:23         ` しらいしななこ
  2008-01-14 21:54           ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: しらいしななこ @ 2008-01-14 21:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Quoting Junio C Hamano <gitster@pobox.com>:

> When I did 2888605c649ccd423232161186d72c0e6c458a48
> (builtin-commit: fix partial-commit support), I mindlessly cut
> and pasted from builtin-ls-files.c, and included the part that
> was meant to exclude redundant path after "ls-files --with-tree"
> overlayed the HEAD commit on top of the index.  This logic does
> not apply to what git-commit does and should not have been
> copied, even though it would not hurt.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin-commit.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/builtin-commit.c b/builtin-commit.c
> index 6d2ca80..265ba6b 100644
> --- a/builtin-commit.c
> +++ b/builtin-commit.c
> @@ -156,8 +156,6 @@ static int list_paths(struct path_list *list, const char *with_tree,
>  
>  	for (i = 0; i < active_nr; i++) {
>  		struct cache_entry *ce = active_cache[i];
> -		if (ce->ce_flags & htons(CE_UPDATE))
> -			continue;
>  		if (!pathspec_match(pattern, m, ce->name, 0))
>  			continue;
>  		path_list_insert(ce->name, list);

I think I may be wrong, but is this change really correct?

You are calling overlay_tree_on_cache() which does use CE_UPDATE flag to mark
duplicate entries.  And that is the same algorithm as used when git-ls-files
is called with its --with-tree option.  I think this if statement is not
mindless but is the right thing to have the same logic as you have in
git-ls-files.

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

----------------------------------------------------------------------
Finally - A spam blocker that actually works.
http://www.bluebottle.com/tag/4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste
  2008-01-14 21:23         ` しらいしななこ
@ 2008-01-14 21:54           ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14 21:54 UTC (permalink / raw)
  To: しらいしななこ; +Cc: git

しらいしななこ  <nanako3@bluebottle.com> writes:

> You are calling overlay_tree_on_cache() which does use CE_UPDATE flag to mark
> duplicate entries.  And that is the same algorithm as used when git-ls-files
> is called with its --with-tree option.  I think this if statement is not
> mindless but is the right thing to have the same logic as you have in
> git-ls-files.

You are right and I was stupid.

Because the pathname ce->name is given to path_list_insert()
which does not allow duplicates, there is no breakage either way
from the correctness point of view in this codepath, unlike the
one in ls-files.  But avoiding unnecessary processing with a
single bit check is certainly better.

Will revert, but I first have to go find a brown paper bag.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  1:46 ` Linus Torvalds
  2008-01-13  4:04   ` Linus Torvalds
@ 2008-01-14 23:15   ` Kristian Høgsberg
  2008-01-14 23:48     ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Kristian Høgsberg @ 2008-01-14 23:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Sat, 2008-01-12 at 17:46 -0800, Linus Torvalds wrote:

> HOWEVER. When that logic was converted from that shell-script into a 
> builtin-commit.c, that conversion was not done correctly. The old "git 
> read-tree -i -m" was not translated as a "unpack_trees()" call, but as 
> this in prepare_index():
> 
> 	discard_cache()
> 	..
> 	tree = parse_tree_indirect(head_sha1);
> 	..
> 	read_tree(tree, 0, NULL)
> 
> which is very wrong, because it replaces the old index entirely, and 
> doesn't do that stat information merging.
> 
> As a result, the index that is created by read-tree is totally bogus in 
> the stat cache, and yes, everything will have to be re-computed.
> 
> Kristian?

Sorry for being late to the game, and yes, it's a bug I introduced with
the rewrite.  When doing the rewrite I was a bit puzzled by the

  git-read-tree --index-output="$TMP_INDEX" -i -m HEAD

part of the shell script.  I carried a FIXME around in the patch for a
while, as can be seen here:

  http://marc.info/?l=git&m=118478660425992&w=2

since I couldn't figure out what the difference in behavior was between
just using read_tree(), which did exactly what I wanted and the more
complicated unpack_tree().  I guess it fell through the cracks,
especially since it never caused the test suite to fail :/

Kristian

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-13  4:04   ` Linus Torvalds
  2008-01-13  5:38     ` Daniel Barkalow
  2008-01-13  8:12     ` Junio C Hamano
@ 2008-01-14 23:46     ` Kristian Høgsberg
  2 siblings, 0 replies; 33+ messages in thread
From: Kristian Høgsberg @ 2008-01-14 23:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Sat, 2008-01-12 at 20:04 -0800, Linus Torvalds wrote:

> It makes builtin-commit.c use the same logic that "git read-tree -i -m" 
> does (which is what the old shell script did), and it seems to pass the 
> test-suite, and it looks pretty obvious.
> 
> It also brings down the number of open/mmap/munmap/close calls to where it 
> should be, although it still does *way* too many "lstat()" operations (ie 
> it does 4*lstat for each file in the index - one more than the 
> non-filename one does).
> 
> With that fixed, performance is also roughly where it should be (ie the 
> 17-18s for the cold-cache case), because it no longer needs to rehash all 
> the files!
> 
> HOWEVER. This was just a quick hack, and while it all looks sane, this is 
> some damn core code. Somebody else should double- and triple-check this.

I took a look too, and it looks to me like the it's the exact same code
path in builtin-read-tree.c that the old

        git read-tree --index-output="$TMP_INDEX" -i -m HEAD

part of the shell script would trigger.  So yes, this look like the
right fix to me.

Signed-off-by: Kristian Høgsberg <krh@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 23:15   ` Kristian Høgsberg
@ 2008-01-14 23:48     ` Junio C Hamano
  2008-01-14 23:53       ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2008-01-14 23:48 UTC (permalink / raw)
  To: Kristian Høgsberg; +Cc: Linus Torvalds, Git Mailing List

Kristian Høgsberg <krh@redhat.com> writes:

> ...  I guess it fell through the cracks,
> especially since it never caused the test suite to fail :/

Yeah, the breakage was not about the correctness, and because I
almost never do partial commit I did not notice it until Linus
brought it up.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14 23:48     ` Junio C Hamano
@ 2008-01-14 23:53       ` Linus Torvalds
  0 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-01-14 23:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kristian Høgsberg, Git Mailing List

On Mon, 14 Jan 2008, Junio C Hamano wrote:
> 
> Yeah, the breakage was not about the correctness, and because I
> almost never do partial commit I did not notice it until Linus
> brought it up.

I wouldn't have noticed it either, if it wasn't for the fact that my 
kernel tree was out-of-cache for other testing reasons. When cached, the 
difference was still quite noticeable, but I would probably not have 
noticed the difference between half a second and a second and a half.

But a commit that took over a minute due to IO was very noticeable 
indeed..

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-14  1:00           ` performance problem: "git commit filename" Junio C Hamano
  2008-01-14 17:07             ` Linus Torvalds
@ 2008-01-15  0:18             ` Linus Torvalds
  2008-01-15  1:13               ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-01-15  0:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Kristian H?gsberg

On Sun, 13 Jan 2008, Junio C Hamano wrote:
> 
> I've reworked the patch, and in the kernel repository, a
> single-path commit after touching that path now calls 23k
> lstat(2).  It used to call 46k lstat(2) after your fix.

Hmm. This part of it looks incorrect:

> diff --git a/diff.c b/diff.c
> index b18c140..62d0c06 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1510,6 +1510,10 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
>  	if (pos < 0)
>  		return 0;
>  	ce = active_cache[pos];
> +
> +	if (ce_uptodate(ce))
> +		return 1;
> +
>  	if ((lstat(name, &st) < 0) ||
>  	    !S_ISREG(st.st_mode) || /* careful! */
>  	    ce_match_stat(ce, &st, 0) ||

Isn't this wrong? I think it also needs to check that ce->sha1 matches the 
right SHA1, because even if the lstat() information may be fine, if the 
SHA1 doesn't match what we want, we still shouldn't use the checked-out 
copy, of course.

The old code continues with a

	   hashcmp(sha1, ce->sha1))
		return 0;

in that if-statement that is partially visible in the context, and it's 
that hashcmp() that got incorrectly cut off from the logic.

(Of course, maybe we never call this function unless we've already checked 
that the cache-entry SHA1 matches, but if so, that subsequent hashcmp 
should just be removed instead).

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: performance problem: "git commit filename"
  2008-01-15  0:18             ` Linus Torvalds
@ 2008-01-15  1:13               ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-01-15  1:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Kristian H?gsberg

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, 13 Jan 2008, Junio C Hamano wrote:
>> 
>> I've reworked the patch, and in the kernel repository, a
>> single-path commit after touching that path now calls 23k
>> lstat(2).  It used to call 46k lstat(2) after your fix.
>
> Hmm. This part of it looks incorrect:
>
>> diff --git a/diff.c b/diff.c
>> index b18c140..62d0c06 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1510,6 +1510,10 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
>>  	if (pos < 0)
>>  		return 0;
>>  	ce = active_cache[pos];
>> +
>> +	if (ce_uptodate(ce))
>> +		return 1;
>> +
>>  	if ((lstat(name, &st) < 0) ||
>>  	    !S_ISREG(st.st_mode) || /* careful! */
>>  	    ce_match_stat(ce, &st, 0) ||
>
> Isn't this wrong?

You are right.  We call this with sha1 that may not necessarily
be the same as ce->sha1.

The code should probably be something like:

	ce = active_cache[pos];

	/*
         * Even if ce matches the work tree, it is not what we can
	 * reuse for sha1, if the hash is different or not a
         * regular blob.
         */
	if (hashcmp(sha1, ce->sha1) || !S_ISREG(ntohl(ce->st_mode))
		return 0;
	/*
         * Does ce actually match the work tree?  If so we can reuse.
         */
	if (ce_uptodate(ce) ||
	    (!lstat(name, &st) && !ce_match_stat(ce, &st, 0)))
		return 1;
	return 0;

The expression inside the latter if () condition should probably
be the new abstraction at the level of ce_modified().

Currently ce_modified() assumes that lstat(2) is cheap and the
callers have called it on paths they are interested in already.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2008-01-15  1:14 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-12 22:46 performance problem: "git commit filename" Linus Torvalds
2008-01-13  1:46 ` Linus Torvalds
2008-01-13  4:04   ` Linus Torvalds
2008-01-13  5:38     ` Daniel Barkalow
2008-01-13  8:14       ` Junio C Hamano
2008-01-13 16:57       ` Linus Torvalds
2008-01-13 19:31         ` Daniel Barkalow
2008-01-13  8:12     ` Junio C Hamano
2008-01-13 10:33       ` Junio C Hamano
2008-01-13 10:54         ` [PATCH] builtin-commit.c: do not lstat(2) partially committed paths twice Junio C Hamano
2008-01-13 11:09         ` performance problem: "git commit filename" Junio C Hamano
2008-01-13 17:24         ` Linus Torvalds
2008-01-13 19:39           ` Junio C Hamano
2008-01-13 22:36             ` [PATCH] index: be careful when handling long names Junio C Hamano
2008-01-13 22:53               ` Alex Riesen
2008-01-13 23:08                 ` Junio C Hamano
2008-01-13 23:33                   ` Alex Riesen
2008-01-14 21:03                     ` Junio C Hamano
2008-01-14  1:00           ` performance problem: "git commit filename" Junio C Hamano
2008-01-14 17:07             ` Linus Torvalds
2008-01-14 18:38               ` Junio C Hamano
2008-01-14 19:39               ` Linus Torvalds
2008-01-14 20:08                 ` Junio C Hamano
2008-01-14 21:00                   ` Linus Torvalds
2008-01-15  0:18             ` Linus Torvalds
2008-01-15  1:13               ` Junio C Hamano
2008-01-13 10:38       ` [PATCH] builtin-commit.c: remove useless check added by faulty cut and paste Junio C Hamano
2008-01-14 21:23         ` しらいしななこ
2008-01-14 21:54           ` Junio C Hamano
2008-01-14 23:46     ` performance problem: "git commit filename" Kristian Høgsberg
2008-01-14 23:15   ` Kristian Høgsberg
2008-01-14 23:48     ` Junio C Hamano
2008-01-14 23:53       ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).