* Re: [PATCH v2] parse-opt: migrate builtin-ls-files.
From: Pierre Habouzit @ 2009-01-07 14:46 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Junio C Hamano, git
In-Reply-To: <1231297894-31809-1-git-send-email-vmiklos@frugalware.org>
[-- Attachment #1: Type: text/plain, Size: 746 bytes --]
On Wed, Jan 07, 2009 at 03:11:34AM +0000, Miklos Vajna wrote:
> +static int option_parse_ignored(const struct option *opt,
> + const char *arg, int unset)
> +{
> + struct dir_struct *dir = opt->value;
> +
> + if (unset)
> + dir->show_ignored = 0;
> + else
> + dir->show_ignored = 1;
dir->show_ignored = !unset ?
> +static int option_parse_directory(const struct option *opt,
> + const char *arg, int unset)
> +{
ditto
> +static int option_parse_empty(const struct option *opt,
> + const char *arg, int unset)
> +{
ditto
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: Pierre Habouzit @ 2009-01-07 14:44 UTC (permalink / raw)
To: Kirill Smelkov; +Cc: martin f krafft, git, pasky
In-Reply-To: <20090107112754.GA15158@roro3>
[-- Attachment #1: Type: text/plain, Size: 977 bytes --]
On Wed, Jan 07, 2009 at 11:27:54AM +0000, Kirill Smelkov wrote:
> Martin, thanks for your review.
>
> I'll too reply inline:
>
> On Tue, Jan 06, 2009 at 09:32:03PM +0100, martin f krafft wrote:
> > Thanks, Kirill, for the patches. A couple of comments inline. I hope
> > Petr has a chance to look too.
> >
> > also sprach Kirill Smelkov <kirr@landau.phys.spbu.ru> [2009.01.06.1616 +0100]:
> > > +# isatty FD
> > > +isatty()
> > > +{
> > > + tty -s 0<&$1 || return 1
> > > + return 0
> > > +}
> >
> > You don't need any of the return statements. Functions' return
> > values are the return values of the last commands they execute.
>
> Agree, I'll rework isatty to be just
>
> isatty()
> {
> tty -s 0<&$1
> }
why not test -t 0 ? I'm not sure it's POSIX though.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Pierre Habouzit @ 2009-01-07 14:39 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.DEB.1.00.0901062037250.30769@pacific.mpi-cbg.de>
[-- Attachment #1.1: Type: text/plain, Size: 1832 bytes --]
On Tue, Jan 06, 2009 at 07:40:02PM +0000, Johannes Schindelin wrote:
> Hi,
>
> On Tue, 6 Jan 2009, Pierre Habouzit wrote:
>
> > On jeu, jan 01, 2009 at 04:38:09 +0000, Johannes Schindelin wrote:
> > >
> > > Nothing fancy, really, just a straight-forward implementation of the
> > > heavily under-documented and under-analyzed paience diff algorithm.
> >
> > Btw, what is the status of this series ? I see it neither in pu nor in
> > next. And I would gladly see it included in git.
>
> AFAICT people wanted to be reasonably sure that it is worth the effort.
Fair enough.
> Although I would like to see it in once it is fleshed out -- even if it
> does not meet our usefulness standard -- because people said Git is
> inferior for not providing a patience diff. If we have --patience, we can
> say "but we have it, it's just not useful, check for yourself".
Well I believe it's useful, but maybe the standard algorithm could be
tweaked the way Linus proposes to make the "long" lines weight louder or
so.
> Due to the lines being much longer than 80 characters, this example was
> not useful to me.
I hope the other one was ;)
WRT the leaks, you want to squash the attached patch on the proper
patches of your series (maybe the xdl_free on map.entries could be put
in a hasmap_destroy or similar btw, but valgrind reports no more leaks
in xdiff now). As of timings, with the current implementation, patience
diff is around 30% slower than the default implementation, which is a
bit worse than what I would expect, probably due to the fact that you
had to work around the libxdiff interface I guess.
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #1.2: memory-leaks.diff --]
[-- Type: text/plain, Size: 1120 bytes --]
diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
index d01cbdd..c2ffb03 100644
--- a/xdiff/xpatience.c
+++ b/xdiff/xpatience.c
@@ -203,8 +203,10 @@ static struct entry *find_longest_common_sequence(struct hashmap *map)
}
/* No common unique lines were found */
- if (!longest)
+ if (!longest) {
+ xdl_free(sequence);
return NULL;
+ }
/* Iterate starting at the last element, adjusting the "next" members */
entry = sequence[longest - 1];
@@ -213,6 +215,7 @@ static struct entry *find_longest_common_sequence(struct hashmap *map)
entry->previous->next = entry;
entry = entry->previous;
}
+ xdl_free(sequence);
return entry;
}
@@ -350,6 +353,7 @@ static int patience_diff(mmfile_t *file1, mmfile_t *file2,
env->xdf1.rchg[line1++ - 1] = 1;
while(count2--)
env->xdf2.rchg[line2++ - 1] = 1;
+ xdl_free(map.entries);
return 0;
}
@@ -361,6 +365,7 @@ static int patience_diff(mmfile_t *file1, mmfile_t *file2,
result = fall_back_to_classic_diff(&map,
line1, count1, line2, count2);
+ xdl_free(map.entries);
return result;
}
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply related
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: Bert Wesarg @ 2009-01-07 14:24 UTC (permalink / raw)
To: Kirill Smelkov; +Cc: martin f krafft, git, pasky
In-Reply-To: <20090107112754.GA15158@roro3>
On Wed, Jan 7, 2009 at 12:27, Kirill Smelkov <kirr@landau.phys.spbu.ru> wrote:
> Martin, thanks for your review.
> + # atexit(close(1); wait pager)
> + trap "exec >&-; rm "$_pager_fifo"; rmdir "$_pager_fifo_dir"; wait" EXIT
I think you need to escape the double quotes.
Bert
^ permalink raw reply
* [PATCH/RFC v3 2/2] create_directories() inside entry.c: only check each directory once!
From: Kjetil Barvik @ 2009-01-07 13:24 UTC (permalink / raw)
To: git; +Cc: Linus Torvalds, Kjetil Barvik
In-Reply-To: <1231334689-17135-1-git-send-email-barvik@broadpark.no>
When we do an 'git checkout' after some time we end up in the
'checkout_entry()' function inside entry.c, and from here we call the
'create_directories()' function to make sure the all the directories
exists for the possible new file or entry.
The 'create_directories()' function happily started to check that all
path component exists. This resulted in tons and tons of calls to
lstat() or stat() when we checkout files nested deep inside a
directory.
We try to avoid this by remembering the last checked and possible
newly created directory.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
:100644 100644 768ba38... ec1297f... M cache.h
:100644 100644 aa2ee46... 36d6f98... M entry.c
:100644 100644 28e2759... 0a03e65... M unpack-trees.c
cache.h | 1 +
entry.c | 82 +++++++++++++++++++++++++++++++++++++++++++------------
unpack-trees.c | 1 +
3 files changed, 66 insertions(+), 18 deletions(-)
diff --git a/cache.h b/cache.h
index 768ba3825f3015828381490b0c387177a4f71578..ec1297ff5621cc9eb7fce51cc025f18a030ac9ea 100644
--- a/cache.h
+++ b/cache.h
@@ -718,6 +718,7 @@ struct checkout {
refresh_cache:1;
};
+extern void clear_created_dirs_cache(void);
extern int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath);
#define LSTAT_DIR (1u << 0)
diff --git a/entry.c b/entry.c
index aa2ee46a84033585d8e07a585610c5a697af82c2..36d6f98c1f59a86a5e9c117e1181c1169208168f 100644
--- a/entry.c
+++ b/entry.c
@@ -1,33 +1,67 @@
#include "cache.h"
#include "blob.h"
-static void create_directories(const char *path, const struct checkout *state)
+static char dirs_path[PATH_MAX];
+static int dirs_len = 0;
+
+static inline int
+greatest_common_created_dirs_prefix(int len, const char *name)
{
- int len = strlen(path);
- char *buf = xmalloc(len + 1);
- const char *slash = path;
+ int max_len, match_len = 0, i = 0;
- while ((slash = strchr(slash+1, '/')) != NULL) {
- struct stat st;
- int stat_status;
+ max_len = len < dirs_len ? len : dirs_len;
+ while (i < max_len && name[i] == dirs_path[i]) {
+ if (name[i] == '/') match_len = i;
+ i++;
+ }
+ if (i == dirs_len && len > dirs_len && name[dirs_len] == '/')
+ match_len = dirs_len;
+ return match_len;
+}
- len = slash - path;
- memcpy(buf, path, len);
- buf[len] = 0;
+void clear_created_dirs_cache(void)
+{
+ dirs_path[0] = '\0';
+ dirs_len = 0;
+}
- if (len <= state->base_dir_len)
+static void
+create_directories(int len, const char *path, const struct checkout *state)
+{
+ int i, max_len, last_slash, stat_status;
+ struct stat st;
+
+ /* Check the cache for previously checked or created
+ * directories (and components) within this function. There
+ * is no need to check or re-create directory components more
+ * than once!
+ */
+ max_len = len < PATH_MAX ? len : PATH_MAX;
+ i = last_slash = greatest_common_created_dirs_prefix(max_len, path);
+
+ while (i < max_len) {
+ do {
+ dirs_path[i] = path[i];
+ i++;
+ } while (i < max_len && path[i] != '/');
+ if (i >= max_len)
+ break;
+ last_slash = i;
+ dirs_path[last_slash] = '\0';
+
+ if (last_slash <= state->base_dir_len)
/*
* checkout-index --prefix=<dir>; <dir> is
* allowed to be a symlink to an existing
* directory.
*/
- stat_status = stat(buf, &st);
+ stat_status = stat(dirs_path, &st);
else
/*
* if there currently is a symlink, we would
* want to replace it with a real directory.
*/
- stat_status = lstat(buf, &st);
+ stat_status = lstat(dirs_path, &st);
if (!stat_status && S_ISDIR(st.st_mode))
continue; /* ok, it is already a directory. */
@@ -38,14 +72,20 @@ static void create_directories(const char *path, const struct checkout *state)
* error codepath; we do not care, as we unlink and
* mkdir again in such a case.
*/
- if (mkdir(buf, 0777)) {
+ if (mkdir(dirs_path, 0777)) {
if (errno == EEXIST && state->force &&
- !unlink(buf) && !mkdir(buf, 0777))
+ !unlink(dirs_path) && !mkdir(dirs_path, 0777))
continue;
- die("cannot create directory at %s", buf);
+ die("cannot create directory at %s", dirs_path);
}
}
- free(buf);
+ /* Update the cache of already created/checked directories */
+ if (last_slash > 0 && last_slash < PATH_MAX) {
+ dirs_path[last_slash] = '\0';
+ dirs_len = last_slash;
+ } else {
+ clear_created_dirs_cache();
+ }
}
static void remove_subtree(const char *path)
@@ -55,6 +95,11 @@ static void remove_subtree(const char *path)
char pathbuf[PATH_MAX];
char *name;
+ /* To be utterly safe we invalidate the cache of the
+ * previously created directories.
+ */
+ clear_created_dirs_cache();
+
if (!dir)
die("cannot opendir %s (%s)", path, strerror(errno));
strcpy(pathbuf, path);
@@ -201,6 +246,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *t
memcpy(path, state->base_dir, len);
strcpy(path + len, ce->name);
+ len += ce_namelen(ce);
if (!lstat(path, &st)) {
unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID);
@@ -229,6 +275,6 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *t
return error("unable to unlink old '%s' (%s)", path, strerror(errno));
} else if (state->not_new)
return 0;
- create_directories(path, state);
+ create_directories(len, path, state);
return write_entry(ce, path, state, 0);
}
diff --git a/unpack-trees.c b/unpack-trees.c
index 28e275981a21b033459ef9c7e420cce4bf7e5513..0a03e65f9c9d869ab2d8b3c337f032ff2b8e7b2f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -119,6 +119,7 @@ static int check_updates(struct unpack_trees_options *o)
}
}
+ clear_created_dirs_cache();
for (i = 0; i < index->cache_nr; i++) {
struct cache_entry *ce = index->cache[i];
--
1.6.1.rc1.49.g7f705
^ permalink raw reply related
* [PATCH/RFC v3 1/2] Optimised, faster, more effective symlink/directory detection
From: Kjetil Barvik @ 2009-01-07 13:24 UTC (permalink / raw)
To: git; +Cc: Linus Torvalds, Kjetil Barvik
In-Reply-To: <1231334689-17135-1-git-send-email-barvik@broadpark.no>
Changes includes the following:
- The cache functionality is more effective. Previously when A/B/C/D
was in the cache and A/B/C/E/file.c was called for, there was no
match at all from the cache. Now we use the fact that the paths
"A", "A/B" and "A/B/C" is already tested, and we only need to do an
lstat() call on "A/B/C/E".
- We only cache/store the last path regardless of it's type. Since the
cache functionality is always used with alphabetically sorted names
(at least it seams so for me), there is no need to store both the
last symlink-leading path and the last real-directory path. Note
that if the cache is not called with (mostly) alphabetically sorted
names, neither the old, nor this new one, would be very effective.
- We also can cache the fact that a directory does not exist.
Previously we could end up doing lots of lstat() calls for a removed
directory which previously contained lots of files. Since we
already have simplified the cache functionality and only store the
last path (see above), this new functionality was easy to add.
- Previously, when symlink A/B/C/S was cached/stored in the
symlink-leading path, and A/B/C/file.c was called for, it was not
easy to use the fact that we already known that the paths "A", "A/B"
and "A/B/C" is real directories. Since we now only store one single
path (the last one), we also get similar logic for free regarding
the new "non-exsisting-directory-cache".
- Avoid copying the first path components of the name 2 zillions times
when we tests new path components. Since we always cache/store the
last path, we can copy each component as we test those directly into
the cache. Previously we ended up doing a memcpy() for the full
path/name right before each lstat() call, and when updating the
cache for each time we have tested an new path component.
- We also use less memory, that is PATH_MAX bytes less memory on the
stack and PATH_MAX bytes less memory on the heap.
- Introduce a 3rd argument, 'unsigned int track_flags', to the
cache-test function, check_lstat_cache(). This new argument can be
used to tell the cache functionality which types of directories
should be cached.
- Also introduce a 'void clear_lstat_cache(void)' function, which
should be used to clean the cache before usage. If for instance,
you have changed the types of directories which should be cached,
the cache could contain a path which was not wanted.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
:100644 100644 719de8b... 870961e... M builtin-add.c
:100644 100644 a8f75ed... d3d001a... M builtin-apply.c
:100644 100644 5604977... 8907219... M builtin-update-index.c
:100644 100644 231c06d... 768ba38... M cache.h
:100644 100644 ae96c64... c9caa0e... M diff-lib.c
:100644 100644 5a5e781... 2f025c8... M symlinks.c
:100644 100644 54f301d... 28e2759... M unpack-trees.c
builtin-add.c | 1 +
builtin-apply.c | 1 +
builtin-update-index.c | 1 +
cache.h | 22 ++++++++-
diff-lib.c | 1 +
symlinks.c | 123 +++++++++++++++++++++++++++++++-----------------
unpack-trees.c | 5 +-
7 files changed, 107 insertions(+), 47 deletions(-)
diff --git a/builtin-add.c b/builtin-add.c
index 719de8b0f2d2d831f326d948aa18700e5c474950..870961e8ca4e3d6f9333020083d0a232bccd542c 100644
--- a/builtin-add.c
+++ b/builtin-add.c
@@ -225,6 +225,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
argc = parse_options(argc, argv, builtin_add_options,
builtin_add_usage, 0);
+ clear_symlink_cache();
if (patch_interactive)
add_interactive = 1;
if (add_interactive)
diff --git a/builtin-apply.c b/builtin-apply.c
index a8f75ed3ed411d8cf7a3ec9dfefef7407c50f447..d3d001a96be6e502d6338af4467f7c313370d78e 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -3154,6 +3154,7 @@ int cmd_apply(int argc, const char **argv, const char *unused_prefix)
if (apply_default_whitespace)
parse_whitespace_option(apply_default_whitespace);
+ clear_symlink_cache();
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
char *end;
diff --git a/builtin-update-index.c b/builtin-update-index.c
index 560497750586ec61be4e34de6dedd9c307129817..8907219fb9cb438113e29ee17854edb5dd4baa4d 100644
--- a/builtin-update-index.c
+++ b/builtin-update-index.c
@@ -581,6 +581,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
if (entries < 0)
die("cache corrupted");
+ clear_symlink_cache();
for (i = 1 ; i < argc; i++) {
const char *path = argv[i];
const char *p;
diff --git a/cache.h b/cache.h
index 231c06d7726b575f6e522d5b0c0fe43557e8c651..768ba3825f3015828381490b0c387177a4f71578 100644
--- a/cache.h
+++ b/cache.h
@@ -719,7 +719,27 @@ struct checkout {
};
extern int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath);
-extern int has_symlink_leading_path(int len, const char *name);
+
+#define LSTAT_DIR (1u << 0)
+#define LSTAT_NOENT (1u << 1)
+#define LSTAT_SYMLINK (1u << 2)
+#define LSTAT_LSTATERR (1u << 3)
+#define LSTAT_ERR (1u << 4)
+extern unsigned int check_lstat_cache(int len, const char *name,
+ unsigned int track_flags);
+extern void clear_lstat_cache(void);
+static inline unsigned int has_symlink_leading_path(int len, const char *name)
+{
+ return check_lstat_cache(len, name, LSTAT_SYMLINK|LSTAT_DIR) &
+ LSTAT_SYMLINK;
+}
+#define clear_symlink_cache() clear_lstat_cache()
+static inline unsigned int has_symlink_or_noent_leading_path(int len, const char *name)
+{
+ return check_lstat_cache(len, name, LSTAT_SYMLINK|LSTAT_NOENT|LSTAT_DIR) &
+ (LSTAT_SYMLINK|LSTAT_NOENT);
+}
+#define clear_symlink_or_noent_cache() clear_lstat_cache()
extern struct alternate_object_database {
struct alternate_object_database *next;
diff --git a/diff-lib.c b/diff-lib.c
index ae96c64ca209f4df9008198e8a04b160bed618c7..c9caa0e6ef0f4a8ee8b850869ef6d0f52b712385 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -69,6 +69,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
diff_unmerged_stage = 2;
entries = active_nr;
symcache[0] = '\0';
+ clear_symlink_cache();
for (i = 0; i < entries; i++) {
struct stat st;
unsigned int oldmode, newmode;
diff --git a/symlinks.c b/symlinks.c
index 5a5e781a15d7d9cb60797958433eca896b31ec85..2f025c87b0be0e713af7b1400e71284a22c25e9f 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -1,64 +1,99 @@
#include "cache.h"
-struct pathname {
- int len;
- char path[PATH_MAX];
-};
+static char cache_path[PATH_MAX];
+static int cache_len = 0;
+static unsigned int cache_flags = 0;
-/* Return matching pathname prefix length, or zero if not matching */
-static inline int match_pathname(int len, const char *name, struct pathname *match)
+static inline int
+greatest_common_path_cache_prefix(int len, const char *name)
{
- int match_len = match->len;
- return (len > match_len &&
- name[match_len] == '/' &&
- !memcmp(name, match->path, match_len)) ? match_len : 0;
-}
+ int max_len, match_len = 0, i = 0;
-static inline void set_pathname(int len, const char *name, struct pathname *match)
-{
- if (len < PATH_MAX) {
- match->len = len;
- memcpy(match->path, name, len);
- match->path[len] = 0;
+ max_len = len < cache_len ? len : cache_len;
+ while (i < max_len && name[i] == cache_path[i]) {
+ if (name[i] == '/') match_len = i;
+ i++;
}
+ if (i == cache_len && len > cache_len && name[cache_len] == '/')
+ match_len = cache_len;
+ return match_len;
}
-int has_symlink_leading_path(int len, const char *name)
+/*
+ * Check if name 'name' of length 'len' has a symlink leading
+ * component, or if the directory exists and is real, or not.
+ *
+ * To speed up the check, some information is allowed to be cached.
+ * This is indicated by the 'track_flags' argument.
+ */
+unsigned int
+check_lstat_cache(int len, const char *name, unsigned int track_flags)
{
- static struct pathname link, nonlink;
- char path[PATH_MAX];
+ int match_len, last_slash, max_len;
+ unsigned int match_flags, ret_flags, save_flags;
struct stat st;
- char *sp;
- int known_dir;
- /*
- * See if the last known symlink cache matches.
+ /* Check if match from the cache for 2 "excluding" path types.
*/
- if (match_pathname(len, name, &link))
- return 1;
+ match_len = last_slash =
+ greatest_common_path_cache_prefix(len, name);
+ match_flags =
+ cache_flags & track_flags & (LSTAT_NOENT|LSTAT_SYMLINK);
+ if (match_flags && match_len == cache_len)
+ return match_flags;
- /*
- * Get rid of the last known directory part
+ /* Okay, no match from the cache so far, so now we have to
+ * check the rest of the path components.
*/
- known_dir = match_pathname(len, name, &nonlink);
-
- while ((sp = strchr(name + known_dir + 1, '/')) != NULL) {
- int thislen = sp - name ;
- memcpy(path, name, thislen);
- path[thislen] = 0;
+ ret_flags = LSTAT_DIR;
+ max_len = len < PATH_MAX ? len : PATH_MAX;
+ while (match_len < max_len) {
+ do {
+ cache_path[match_len] = name[match_len];
+ match_len++;
+ } while (match_len < max_len && name[match_len] != '/');
+ if (match_len >= max_len)
+ break;
+ last_slash = match_len;
+ cache_path[last_slash] = '\0';
- if (lstat(path, &st))
- return 0;
- if (S_ISDIR(st.st_mode)) {
- set_pathname(thislen, path, &nonlink);
- known_dir = thislen;
+ if (lstat(cache_path, &st)) {
+ ret_flags = LSTAT_LSTATERR;
+ if (errno == ENOENT)
+ ret_flags |= LSTAT_NOENT;
+ } else if (S_ISDIR(st.st_mode)) {
continue;
- }
- if (S_ISLNK(st.st_mode)) {
- set_pathname(thislen, path, &link);
- return 1;
+ } else if (S_ISLNK(st.st_mode)) {
+ ret_flags = LSTAT_SYMLINK;
+ } else {
+ ret_flags = LSTAT_ERR;
}
break;
}
- return 0;
+
+ /* At the end update the cache. Note that max 3 different
+ * path types can be cached for the moment!
+ */
+ save_flags = ret_flags & track_flags &
+ (LSTAT_NOENT|LSTAT_SYMLINK|LSTAT_DIR);
+ if (save_flags && last_slash > 0 && last_slash < PATH_MAX) {
+ cache_path[last_slash] = '\0';
+ cache_len = last_slash;
+ cache_flags = save_flags;
+ } else {
+ clear_lstat_cache();
+ }
+ return ret_flags;
+}
+
+/*
+ * Before usage of the check_lstat_cache() function one should call
+ * clear_lstat_cache() (at an appropriate place) to make sure that the
+ * cache is clean.
+ */
+void clear_lstat_cache(void)
+{
+ cache_path[0] = '\0';
+ cache_len = 0;
+ cache_flags = 0;
}
diff --git a/unpack-trees.c b/unpack-trees.c
index 54f301da67be879c80426bc21776427fdd38c02e..28e275981a21b033459ef9c7e420cce4bf7e5513 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -61,7 +61,7 @@ static void unlink_entry(struct cache_entry *ce)
char *cp, *prev;
char *name = ce->name;
- if (has_symlink_leading_path(ce_namelen(ce), ce->name))
+ if (has_symlink_or_noent_leading_path(ce_namelen(ce), ce->name))
return;
if (unlink(name))
return;
@@ -105,6 +105,7 @@ static int check_updates(struct unpack_trees_options *o)
cnt = 0;
}
+ clear_symlink_or_noent_cache();
for (i = 0; i < index->cache_nr; i++) {
struct cache_entry *ce = index->cache[i];
@@ -584,7 +585,7 @@ static int verify_absent(struct cache_entry *ce, const char *action,
if (o->index_only || o->reset || !o->update)
return 0;
- if (has_symlink_leading_path(ce_namelen(ce), ce->name))
+ if (has_symlink_or_noent_leading_path(ce_namelen(ce), ce->name))
return 0;
if (!lstat(ce->name, &st)) {
--
1.6.1.rc1.49.g7f705
^ permalink raw reply related
* [PATCH/RFC v3 0/2] git checkout: optimise away lots of lstat() calls
From: Kjetil Barvik @ 2009-01-07 13:24 UTC (permalink / raw)
To: git; +Cc: Linus Torvalds, Kjetil Barvik
I have just started to clone some interesting Linux git trees to watch
the development more closely, and therefore also started to use git. I
noticed that 'git checkout' takes some time, and especially that the
'git checkout' command does lots and lots of lstat() calls.
After some more investigation and thinking, I have made 2 patches and
been able to optimise away over 42% of all lstat() calls in some cases
for the 'git checkout' command. I have not tested other git porcelain
commands for reduced lstat() calls, but I would guess that the more
effective 'lstat_cache()' compared to 'has_leading_symlink_cache()',
should also give better numbers in other cases.
Both patches is against git master, and the git 'make test' test suite
still passes after each patch.
To document the improvement, below is some numbers, which compares
before and after the 2 patches. To reproduce the numbers:
- git clone the Linux git tree to be able to get the Linux tags
'v2.6.25' and 'v2.6.27'.
- git checkout -b my-v2.6.27 v2.6.27
- git checkout -b my-v2.6.25 v2.6.25
Then, when the current branch is 'my-v2.6.25' do:
strace -o strace_to27 -T git checkout -q my-v2.6.27
And then you pretty print and collect stats from the 'strace_to27'
file. If someone wants a copy of the strace_stat.pl script, which I
made/used to do the pretty printing, then give me a hint.
Below is the stats/numbers from the current git version (before the 2
patches). Notice that we do an lstat() call on the "arch" directory
over 6000 times!
TOTAL 185151 100.000% OK:165544 NOT: 19607 11.136001 sec 60 usec/call
lstat64 120954 65.327% OK:107013 NOT: 13941 5.388727 sec 45 usec/call
strings 120954 tot 30163 uniq 4.010 /uniq 5.388727 sec 45 usec/call
files 61491 tot 28712 uniq 2.142 /uniq 2.740520 sec 45 usec/call
dirs 45522 tot 1436 uniq 31.701 /uniq 1.994448 sec 44 usec/call
errors 13941 tot 5189 uniq 2.687 /uniq 0.653759 sec 47 usec/call
6297 5.206% OK: 6297 NOT: 0 "arch"
4544 3.757% OK: 4544 NOT: 0 "drivers"
1816 1.501% OK: 1816 NOT: 0 "arch/arm"
1499 1.239% OK: 1499 NOT: 0 "include"
912 0.754% OK: 912 NOT: 0 "arch/powerpc"
764 0.632% OK: 764 NOT: 0 "fs"
746 0.617% OK: 746 NOT: 0 "drivers/net"
662 0.547% OK: 662 NOT: 0 "net"
652 0.539% OK: 325 NOT: 327 "arch/sparc/include"
636 0.526% OK: 636 NOT: 0 "drivers/media"
606 0.501% OK: 606 NOT: 0 "include/linux"
533 0.441% OK: 533 NOT: 0 "arch/sh"
522 0.432% OK: 260 NOT: 262 "arch/powerpc/include"
488 0.403% OK: 243 NOT: 245 "arch/sh/include"
413 0.341% OK: 413 NOT: 0 "arch/sparc"
390 0.322% OK: 390 NOT: 0 "arch/x86"
383 0.317% OK: 383 NOT: 0 "Documentation"
370 0.306% OK: 184 NOT: 186 "arch/ia64/include"
366 0.303% OK: 366 NOT: 0 "drivers/media/video"
348 0.288% OK: 173 NOT: 175 "arch/arm/include"
Here is the stats/numbers after applying the 2 patches. Notice how
nice the top 20 entries list now looks!
TOTAL 133655 100.000% OK:121615 NOT: 12040 10.429999 sec 78 usec/call
lstat64 69603 52.077% OK: 63218 NOT: 6385 3.419920 sec 49 usec/call
strings 69603 tot 30163 uniq 2.308 /uniq 3.419920 sec 49 usec/call
files 61491 tot 28712 uniq 2.142 /uniq 3.034869 sec 49 usec/call
dirs 1727 tot 1164 uniq 1.484 /uniq 0.075681 sec 44 usec/call
errors 6385 tot 5189 uniq 1.230 /uniq 0.309370 sec 48 usec/call
4 0.006% OK: 4 NOT: 0 ".gitignore"
4 0.006% OK: 4 NOT: 0 ".mailmap"
4 0.006% OK: 4 NOT: 0 "CREDITS"
4 0.006% OK: 4 NOT: 0 "Documentation/00-INDEX"
4 0.006% OK: 4 NOT: 0 "Documentation/ABI/testing/sysfs-block"
4 0.006% OK: 4 NOT: 0 "Documentation/ABI/testing/sysfs-firmware-acpi"
4 0.006% OK: 4 NOT: 0 "Documentation/CodingStyle"
4 0.006% OK: 4 NOT: 0 "Documentation/DMA-API.txt"
4 0.006% OK: 4 NOT: 0 "Documentation/DMA-mapping.txt"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/Makefile"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/gadget.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/kernel-api.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/kernel-locking.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/procfs-guide.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/procfs_example.c"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/rapidio.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/s390-drivers.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/uio-howto.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/videobook.tmpl"
4 0.006% OK: 4 NOT: 0 "Documentation/DocBook/writing_usb_driver.tmpl"
Note that the overall used system time as recorded from 'strace -T',
does not drop so much that the reduced lstat() time should indicate
for _this_ particular test run. This is because now each unlink()
call takes much more time, at least for me on an slow ide disk (using
ext3) on a laptop.
A simple test gives me an overall improvement of 2.937 seconds: real
time drops from 28.195s (best of 5 runs with 'time git ...'), to
25.381s (best of 5 runs).
I have also noticed that inside the unlink_entry() function in file
unpack-trees.c, one could often end up calling rmdir() lots and lots
of times on none-empty directories. Maybe one should schedule each
directory for removal by an appropriate function, and then at the end
call a new function to clean all the directories at once?
Comments?
----
Changes since v2:
= inside patch 1/2
[[Following is updates after comments from Linus Torvalds - Thanks!]]
- simplified the interface: introduce 2 static inline functions
has_symlink_leading_path() and has_symlink_or_noent_leading_path()
- similar, introduce 2 defines: clear_symlink_cache() and
clear_symlink_or_noent_cache()
- reorganise the patches: previous patch 2/4 and 4/4 is put into
this one
- update the commit message accordingly
- keep the symlinks.c file
= inside patch 2/2
- was patch 3/4 in v2
- always null terminate the dirs_path array
- update the patch with some of the comments regarding patch 1/4
from Junio C Hamano
Changes since v1:
= inside patch 1/4
- always null terminate the cache_path array
- added a paragraph to the commit message for this patch
- small cleanup on 2 comments, and a small line indentation change
[[Following is updates after comments from Junio C Hamano - Thanks!]]
- removed the 'static inline update_path_cache()' function
- replaced the else-part of the above inline function with a call
to the 'clear_lstat_cache()' function.
- deleted the '|| errno == ENOTDIR' part inside the big while-loop
inside check_lstat_cache(), and updated the named BIT-fields
accordingly
= inside patch 2/4
- moved a paragraph out from the commit message for this patch and
into this cover-letter
[[Following is updates after comments from Junio C Hamano - Thanks!]]
- Removed the '|LSTAT_NOTDIR' part from the call to lstat_cache()
inside function 'check_removed()' inside file diff-lib.c
Kjetil Barvik (2):
Optimised, faster, more effective symlink/directory detection
create_directories() inside entry.c: only check each directory once!
builtin-add.c | 1 +
builtin-apply.c | 1 +
builtin-update-index.c | 1 +
cache.h | 23 +++++++++-
diff-lib.c | 1 +
entry.c | 82 +++++++++++++++++++++++++-------
symlinks.c | 123 +++++++++++++++++++++++++++++++-----------------
unpack-trees.c | 6 ++-
8 files changed, 173 insertions(+), 65 deletions(-)
^ permalink raw reply
* Re: Problems getting rid of large files using git-filter-branch
From: Johannes Schindelin @ 2009-01-07 12:45 UTC (permalink / raw)
To: Øyvind Harboe; +Cc: git
In-Reply-To: <c09652430901070215p436db79boae4c56bfa1afbc1a@mail.gmail.com>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 162 bytes --]
Hi,
On Wed, 7 Jan 2009, Øyvind Harboe wrote:
> How would git behave if it ran out of memory?
Something like
fatal: Out of memory, malloc failed
Ciao,
Dscho
^ permalink raw reply
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Giuseppe Bilotta @ 2009-01-07 12:30 UTC (permalink / raw)
To: git
In-Reply-To: <20090107042518.GB24735@gnu.kitenet.net>
On Wednesday 07 January 2009 05:25, Joey Hess wrote:
> The rel=vcs microformat allows a web page to indicate the locations of
> repositories related to it in a machine-parseable manner.
> (See http://kitenet.net/~joey/rfc/rel-vcs/)
Interesting idea, I like it. However, I see a problem in the proposed
implementation versus the spec. According to the spec:
"""
The "title" is optional, but recommended if there are multiple, different
repositories linked to on one page. It is a human-readable description of the
repository.
[...]
If there are multiple repositories listed, without titles, tools should assume
they are different repositories.
"""
In this patch you do NOT add titles to the rel=vcs links, which means that
everything works fine only if there is a single URL for each project. If a
project has different URLs, it's going to appear multiple times as _different_
projects to a spec-compliant reader.
A possible solution would be to make @git_url_list into a map keyed by the
project name and having the description and repo URL(s) as values.
Since there is the possibility of different projects having the same
description (e.g. the default one), the link title could be composed of
"$project - $description" rather than simply $description.
Note that both in summary and in project list view you already retrieve the
description, so there are no additional disk hits.
> Make gitweb use the microformat in the header of pages it generates,
> if it has been configured with project url information in any of the usual
> ways.
>
> Since getting the urls can require hitting disk, I avoided putting the
> microformat on *every* page gitweb generates. Just put it on the project
> summary page, the project list page, and the forks list page.
> The first of these already looks up the urls, so adding the microformat was
> free. There is a small overhead in including the microformat on the
> latter two pages, but getting the project descriptions for those pages
> already incurs a similar overhead, and the ability to get every repo url
> in one place seems worthwhile.
>
> This changes git_get_project_description() to not check wantarray, and only
> return in list context -- the only way it is used AFAICS.
I assume you mean git_get_project_url_list()?
>
> Signed-off-by: Joey Hess <joey@gnu.kitenet.net>
> ---
> gitweb/gitweb.perl | 38 ++++++++++++++++++++++++++------------
> 1 files changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 99f71b4..3f8a228 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -789,6 +789,9 @@ $git_dir = "$projectroot/$project" if $project;
> our @snapshot_fmts = gitweb_get_feature('snapshot');
> @snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
>
> +# populated later with git urls for the project
> +our @git_url_list;
> +
> # dispatch
> if (!defined $action) {
> if (defined $hash) {
> @@ -2100,17 +2103,22 @@ sub git_show_project_tagcloud {
> }
>
> sub git_get_project_url_list {
> + # use per project git URL list in $projectroot/$path/cloneurl
> + # or make project git URL from git base URL and project name
> my $path = shift;
>
> + my @ret;
> +
> $git_dir = "$projectroot/$path";
> - open my $fd, "$git_dir/cloneurl"
> - or return wantarray ?
> - @{ config_to_multi(git_get_project_config('url')) } :
> - config_to_multi(git_get_project_config('url'));
> - my @git_project_url_list = map { chomp; $_ } <$fd>;
> - close $fd;
> + if (open my $fd, "$git_dir/cloneurl") {
> + @ret = map { chomp; $_ } <$fd>;
> + close $fd;
> + }
> + else {
Coding style: } else {
> + @ret = @{ config_to_multi(git_get_project_config('url')) };
> + }
>
> - return wantarray ? @git_project_url_list : \@git_project_url_list;
> + return @ret ? @ret : map { "$_/$project" } @git_base_url_list;
> }
>
> sub git_get_projects_list {
> @@ -2953,6 +2961,10 @@ EOF
> print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n);
> }
>
> + foreach my $url (@git_url_list) {
> + print qq{<link rel="vcs" type="git" href="$url" />\n};
> + }
> +
> print "</head>\n" .
> "<body>\n";
>
> @@ -4380,6 +4392,8 @@ sub git_project_list {
> die_error(404, "No projects found");
> }
>
> + @git_url_list = map { git_get_project_url_list($_->{path}) } @list;
> +
> git_header_html();
> if (-f $home_text) {
> print "<div class=\"index_include\">\n";
> @@ -4400,6 +4414,8 @@ sub git_forks {
> if (defined $order && $order !~ m/none|project|descr|owner|age/) {
> die_error(400, "Unknown order parameter");
> }
> +
> + @git_url_list = map { git_get_project_url_list($_->{path}) } @list;
>
> my @list = git_get_projects_list($project);
> if (!@list) {
> @@ -4457,6 +4473,8 @@ sub git_summary {
> @forklist = git_get_projects_list($project);
> }
>
> + @git_url_list = git_get_project_url_list($project);
> +
> git_header_html();
> git_print_page_nav('summary','', $head);
>
> @@ -4468,12 +4486,8 @@ sub git_summary {
> print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n";
> }
>
> - # use per project git URL list in $projectroot/$project/cloneurl
> - # or make project git URL from git base URL and project name
> my $url_tag = "URL";
> - my @url_list = git_get_project_url_list($project);
> - @url_list = map { "$_/$project" } @git_base_url_list unless @url_list;
> - foreach my $git_url (@url_list) {
> + foreach my $git_url (@git_url_list) {
> next unless $git_url;
> print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n";
> $url_tag = "";
--
Giuseppe "Oblomov" Bilotta
^ permalink raw reply
* Re: [RFC/PATCH 2/3] replace_object: add mechanism to replace objects found in "refs/replace/"
From: Johannes Schindelin @ 2009-01-07 12:29 UTC (permalink / raw)
To: Christian Couder; +Cc: Junio C Hamano, git
In-Reply-To: <20090107084341.1554d8cd.chriscool@tuxfamily.org>
Hi,
On Wed, 7 Jan 2009, Christian Couder wrote:
> diff --git a/replace_object.c b/replace_object.c
> new file mode 100644
> index 0000000..b50890d
> --- /dev/null
> +++ b/replace_object.c
> @@ -0,0 +1,102 @@
> +#include "cache.h"
> +#include "refs.h"
> +
> +static struct replace_object {
> + unsigned char sha1[2][20];
> +} **replace_object;
> +
> +static int replace_object_alloc, replace_object_nr;
> +
> +static int replace_object_pos(const unsigned char *sha1)
> +{
> + int lo, hi;
> + lo = 0;
> + hi = replace_object_nr;
> + while (lo < hi) {
> [...]
I suspect that this sorted list should be a hashmap.
Ciao,
Dscho
^ permalink raw reply
* Re: Error: unable to unlink ... when using "git gc"
From: Sitaram Chamarty @ 2009-01-07 12:29 UTC (permalink / raw)
To: git
In-Reply-To: <slrngm92hr.72d.sitaramc@sitaramc.homelinux.net>
On 2009-01-07, Sitaram Chamarty <sitaramc@gmail.com> wrote:
> $ dummy_commit
> $ jeeves dummy_commit
err, helper functions. Should have mentioned.
I do a lot of playing around with git :-)
^ permalink raw reply
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: Thomas Rast @ 2009-01-07 12:24 UTC (permalink / raw)
To: Kirill Smelkov; +Cc: martin f krafft, git, pasky
In-Reply-To: <20090107112754.GA15158@roro3>
[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]
Kirill Smelkov wrote:
> On Tue, Jan 06, 2009 at 09:32:03PM +0100, martin f krafft wrote:
> > I find this very confusing. Why not simply
> >
> > TG_PAGER="${GIT_PAGER:-}"
> > TG_PAGER="${TG_PAGER:-$PAGER}"
> >
> > ?
>
> I find it confusing too, but this is needed because they usually do
> something like this
>
> $ GIT_PAGER='' <some-git-command>
>
> to force it to be pagerless.
[...]
> So I think it would be better to preserve the same semantics for `tg
> patch` callers, though it's a pity that it's hard (maybe I'm wrong ?) to
> see whether an env var is unset.
Admittedly I haven't really studied your patch, but the ${} constructs
can in fact tell empty from unset:
$ EMPTY=
$ unset UNDEFINED
$ echo ${UNDEFINED-foo}
foo
$ echo ${UNDEFINED:-foo}
foo
$ echo ${EMPTY-foo}
$ echo ${EMPTY:-foo}
foo
'man bash' indeed says
When not performing substring expansion, bash tests for a parameter
that is unset or null; omitting the colon results in a test only for
a parameter that is unset.
So I suppose you could use
${GIT_PAGER-${PAGER-less}}
or similar.
--
Thomas Rast
trast@{inf,student}.ethz.ch
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: Kirill Smelkov @ 2009-01-07 11:27 UTC (permalink / raw)
To: martin f krafft; +Cc: git, pasky
In-Reply-To: <20090106203203.GA11274@lapse.rw.madduck.net>
Martin, thanks for your review.
I'll too reply inline:
On Tue, Jan 06, 2009 at 09:32:03PM +0100, martin f krafft wrote:
> Thanks, Kirill, for the patches. A couple of comments inline. I hope
> Petr has a chance to look too.
>
> also sprach Kirill Smelkov <kirr@landau.phys.spbu.ru> [2009.01.06.1616 +0100]:
> > +# isatty FD
> > +isatty()
> > +{
> > + tty -s 0<&$1 || return 1
> > + return 0
> > +}
>
> You don't need any of the return statements. Functions' return
> values are the return values of the last commands they execute.
Agree, I'll rework isatty to be just
isatty()
{
tty -s 0<&$1
}
> Since we are not using set -e, just "tty -s 0<&$1" in the body will
> have the same effect.
We _are_ using `set -e` in tg.sh:
http://repo.or.cz/w/topgit.git?a=blob;f=tg.sh;h=8c23d26b9a62ddcc1869bb70299862c32edd4403;hb=HEAD#l254
And also I think `tty -s 0<&$1` is a bit obscure, so maybe it is still
better to put it into a function with well-known name such as isatty?
>
> > + # TG_PAGER = GIT_PAGER | PAGER
> > + # http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2004-03/0792.html
> > + case ${GIT_PAGER+XXX} in
> > + '')
> > + case ${PAGER+XXX} in
> > + '')
> > + # both GIT_PAGER & PAGER unset
>
> I find this very confusing. Why not simply
>
> TG_PAGER="${GIT_PAGER:-}"
> TG_PAGER="${TG_PAGER:-$PAGER}"
>
> ?
I find it confusing too, but this is needed because they usually do
something like this
$ GIT_PAGER='' <some-git-command>
to force it to be pagerless.
For example in git-rebase.sh:
http://repo.or.cz/w/git.git?a=blob;f=git-rebase.sh;h=ebd4df3a0e821ddcfd1eabfcaac17f854e172a85;hb=HEAD#l415
And other places.
So I think it would be better to preserve the same semantics for `tg
patch` callers, though it's a pity that it's hard (maybe I'm wrong ?) to
see whether an env var is unset.
I'll add additional note about why this is needed.
>
> > + [ -z "$TG_PAGER" -o "$TG_PAGER" = "cat" ] && return 0
>
> What if I set my pager to /bin/cat? But I suppose then there's just
> a wasted FIFO and process, nothing big.
Yes. I'd drop "cat" from here completely, but since I was mimicing
pager.c I left it:
http://repo.or.cz/w/git.git?a=blob;f=pager.c;h=f19ddbc87df04f117cd5e39189c8322fd5f29d68;hb=HEAD#l55
I'm ok with dropping cat. Should we?
> > + _pager_fifo="$(mktemp -t tg-pager-fifo.XXXXXX)"
> > + rm "$_pager_fifo" && mkfifo -m 600 "$_pager_fifo"
>
> There's a race condition here. I can't see a real problem with it,
> though, nor do I know of a better way.
Yes I know and agree here is a race.
But netheir did I knew how to overcome this (I though we'll need mkfifo
to support creating fifos with temp names, but mkfifo lacks support for
this)
The real problem here is that in bash, it's hard to get such constructs
going without fifos at all -- just using pipes like we would do in C,
because:
exec >file
redirects the rest of current process output to file, but
exec |less
does _not_ redirect the rest of the current process output through pipe
to less.
Likewise
exec > >(less)
works (process redirection), but is a bashishm, so = not an option for
us.
Fortunately Adeodato suggested to use `mktemp -d`, so this is reworked
to:
_pager_fifo_dir="$(mktemp -t -d tg-pager-fifo.XXXXXX)"
_pager_fifo="$_pager_fifo_dir/0"
mkfifo -m 600 "$_pager_fifo"
+ appopriate cleanup.
> > + "$TG_PAGER" < "$_pager_fifo" &
> > + exec > "$_pager_fifo" # dup2(pager_fifo.in, 1)
> > +
> > + # this is needed so e.g. `git diff` will still colorize it's output if
> > + # requested in ~/.gitconfig with color.diff=auto
> > + export GIT_PAGER_IN_USE=1
> > +
> > + # atexit(close(1); wait pager)
> > + trap "exec >&-; rm $_pager_fifo; wait" EXIT
>
> Consistency: $_pager_fifo is not passed as a quoted string to rm
> here. In the unlikely event that $TMPDIR contains a space, this
> would fail.
Thanks, corrected.
> I definitely want Petr's opinion on this before I integrate it.
Ok, let's wait what Petr says. Here is improved patch:
Changes since v1:
o isatty() simplified
o added a note about different meaning of GIT_PAGER='' and unset
GIT_PAGER
o fixed race wrt to mkfifo creation
o fixed potential whitespace problem in "$_pager_fifo" cleanup
(interdiff)
diff --git a/tg.sh b/tg.sh
index 51a82af..bf9cf5c 100644
--- a/tg.sh
+++ b/tg.sh
@@ -248,8 +248,7 @@ do_help()
# isatty FD
isatty()
{
- tty -s 0<&$1 || return 1
- return 0
+ tty -s 0<&$1
}
# setup_pager
@@ -259,6 +258,7 @@ setup_pager()
isatty 1 || return 0
# TG_PAGER = GIT_PAGER | PAGER
+ # (but differentiate between GIT_PAGER='' and unset variables)
# http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2004-03/0792.html
case ${GIT_PAGER+XXX} in
'')
@@ -283,8 +283,9 @@ setup_pager()
# now spawn pager
export LESS=${LESS:-FRSX} # as in pager.c:pager_preexec()
- _pager_fifo="$(mktemp -t tg-pager-fifo.XXXXXX)"
- rm "$_pager_fifo" && mkfifo -m 600 "$_pager_fifo"
+ _pager_fifo_dir="$(mktemp -t -d tg-pager-fifo.XXXXXX)"
+ _pager_fifo="$_pager_fifo_dir/0"
+ mkfifo -m 600 "$_pager_fifo"
"$TG_PAGER" < "$_pager_fifo" &
exec > "$_pager_fifo" # dup2(pager_fifo.in, 1)
@@ -294,7 +295,7 @@ setup_pager()
export GIT_PAGER_IN_USE=1
# atexit(close(1); wait pager)
- trap "exec >&-; rm $_pager_fifo; wait" EXIT
+ trap "exec >&-; rm "$_pager_fifo"; rmdir "$_pager_fifo_dir"; wait" EXIT
}
## Startup
>From 2193b7c703c2d31c8739eec617b8c0e8c1d09b79 Mon Sep 17 00:00:00 2001
From: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Date: Tue, 6 Jan 2009 17:56:37 +0300
Subject: [PATCH (topgit) v2] Implement setup_pager just like in git
setup_pager() spawns a pager process and redirect the rest of our output
to it.
This will be needed to fix `tg patch` output in the next commit.
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
---
tg.sh | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 54 insertions(+), 0 deletions(-)
diff --git a/tg.sh b/tg.sh
index 8c23d26..bf9cf5c 100644
--- a/tg.sh
+++ b/tg.sh
@@ -243,6 +243,60 @@ do_help()
fi
}
+## Pager stuff
+
+# isatty FD
+isatty()
+{
+ tty -s 0<&$1
+}
+
+# setup_pager
+# Spawn pager process and redirect the rest of our output to it
+setup_pager()
+{
+ isatty 1 || return 0
+
+ # TG_PAGER = GIT_PAGER | PAGER
+ # (but differentiate between GIT_PAGER='' and unset variables)
+ # http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2004-03/0792.html
+ case ${GIT_PAGER+XXX} in
+ '')
+ case ${PAGER+XXX} in
+ '')
+ # both GIT_PAGER & PAGER unset
+ TG_PAGER=''
+ ;;
+ *)
+ TG_PAGER="$PAGER"
+ ;;
+ esac
+ ;;
+ *)
+ TG_PAGER="$GIT_PAGER"
+ ;;
+ esac
+
+ [ -z "$TG_PAGER" -o "$TG_PAGER" = "cat" ] && return 0
+
+
+ # now spawn pager
+ export LESS=${LESS:-FRSX} # as in pager.c:pager_preexec()
+
+ _pager_fifo_dir="$(mktemp -t -d tg-pager-fifo.XXXXXX)"
+ _pager_fifo="$_pager_fifo_dir/0"
+ mkfifo -m 600 "$_pager_fifo"
+
+ "$TG_PAGER" < "$_pager_fifo" &
+ exec > "$_pager_fifo" # dup2(pager_fifo.in, 1)
+
+ # this is needed so e.g. `git diff` will still colorize it's output if
+ # requested in ~/.gitconfig with color.diff=auto
+ export GIT_PAGER_IN_USE=1
+
+ # atexit(close(1); wait pager)
+ trap "exec >&-; rm "$_pager_fifo"; rmdir "$_pager_fifo_dir"; wait" EXIT
+}
## Startup
--
1.6.1.48.ge9b8
Thanks,
Kirill
^ permalink raw reply related
* [PATCH] diff --no-index -q: fix endless loop
From: Thomas Rast @ 2009-01-07 11:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7veizfbnuw.fsf@gitster.siamese.dyndns.org>
We forgot to move to the next argument when parsing -q, getting stuck
in an endless loop.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
Junio C Hamano wrote:
> I'll queue the "--" fix, "-q" fix and this pager fix. Thanks.
Seems the after-midnight rule indeed has some value. I pointed out
the argv[i] bug, which I see you have squashed into the -- fix, but
failed to notice that the -q parsing is also missing an i++.
I'm still not convinced of the option's value, but this patch at least
fixes the bug.
diff-no-index.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/diff-no-index.c b/diff-no-index.c
index 12ff1f1..60ed174 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -207,8 +207,10 @@ void diff_no_index(struct rev_info *revs,
int j;
if (!strcmp(argv[i], "--no-index"))
i++;
- else if (!strcmp(argv[i], "-q"))
+ else if (!strcmp(argv[i], "-q")) {
options |= DIFF_SILENT_ON_REMOVED;
+ i++;
+ }
else if (!strcmp(argv[i], "--"))
i++;
else {
--
tg: (3bbe36c..) t/diff-q-endless (depends on: next)
^ permalink raw reply related
* Re: BUG?? INSTALL MAKEFILE
From: Lars Sadau @ 2009-01-07 10:44 UTC (permalink / raw)
To: git
In-Reply-To: <vpqzli45p6q.fsf@bauges.imag.fr>
Matthieu Moy <Matthieu.Moy <at> imag.fr> writes:
>
> Ted Pavlic <ted <at> tedpavlic.com> writes:
>
> > According to the INSTALL doc, the default prefix should be ~.
I am the same opinion
> I didn't read that in INSTALL. What I read is that if I only run "make
> install", the prefix is $HOME, which is true. Now, ./configure uses a
> default value which is not the one of the Makefile, but that's another
> point.
>
May be not, confuses newbies.
------
Lars
^ permalink raw reply
* Re: Error: unable to unlink ... when using "git gc"
From: Sitaram Chamarty @ 2009-01-07 10:55 UTC (permalink / raw)
To: git
In-Reply-To: <200901070027.21721.bss@iguanasuicide.net>
On 2009-01-07, Boyd Stephen Smith Jr. <bss@iguanasuicide.net> wrote:
> On Tuesday 06 January 2009, Sitaram Chamarty <sitaramc@gmail.com> wrote=20
>> chmod g+ws .git
>>
>>Now set the group to something (I use "gitpushers" ;-)
>>
>> chgrp -R gitpushers .git
> ISTR this breaking here when someone on the team had a umask like 077 and=20
> was using file:// or ssh:// to push. I tended up "fixing" things with a=20
> cronjob, (which is a bit of a hack) IIRC.
That doesn't sound right. "git help init" says:
- 0xxx: 0xxx is an octal number and each file will have mode 0xxx
- 0xxx will override users umask(2) value, and thus, users
with a safe umask (0077) can use this option
- 0660 is equivalent to group.
So when you say "group", you're saying "0660", and when you
say "0660", you're overriding users umask value.
I sorta-kinda tested it (some output of the "find|xargs ls"
is snipped for brevity):
$ mkdir umt;cd umt
$ umask
0022
$ git init --shared=group
Initialized empty shared Git repository in /home/sitaram/t/umt/.git/
(reverse-i-search)`find': sfind f|ack .git|map -x du -sk |sort -n
$ find . -type d|xargs ls -ald
drwxr-xr-x 3 sitaram sitaram 4096 2009-01-07 15:59 .
drwxrwsr-x 7 sitaram sitaram 4096 2009-01-07 15:59 ./.git
$ dummy_commit
Created initial commit afb2645: file-30824 at Wed Jan 7 15:59:23 IST 2009
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 file-30824
$ find . -type d|xargs ls -ald
drwxr-xr-x 3 sitaram sitaram 4096 2009-01-07 15:59 .
drwxrwsr-x 9 sitaram sitaram 4096 2009-01-07 15:59 ./.git
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/a7
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/af
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/ca
$ umask 077
$ jeeves dummy_commit
Created commit 232f157: file-3025 at Wed Jan 7 15:59:36 IST 2009
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 file-3025
$ find . -type d|xargs ls -ald
drwxr-xr-x 3 sitaram sitaram 4096 2009-01-07 15:59 .
drwxrwsr-x 9 sitaram sitaram 4096 2009-01-07 15:59 ./.git
drwxrws--- 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/23
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/a7
drwxrws--- 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/a8
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/af
drwxrws--- 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/c7
drwxrwsr-x 2 sitaram sitaram 4096 2009-01-07 15:59 ./.git/objects/ca
^ permalink raw reply
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: Adeodato Simó @ 2009-01-07 10:35 UTC (permalink / raw)
To: martin f krafft; +Cc: Kirill Smelkov, git, pasky
In-Reply-To: <20090106203203.GA11274@lapse.rw.madduck.net>
* martin f krafft [Tue, 06 Jan 2009 21:32:03 +0100]:
> > + _pager_fifo="$(mktemp -t tg-pager-fifo.XXXXXX)"
> > + rm "$_pager_fifo" && mkfifo -m 600 "$_pager_fifo"
> There's a race condition here. I can't see a real problem with it,
> though, nor do I know of a better way.
mktemp -d, and create the fifo inside the directory created by mktemp.
--
Adeodato Simó dato at net.com.org.es
Debian Developer adeodato at debian.org
A hacker does for love what other would not do for money.
^ permalink raw reply
* Re: [PATCH (topgit) 1/2] Implement setup_pager just like in git
From: martin f krafft @ 2009-01-06 20:32 UTC (permalink / raw)
To: Kirill Smelkov, git; +Cc: pasky
In-Reply-To: <acaae74f79d385014e726b97f8258b2a0caa3dd0.1231254832.git.kirr@landau.phys.spbu.ru>
[-- Attachment #1: Type: text/plain, Size: 1944 bytes --]
Thanks, Kirill, for the patches. A couple of comments inline. I hope
Petr has a chance to look too.
also sprach Kirill Smelkov <kirr@landau.phys.spbu.ru> [2009.01.06.1616 +0100]:
> +# isatty FD
> +isatty()
> +{
> + tty -s 0<&$1 || return 1
> + return 0
> +}
You don't need any of the return statements. Functions' return
values are the return values of the last commands they execute.
Since we are not using set -e, just "tty -s 0<&$1" in the body will
have the same effect.
> + # TG_PAGER = GIT_PAGER | PAGER
> + # http://unix.derkeiler.com/Newsgroups/comp.unix.shell/2004-03/0792.html
> + case ${GIT_PAGER+XXX} in
> + '')
> + case ${PAGER+XXX} in
> + '')
> + # both GIT_PAGER & PAGER unset
I find this very confusing. Why not simply
TG_PAGER="${GIT_PAGER:-}"
TG_PAGER="${TG_PAGER:-$PAGER}"
?
> + [ -z "$TG_PAGER" -o "$TG_PAGER" = "cat" ] && return 0
What if I set my pager to /bin/cat? But I suppose then there's just
a wasted FIFO and process, nothing big.
> + _pager_fifo="$(mktemp -t tg-pager-fifo.XXXXXX)"
> + rm "$_pager_fifo" && mkfifo -m 600 "$_pager_fifo"
There's a race condition here. I can't see a real problem with it,
though, nor do I know of a better way.
> + "$TG_PAGER" < "$_pager_fifo" &
> + exec > "$_pager_fifo" # dup2(pager_fifo.in, 1)
> +
> + # this is needed so e.g. `git diff` will still colorize it's output if
> + # requested in ~/.gitconfig with color.diff=auto
> + export GIT_PAGER_IN_USE=1
> +
> + # atexit(close(1); wait pager)
> + trap "exec >&-; rm $_pager_fifo; wait" EXIT
Consistency: $_pager_fifo is not passed as a quoted string to rm
here. In the unlikely event that $TMPDIR contains a space, this
would fail.
I definitely want Petr's opinion on this before I integrate it.
--
martin | http://madduck.net/ | http://two.sentenc.es/
if you see an onion ring -- answer it!
spamtraps: madduck.bogus@madduck.net
[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: Problems getting rid of large files using git-filter-branch
From: Øyvind Harboe @ 2009-01-07 10:15 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0901071101480.7496@intel-tinevez-2-302>
On Wed, Jan 7, 2009 at 11:07 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Tue, 6 Jan 2009, Øyvind Harboe wrote:
>
>> On Tue, Jan 6, 2009 at 11:20 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>
>> > $ git verify-pack -v $PACK | grep -v "^chain " | sort -n -k 4
>>
>> I have never used the git verify-pack command, but I'm pretty sure the
>> "Terminated" string isn't the normal output :-)
>>
>> $ git verify-pack -v
>> .git/objects/pack/pack-1e039b82d8ae53ef5ec3614a3021466663cc70a4
>> Terminated
>
> I did
>
> $ git grep Terminated
>
> and came up empty :-)
>
> Seriously, I guess this could be some OOM thing. We _should_ handle this
> more gracefully, but it is possible that some uncatchable condition hits
> you, such as out-of-stack-space.
>
> I'd try running the command either with strace or with gdb, and I'd look
> at $? after the command returns, to find out what is actually happening.
After some investigation it turns out that my server has 228mByte of RAM
available. It is a virtual server running CentOS, hence the strange number
and ridiciulously tiny amount of memory(these days).
Now the strange thing is that I'm not getting this error message this morning...
How would git behave if it ran out of memory?
--
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer
^ permalink raw reply
* Re: Problems getting rid of large files using git-filter-branch
From: Johannes Schindelin @ 2009-01-07 10:07 UTC (permalink / raw)
To: Øyvind Harboe; +Cc: git
In-Reply-To: <c09652430901061436w36c013ep938e9cfba43140c9@mail.gmail.com>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 836 bytes --]
Hi,
On Tue, 6 Jan 2009, Øyvind Harboe wrote:
> On Tue, Jan 6, 2009 at 11:20 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> > $ git verify-pack -v $PACK | grep -v "^chain " | sort -n -k 4
>
> I have never used the git verify-pack command, but I'm pretty sure the
> "Terminated" string isn't the normal output :-)
>
> $ git verify-pack -v
> .git/objects/pack/pack-1e039b82d8ae53ef5ec3614a3021466663cc70a4
> Terminated
I did
$ git grep Terminated
and came up empty :-)
Seriously, I guess this could be some OOM thing. We _should_ handle this
more gracefully, but it is possible that some uncatchable condition hits
you, such as out-of-stack-space.
I'd try running the command either with strace or with gdb, and I'd look
at $? after the command returns, to find out what is actually happening.
Hth,
Dscho
^ permalink raw reply
* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Junio C Hamano @ 2009-01-07 9:42 UTC (permalink / raw)
To: R. Tyler Ballance; +Cc: Linus Torvalds, Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <1231317131.8870.471.camel@starfruit>
"R. Tyler Ballance" <tyler@slide.com> writes:
> On Wed, 2009-01-07 at 00:16 -0800, Junio C Hamano wrote:
>> "R. Tyler Ballance" <tyler@slide.com> writes:
>>
>> > Unfortunately it doesn't, what I did notice was this when I did a `git
>> > status` in the directory right after untarring:
>> > tyler@grapefruit:~/jburgess_main> git status
>> > #
>> > # ---impressive amount of file names fly by---
>> > # ----snip---
>> > ...
>> > Basically, somehow Git thinks that *every* file in the repository is
>> > deleted at this point.
>>
>> That makes me suspect that your .git/index file is corrupt.
>
> Would this be tied to the corrupted pack file issue, or separate.
If you have perfectly good set of packs, if your index is corrupt you may
see "everything deleted", so in that sense it is independent.
As Linus's earlier conjecture was that this is related to some sort of
disk/cache corruption, I wouldn't be surprised if such a failure hit packs
and the index file indiscriminatingly. So in that sense they are
related.
I think "git ls-files" (before doing anything else, such as resetting, of
course) would report that the index is corrupt, if that is indeed the case.
^ permalink raw reply
* Re: [JGIT RFC] How read versions of a specific object
From: Imran M Yousuf @ 2009-01-07 9:23 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Git Mailing List
In-Reply-To: <20090107040417.GA10790@spearce.org>
On Wed, Jan 7, 2009 at 10:04 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Imran M Yousuf <imyousuf@gmail.com> wrote:
>> I am trying to read all or n-th version of an object. Currently to do
>> this I am using the following piece of code, which has to walk to
>> every commit is present and from there prepare a set of its object id,
>> it is definitely expensive if the commit history is huge, is there a
>> faster/better way to achieve it?
>
> Not really. You can more efficiently use JGit and reduce some of
> the overheads, but that's about it.
>
Thanks Shawn, for pointing it out and it actually does improve the
performance, for every lookup its like 200ms.
Best regards,
Imran
>> for (int i = 0; i < App.OBJECT_COUNT;
>> ++i) {
>> ObjectWalk objectWalk = new ObjectWalk(repo);
>
> Don't use ObjectWalk, use a RevWalk. You don't need it to keep
> track of tree or blob identities. The ObjectWalk code has more
> overhead to do that bookkeeping.
>
>> Commit revision = repo.mapCommit(revObject.getId());
>> Tree versionTree = repo.mapTree(revision.getTreeId());
>> if (versionTree.existsBlob(isbn)) {
>> revisions.add(versionTree.findBlobMember(isbn).getId());
>
> Use a TreeWalk to do this. Its quicker because it doesn't
> have to parse as much data to come up with the same result.
>
> More specifically there's a static factory method that sets up for
> a path limited walk and returns the TreeWalk pointing at that entry.
>
> You can use the fact that RevWalk.next() returns a RevCommit to get
> you the RevTree, which is the tree you need to give to the TreeWalk
> constructor (its the root level tree of the commit).
>
>
> But if App.OBJECT_COUNT is quite large and covers most of your
> objects, you are probably better off using a loop over the commits
> and diff'ing against the ancestor:
>
> final HashMap<String, Set<ObjectId>> versions = ...;
> final RevWalk rw = new RevWalk(repo);
> final TreeWalk tw = new TreeWalk(repo);
> rw.markStart(rw.parseCommit(repo.parse(HEAD)));
> tw.setFilter(TreeFilter.ANY_DIFF);
>
> RevCommit c;
> while ((c = rw.next()) != null) {
> final ObjectId[] p = new ObjectId[c.getParentCount() + 1];
> for (int i = 0; i < c.getParentCount(); i++) {
> rw.parse(c.getParent(i));
> p[i] = c.getParent(i).getTree();
> }
> final int me = p.length -1;
> p[me] = c.getTree();
> tw.reset(p);
> while (tw.next()) {
> if (tw.getFileMode(me).getObjectType() == Constants.OBJ_BLOB) {
> // This path was modified relative to the ancestor(s).
> //
> String s = tw.getPathString();
> Set<ObjectId> i = versions.get(s);
> if (i == null)
> versions.put(s, i = new HashSet<ObjectId>());
> i.add(tw.getObjectId(me));
> }
>
> if (tw.isSubtree()) {
> // make sure we recurse into modified directories
> tw.enterSubtree();
> }
> }
> }
>
> --
> Shawn.
>
--
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: imran@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
^ permalink raw reply
* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-07 9:05 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <1231314099.8870.415.camel@starfruit>
[-- Attachment #1: Type: text/plain, Size: 2154 bytes --]
On Tue, 2009-01-06 at 23:41 -0800, R. Tyler Ballance wrote:
> I did try to do a git-fsck(1), and this is what I got:
> tyler@grapefruit:~/jburgess_main> /usr/local/bin/git fsck --full
> [1] 19381 segmentation fault /usr/local/bin/git fsck --full
> tyler@grapefruit:~/jburgess_main>
> >
Disregard this comment, Jan reminded me in IRC of the issue I had with
this repository and the stack-size ulimit. A healthy 32M stack-size
allows for the command to complete:
tyler@grapefruit:~/jburgess_main> /usr/local/bin/git fsck --full
dangling blob 6e58a06cd0f027fce1fc8a923a8c81d6b55f1705
dangling blob 5e87e049f69ee06af1f4f92a3d4ddcd912f8535e
dangling blob acb8e0937cea3f4068c5c67f60d3b97952654fa1
dangling blob 8bd540d657027ab12228e0522de86a97d3f8a7a9
dangling blob 90d93096369df4b2151df3e289952297c5f390dd
dangling blob aaa4a1bf9e6d39991503b7908ff71605d6632fef
dangling blob b003e20f06de7d0a7e11a1047fbb39e2deb84899
dangling blob bf207363786f08e888a725cfb8b2fe4bea11ceab
dangling blob 642493a1bcf09f74551f0ce5b4c3ccc23acaf3c9
dangling blob ff80b45e48795d4544d0a5b2f18714123ab21746
dangling blob e3c0e67e23ec588acab733e2069fb09fa38a235d
dangling blob 62efa65bcbb6d94f26be9b91dc98d7a7f6fbf602
dangling blob a6a2c717c230fe26263b9ce002f3f82fdab6f727
dangling blob d32988875aa90ef728dc1af6b71c130f2c8e8b94
dangling blob c8aad9fc37a27872bae18af78b1623bfb8f9a9f7
dangling blob 4cdbe97301e333d89037dc9f4a8440dab9e62049
dangling blob 2819bbcda3b0efe828709f5b22624712cb9ebdae
dangling blob b156cba3ee89e1506d02fbb845e8dc0889ff4090
dangling blob ed658b8beca69ebf6e25ab1ed6217881e27d4e9a
dangling blob d581cb2fb1cf2b4d15f8f9b90b08a3ebc619414f
dangling blob a85c2e55d27c9fb1f5874f8bd81c65e597327b4f
dangling blob 548f3e70c06f9306d71f6b8d11ed24ade581fa31
dangling blob 7a1c8fa8bc59c4e8fb347426cc11fabf8b0d1639
tyler@grapefruit:~/jburgess_main>
Cheers
--
-R. Tyler Ballance
Slide, Inc.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: [RFC/PATCH 2/3] replace_object: add mechanism to replace objects found in "refs/replace/"
From: Junio C Hamano @ 2009-01-07 8:41 UTC (permalink / raw)
To: Christian Couder; +Cc: git
In-Reply-To: <20090107084341.1554d8cd.chriscool@tuxfamily.org>
Christian Couder <chriscool@tuxfamily.org> writes:
> In "parse_commit_buffer", the parent sha1s from the original commit
> or from a commit graft that match a ref name in "refs/replace/" are
> replaced by the commit sha1 that has been read in the ref.
I may be reading this wrong, but if you replace the parent of the commit,
that won't improve anything over the graft mechanism, which we already
have.
What I was hoping to see was to give replacing commit back when a commit
is asked for by normal callers, while not replacing the objects when
reachability traversal (prune, pack transfer and fsck) wants to read the
commit. That way:
(1) Normal callers will see the rewritten history (which is the same as
graft); and
(2) We will kill the design bug of the current graft mechanism that graft
information cannot be transferred to the other end. By using object
replacement, you can fetch objects reachable from refs/replace/ at
the other end and place them in the same refs/replace/ hierarchy
locally, if you choose to use the same altered history, or you can
choose not to use the replacement yourself. The resulting repository
is fully connected either way.
(3) We will also kill the design bug of the current graft mechanism that
graft information hides repository corruption to fsck. The problem
with this is that if you and then remove the grafts, you will end up
with a corrupt repository, because these operations do look at grafts
(this is justified only partly because otherwise you will lose
objects that are reachable only via grafts, but is broken at the same
time because you can lose the true parents by letting graft hide them
from a reachable commit).
By doing fsck and prune always using the true reachability, and using
refs in the refs/replace/ hierarchy for protecting objects that are
involved in this new way of grafting, you will ensure the integrity
of the repository. Removal of a ref from the refs/replace/ hierarchy
won't result in such a corruption we can have today.
And that is exactly the reason why I was hoping the hook will be at
read_sha1_file() that gives you a rewritten object contents when you ask
for the original object, not at parse_commit_buffer which does not give
you a rewritten object, but makes you follow to a rewritten object when
parsing a commit (which itself is not replaced if it is the starting
point).
Doing replacement at parse_commit_buffer() time is one step too late. For
example, if you have replacement information for the commit that sits at
the tip of 'master' branch, I think your "git log master" will ignore that
replacement information. Creating one new commit on top of it and saying
"git log master" then will show the new commit, and suddenly starts
showing the replacement commit for the one you used to have at the tip of
the branch.
^ permalink raw reply
* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-07 8:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <7vaba3bken.fsf@gitster.siamese.dyndns.org>
[-- Attachment #1: Type: text/plain, Size: 823 bytes --]
On Wed, 2009-01-07 at 00:16 -0800, Junio C Hamano wrote:
> "R. Tyler Ballance" <tyler@slide.com> writes:
>
> > Unfortunately it doesn't, what I did notice was this when I did a `git
> > status` in the directory right after untarring:
> > tyler@grapefruit:~/jburgess_main> git status
> > #
> > # ---impressive amount of file names fly by---
> > # ----snip---
> > ...
> > Basically, somehow Git thinks that *every* file in the repository is
> > deleted at this point.
>
> That makes me suspect that your .git/index file is corrupt.
Would this be tied to the corrupted pack file issue, or separate.
Either way, how could I verify your assumptions? (i'll be lurking in
#git for a while if you want to interactively help ;))
Cheers
--
-R. Tyler Ballance
Slide, Inc.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox