* repack: handling of .keep files @ 2007-05-04 9:25 Alex Riesen 2007-05-04 9:56 ` Junio C Hamano 0 siblings, 1 reply; 11+ messages in thread From: Alex Riesen @ 2007-05-04 9:25 UTC (permalink / raw) To: Git Mailing List It does not seem to be consistent with the name: "git repack -a -d" removes .keep files anyway. How can I pin down a pack file so that it is never repacked? (Windows doesn't like big files, they get fragmented heavily and are hard to read and defragment. So I try keep them relatively small). P.S. Experimenting with the .keep-files I had a crash in git-log, when the pack was renamed into .keep.pack, but the index was not. git-log complained about two objects it could not read and than crashed. It's cygwin. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 9:25 repack: handling of .keep files Alex Riesen @ 2007-05-04 9:56 ` Junio C Hamano 2007-05-04 10:42 ` Alex Riesen 0 siblings, 1 reply; 11+ messages in thread From: Junio C Hamano @ 2007-05-04 9:56 UTC (permalink / raw) To: Alex Riesen; +Cc: Git Mailing List "Alex Riesen" <raa.lkml@gmail.com> writes: > ... Experimenting with the .keep-files I had a crash in git-log, > when the pack was renamed into .keep.pack, but the > index was not. git-log complained about two objects it could not > read and than crashed. It's cygwin. This part makes me suspect you are not even using the .keep properly. In addition to pack-[0-9a-f]{40}.(pack|idx), you would have a corresponding pack-[0-9a-f]{40}.keep file (whose contents does not matter) to mark that these should not get repacked. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 9:56 ` Junio C Hamano @ 2007-05-04 10:42 ` Alex Riesen 2007-05-04 16:59 ` Junio C Hamano 0 siblings, 1 reply; 11+ messages in thread From: Alex Riesen @ 2007-05-04 10:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List On 5/4/07, Junio C Hamano <junkio@cox.net> wrote: > "Alex Riesen" <raa.lkml@gmail.com> writes: > > > ... Experimenting with the .keep-files I had a crash in git-log, > > when the pack was renamed into .keep.pack, but the > > index was not. git-log complained about two objects it could not > > read and than crashed. It's cygwin. > > This part makes me suspect you are not even using the .keep > properly. In addition to pack-[0-9a-f]{40}.(pack|idx), you > would have a corresponding pack-[0-9a-f]{40}.keep file (whose > contents does not matter) to mark that these should not get > repacked. This is it (described in git-index-pack.txt, maybe git-repack.txt should reference it too), thanks. Still, git-log shouldn't crash (nothing should, of course). And, the temporary pack is created in working tree, instead of in GIT_DIR (why not GIT_OBJECT_DIR, btw?) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 10:42 ` Alex Riesen @ 2007-05-04 16:59 ` Junio C Hamano 2007-05-04 17:24 ` Alex Riesen 0 siblings, 1 reply; 11+ messages in thread From: Junio C Hamano @ 2007-05-04 16:59 UTC (permalink / raw) To: Alex Riesen; +Cc: Git Mailing List "Alex Riesen" <raa.lkml@gmail.com> writes: > Still, git-log shouldn't crash (nothing should, of course). Honestly, I think that's borderline. If you "dd if=/dev/random of=/dev/hda", should the kernel keep going, perhaps gracefully declining access to the filesystem on that drive? > And, the temporary pack is created in working tree, instead of in GIT_DIR > (why not GIT_OBJECT_DIR, btw?) Not limited to the temporary pack (the detail of exact use pattern escapes me -- I do not think it was temporary pack that initiated the use of GIT_DIR for temporary things), I think trying to not create things in the working tree came after somebody who had a read-only (to him, not necessarily to everybody) working tree and read-write GIT_DIR (separate location, specified with the environment variable) had trouble. At least the theory was GIT_DIR would be writable as long as you are doing a "write" oriented operation, regardless of what. Back then we did not support working in subdirectory of the project as fully as we do now, so using such configuration was not much less convenient than you have the normal layout of having $project/.git directory at the top of the working tree. While I do not think it is worth to try "supporting" use of read-only working tree for write oriented operations, which means that it should be safe to assume that the working tree is writable and we _could_ create temporary pack in working tree instead, I do not think the "_could_" means we should. In the case of temporary pack I do not think there would be a risk of filename collisions, I think it makes sense to use either GIT_DIR or GIT_OBJECT_DIRECTORY instead of the working tree. I do not know pros-and-cons between .git/ and .git/objects/; filesystems tend to cluster nearby things better, so the latter might be more logical, but packs are about using smaller (much much much smaller) number of files than you would use otherwise to store objects _and_ keeping them in use, so I suspect it would not make much practical difference even if we try to encourage the filesystem to allocate the disk blocks for new pack near existing packs by creating the temporary file. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 16:59 ` Junio C Hamano @ 2007-05-04 17:24 ` Alex Riesen 2007-05-04 19:20 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Alex Riesen @ 2007-05-04 17:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List On 5/4/07, Junio C Hamano <junkio@cox.net> wrote: > "Alex Riesen" <raa.lkml@gmail.com> writes: > > > Still, git-log shouldn't crash (nothing should, of course). > > Honestly, I think that's borderline. If you "dd if=/dev/random > of=/dev/hda", should the kernel keep going, perhaps gracefully > declining access to the filesystem on that drive? e2fsck has a test somewhere which randomly corrupts a partition and then lets the program fix it. All kind of corruptions happen, we will have to deal with them. Especially if this crash is so simple to reproduce. > case of temporary pack I do not think there would be a risk of > filename collisions, I think it makes sense to use either > GIT_DIR or GIT_OBJECT_DIRECTORY instead of the working tree. > > I do not know pros-and-cons between .git/ and .git/objects/; These are settable separately, so theoretically you can end up with .git and .git/objects being on different filesystems. Atomic rename wont be possible than. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 17:24 ` Alex Riesen @ 2007-05-04 19:20 ` Junio C Hamano [not found] ` <20070507173324.GA3436@steel.home> 2007-05-04 19:40 ` repack: handling of .keep files Nicolas Pitre 2007-05-04 21:54 ` [PATCH] Handle return code of parse_commit in revision machinery Alex Riesen 2 siblings, 1 reply; 11+ messages in thread From: Junio C Hamano @ 2007-05-04 19:20 UTC (permalink / raw) To: Alex Riesen; +Cc: Git Mailing List "Alex Riesen" <raa.lkml@gmail.com> writes: > On 5/4/07, Junio C Hamano <junkio@cox.net> wrote: > ... >> I do not know pros-and-cons between .git/ and .git/objects/; > > These are settable separately, so theoretically you can end > up with .git and .git/objects being on different filesystems. > Atomic rename wont be possible than. Fair enough. As we've lived without trouble long enough creating the new pack tempfile in $GIT_DIR, I do not think it is urgent; but feel free to propose moving it to GIT_OBJECT_DIRECTORY. We _might_ need to teach fsck to take notice of such leftover temporary files from an earlier, aborted run if we do this -- I haven't looked. ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20070507173324.GA3436@steel.home>]
* Re: [PATCH] Use GIT_OBJECT_DIR for temporary files of pack-objects [not found] ` <20070507173324.GA3436@steel.home> @ 2007-05-07 17:51 ` Dana How 2007-05-07 21:33 ` Alex Riesen 0 siblings, 1 reply; 11+ messages in thread From: Dana How @ 2007-05-07 17:51 UTC (permalink / raw) To: Alex Riesen; +Cc: git, Junio C Hamano, danahow On 5/7/07, Alex Riesen <raa.lkml@gmail.com> wrote: > I'm not sure about fsck cleaning up after crashed/killed pack-objects: > not sure I _can_ detect if the temp files really are just leftovers. It looks like you create temp file in objects , not objects/pack . So a rule could be : packs left in the former are crashed/killed, and packs in the latter are complete? You should also look at $PACKTMP in git-repack.sh . In it $GIT_DIR should probably be $GIT_OBJECT_DIRECTORY ? Junio: This patch touches the same lines as the --max-pack-size patch. What do you want to do with the latter? Thanks, -- Dana L. How danahow@gmail.com +1 650 804 5991 cell ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Use GIT_OBJECT_DIR for temporary files of pack-objects 2007-05-07 17:51 ` [PATCH] Use GIT_OBJECT_DIR for temporary files of pack-objects Dana How @ 2007-05-07 21:33 ` Alex Riesen 2007-05-07 21:59 ` Dana How 0 siblings, 1 reply; 11+ messages in thread From: Alex Riesen @ 2007-05-07 21:33 UTC (permalink / raw) To: Dana How; +Cc: git, Junio C Hamano Dana How, Mon, May 07, 2007 19:51:24 +0200: > On 5/7/07, Alex Riesen <raa.lkml@gmail.com> wrote: > >I'm not sure about fsck cleaning up after crashed/killed pack-objects: > >not sure I _can_ detect if the temp files really are just leftovers. > > It looks like you create temp file in objects , not objects/pack . > > So a rule could be : packs left in the former are crashed/killed, > and packs in the latter are complete? How do you know for sure that they are _left_ already? > You should also look at $PACKTMP in git-repack.sh . > In it $GIT_DIR should probably be $GIT_OBJECT_DIRECTORY ? I think it should, but it is already not the working directory, so my original complaint does not apply ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Use GIT_OBJECT_DIR for temporary files of pack-objects 2007-05-07 21:33 ` Alex Riesen @ 2007-05-07 21:59 ` Dana How 0 siblings, 0 replies; 11+ messages in thread From: Dana How @ 2007-05-07 21:59 UTC (permalink / raw) To: Alex Riesen; +Cc: git, Junio C Hamano, danahow On 5/7/07, Alex Riesen <raa.lkml@gmail.com> wrote: > Dana How, Mon, May 07, 2007 19:51:24 +0200: > > On 5/7/07, Alex Riesen <raa.lkml@gmail.com> wrote: > > >I'm not sure about fsck cleaning up after crashed/killed pack-objects: > > >not sure I _can_ detect if the temp files really are just leftovers. > > > > It looks like you create temp file in objects , not objects/pack . > > > > So a rule could be : packs left in the former are crashed/killed, > > and packs in the latter are complete? > > How do you know for sure that they are _left_ already? Ah, e.g. you're packing and fsck'ing at the same time? Then you can't know. Never mind... > > You should also look at $PACKTMP in git-repack.sh . > > In it $GIT_DIR should probably be $GIT_OBJECT_DIRECTORY ? > > I think it should, but it is already not the working directory, so my > original complaint does not apply OK, if I have to further change my --max-pack-size patchset which touches git-repack.sh, I'll roll this in. Thanks, -- Dana L. How danahow@gmail.com +1 650 804 5991 cell ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: repack: handling of .keep files 2007-05-04 17:24 ` Alex Riesen 2007-05-04 19:20 ` Junio C Hamano @ 2007-05-04 19:40 ` Nicolas Pitre 2007-05-04 21:54 ` [PATCH] Handle return code of parse_commit in revision machinery Alex Riesen 2 siblings, 0 replies; 11+ messages in thread From: Nicolas Pitre @ 2007-05-04 19:40 UTC (permalink / raw) To: Alex Riesen; +Cc: Junio C Hamano, Git Mailing List On Fri, 4 May 2007, Alex Riesen wrote: > On 5/4/07, Junio C Hamano <junkio@cox.net> wrote: > > "Alex Riesen" <raa.lkml@gmail.com> writes: > > > > > Still, git-log shouldn't crash (nothing should, of course). > > > > Honestly, I think that's borderline. If you "dd if=/dev/random > > of=/dev/hda", should the kernel keep going, perhaps gracefully > > declining access to the filesystem on that drive? > > e2fsck has a test somewhere which randomly corrupts a partition > and then lets the program fix it. > All kind of corruptions happen, we will have to deal with them. > Especially if this crash is so simple to reproduce. > > > case of temporary pack I do not think there would be a risk of > > filename collisions, I think it makes sense to use either > > GIT_DIR or GIT_OBJECT_DIRECTORY instead of the working tree. > > > > I do not know pros-and-cons between .git/ and .git/objects/; > > These are settable separately, so theoretically you can end > up with .git and .git/objects being on different filesystems. > Atomic rename wont be possible than. Other temporary pack/object creation instances already use GIT_OBJECT_DIRECTORY. Nicolas ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] Handle return code of parse_commit in revision machinery 2007-05-04 17:24 ` Alex Riesen 2007-05-04 19:20 ` Junio C Hamano 2007-05-04 19:40 ` repack: handling of .keep files Nicolas Pitre @ 2007-05-04 21:54 ` Alex Riesen 2 siblings, 0 replies; 11+ messages in thread From: Alex Riesen @ 2007-05-04 21:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git Mailing List This fixes a crash in broken repositories where random commits suddenly disappear. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- ... For example when loosing one of the packs by renaming it into a random name while trying to figure out how to protect it from deletion by "git-repack -a -d". Lesson learned, code fixed, tests passed. revision.c | 75 +++++++++++++++++++++++++++++++++++++++++------------------ revision.h | 2 +- 2 files changed, 53 insertions(+), 24 deletions(-) diff --git a/revision.c b/revision.c index e60a26c..b4c494d 100644 --- a/revision.c +++ b/revision.c @@ -318,7 +318,10 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) while ((parent = *pp) != NULL) { struct commit *p = parent->item; - parse_commit(p); + if (parse_commit(p) < 0) + die("cannot simplify commit %s (because of %s)", + sha1_to_hex(commit->object.sha1), + sha1_to_hex(p->object.sha1)); switch (rev_compare_tree(revs, p->tree, commit->tree)) { case REV_TREE_SAME: tree_same = 1; @@ -347,7 +350,10 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) * IOW, we pretend this parent is a * "root" commit. */ - parse_commit(p); + if(parse_commit(p) < 0) + die("cannot simplify commit %s (invalid %s)", + sha1_to_hex(commit->object.sha1), + sha1_to_hex(p->object.sha1)); p->parents = NULL; } /* fallthrough */ @@ -362,14 +368,14 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) commit->object.flags |= TREECHANGE; } -static void add_parents_to_list(struct rev_info *revs, struct commit *commit, struct commit_list **list) +static int add_parents_to_list(struct rev_info *revs, struct commit *commit, struct commit_list **list) { struct commit_list *parent = commit->parents; unsigned left_flag; int add, rest; if (commit->object.flags & ADDED) - return; + return 0; commit->object.flags |= ADDED; /* @@ -388,7 +394,8 @@ static void add_parents_to_list(struct rev_info *revs, struct commit *commit, st while (parent) { struct commit *p = parent->item; parent = parent->next; - parse_commit(p); + if (parse_commit(p) < 0) + return -1; p->object.flags |= UNINTERESTING; if (p->parents) mark_parents_uninteresting(p); @@ -397,7 +404,7 @@ static void add_parents_to_list(struct rev_info *revs, struct commit *commit, st p->object.flags |= SEEN; insert_by_date(p, list); } - return; + return 0; } /* @@ -409,7 +416,7 @@ static void add_parents_to_list(struct rev_info *revs, struct commit *commit, st revs->prune_fn(revs, commit); if (revs->no_walk) - return; + return 0; left_flag = (commit->object.flags & SYMMETRIC_LEFT); @@ -418,7 +425,8 @@ static void add_parents_to_list(struct rev_info *revs, struct commit *commit, st struct commit *p = parent->item; parent = parent->next; - parse_commit(p); + if (parse_commit(p) < 0) + return -1; p->object.flags |= left_flag; if (p->object.flags & SEEN) continue; @@ -426,6 +434,7 @@ static void add_parents_to_list(struct rev_info *revs, struct commit *commit, st if (add) insert_by_date(p, list); } + return 0; } static void cherry_pick_list(struct commit_list *list) @@ -508,7 +517,7 @@ static void cherry_pick_list(struct commit_list *list) free_patch_ids(&ids); } -static void limit_list(struct rev_info *revs) +static int limit_list(struct rev_info *revs) { struct commit_list *list = revs->commits; struct commit_list *newlist = NULL; @@ -524,7 +533,8 @@ static void limit_list(struct rev_info *revs) if (revs->max_age != -1 && (commit->date < revs->max_age)) obj->flags |= UNINTERESTING; - add_parents_to_list(revs, commit, &list); + if (add_parents_to_list(revs, commit, &list) < 0) + return -1; if (obj->flags & UNINTERESTING) { mark_parents_uninteresting(commit); if (everybody_uninteresting(list)) @@ -539,6 +549,7 @@ static void limit_list(struct rev_info *revs) cherry_pick_list(newlist); revs->commits = newlist; + return 0; } struct all_refs_cb { @@ -1227,7 +1238,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, const ch return left; } -void prepare_revision_walk(struct rev_info *revs) +int prepare_revision_walk(struct rev_info *revs) { int nr = revs->pending.nr; struct object_array_entry *e, *list; @@ -1249,42 +1260,59 @@ void prepare_revision_walk(struct rev_info *revs) free(list); if (revs->no_walk) - return; + return 0; if (revs->limited) - limit_list(revs); + if (limit_list(revs) < 0) + return -1; if (revs->topo_order) sort_in_topological_order_fn(&revs->commits, revs->lifo, revs->topo_setter, revs->topo_getter); + return 0; } -static int rewrite_one(struct rev_info *revs, struct commit **pp) +enum rewrite_result +{ + rewrite_one_ok, + rewrite_one_noparents, + rewrite_one_error, +}; + +static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp) { for (;;) { struct commit *p = *pp; if (!revs->limited) - add_parents_to_list(revs, p, &revs->commits); + if (add_parents_to_list(revs, p, &revs->commits) < 0) + return rewrite_one_error; if (p->parents && p->parents->next) - return 0; + return rewrite_one_ok; if (p->object.flags & (TREECHANGE | UNINTERESTING)) - return 0; + return rewrite_one_ok; if (!p->parents) - return -1; + return rewrite_one_noparents; *pp = p->parents->item; } } -static void rewrite_parents(struct rev_info *revs, struct commit *commit) +static int rewrite_parents(struct rev_info *revs, struct commit *commit) { struct commit_list **pp = &commit->parents; while (*pp) { struct commit_list *parent = *pp; - if (rewrite_one(revs, &parent->item) < 0) { + switch (rewrite_one(revs, &parent->item)) + { + case rewrite_one_ok: + break; + case rewrite_one_noparents: *pp = parent->next; continue; + case rewrite_one_error: + return -1; } pp = &parent->next; } + return 0; } static int commit_match(struct commit *commit, struct rev_info *opt) @@ -1320,7 +1348,8 @@ static struct commit *get_revision_1(struct rev_info *revs) if (revs->max_age != -1 && (commit->date < revs->max_age)) continue; - add_parents_to_list(revs, commit, &revs->commits); + if (add_parents_to_list(revs, commit, &revs->commits) < 0) + return NULL; } if (commit->object.flags & SHOWN) continue; @@ -1348,8 +1377,8 @@ static struct commit *get_revision_1(struct rev_info *revs) if (!commit->parents || !commit->parents->next) continue; } - if (revs->parents) - rewrite_parents(revs, commit); + if (revs->parents && rewrite_parents(revs, commit) < 0) + return NULL; } return commit; } while (revs->commits); diff --git a/revision.h b/revision.h index cdf94ad..2845167 100644 --- a/revision.h +++ b/revision.h @@ -113,7 +113,7 @@ extern void init_revisions(struct rev_info *revs, const char *prefix); extern int setup_revisions(int argc, const char **argv, struct rev_info *revs, const char *def); extern int handle_revision_arg(const char *arg, struct rev_info *revs,int flags,int cant_be_filename); -extern void prepare_revision_walk(struct rev_info *revs); +extern int prepare_revision_walk(struct rev_info *revs); extern struct commit *get_revision(struct rev_info *revs); extern void mark_parents_uninteresting(struct commit *commit); -- 1.5.2.rc1.21.g80e79 ^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-05-07 21:59 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-04 9:25 repack: handling of .keep files Alex Riesen
2007-05-04 9:56 ` Junio C Hamano
2007-05-04 10:42 ` Alex Riesen
2007-05-04 16:59 ` Junio C Hamano
2007-05-04 17:24 ` Alex Riesen
2007-05-04 19:20 ` Junio C Hamano
[not found] ` <20070507173324.GA3436@steel.home>
2007-05-07 17:51 ` [PATCH] Use GIT_OBJECT_DIR for temporary files of pack-objects Dana How
2007-05-07 21:33 ` Alex Riesen
2007-05-07 21:59 ` Dana How
2007-05-04 19:40 ` repack: handling of .keep files Nicolas Pitre
2007-05-04 21:54 ` [PATCH] Handle return code of parse_commit in revision machinery Alex Riesen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).