* [PATCH 0/4] Teach "git clone" to pack refs @ 2008-06-15 14:02 Johan Herland 2008-06-15 14:04 ` [PATCH 1/4] Incorporate fetched packs in future object traversal Johan Herland ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: Johan Herland @ 2008-06-15 14:02 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Daniel Barkalow (This is meant for AFTER v1.5.6.) This is a re-post of the series I posted while builtin-clone was still under development. The rationale for this series can be found in earlier threads, but it roughly boils down to the following: 1. "git clone" currently creates "loose" refs for every ref in the cloned repo. A subsequent "git gc" will pack these refs into a packed-refs file. Having "git clone" produce the "packed" refs in the first place seems more efficient. 2. For repos with few refs the performance difference between writing loose refs and packed refs is negligible. However, for repos with thousands of refs [1], the difference between writing one packed-refs file and thousands of "loose" refs files is definitely noticeable. Even more so on Windows. 3. When the user updates a ref, a "loose" ref is written, and the corresponding packed ref (if any) is left unused. By making "git clone" write packed refs, we increase the overhead of unused packed refs (proportionally to the number of refs updated by the user). However, the number of refs updated by the user is typically small. If the user updates tens - or even hundreds - of refs, I still expect this overhead to be negligible, and in any case outweighed by the added performance when cloning repos with many refs. The series is based on current 'next'. Johan Herland (4): Incorporate fetched packs in future object traversal Move pack_refs() and friends into libgit Prepare testsuite for a "git clone" that packs refs Teach "git clone" to pack refs Makefile | 2 + builtin-clone.c | 8 ++- builtin-fetch-pack.c | 1 + builtin-pack-refs.c | 121 +----------------------------------------- pack-refs.c | 117 ++++++++++++++++++++++++++++++++++++++++ pack-refs.h | 18 ++++++ t/t5515-fetch-merge-logic.sh | 19 +++++++ 7 files changed, 164 insertions(+), 122 deletions(-) create mode 100644 pack-refs.c create mode 100644 pack-refs.h Have fun! :) ...Johan [1]: At $dayjob I'm converting old CVS modules with up to ~10 years of history, and some of the resulting repos have ~30000 refs (mostly build tags). When cloning these repos, the speedup of this series can be measured in seconds on Linux; minutes on Windows. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] Incorporate fetched packs in future object traversal 2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland @ 2008-06-15 14:04 ` Johan Herland 2008-06-15 14:05 ` [PATCH 2/4] Move pack_refs() and friends into libgit Johan Herland ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: Johan Herland @ 2008-06-15 14:04 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Daniel Barkalow Immediately after fetching a pack, we should call reprepare_packed_git() to make sure the objects in the pack are reachable. Otherwise, we will fail to look up objects that are present only in the fetched pack. Signed-off-by: Johan Herland <johan@herland.net> --- builtin-fetch-pack.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c index de1e8d1..f4dbcf0 100644 --- a/builtin-fetch-pack.c +++ b/builtin-fetch-pack.c @@ -820,5 +820,6 @@ struct ref *fetch_pack(struct fetch_pack_args *my_args, } } + reprepare_packed_git(); return ref_cpy; } -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/4] Move pack_refs() and friends into libgit 2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland 2008-06-15 14:04 ` [PATCH 1/4] Incorporate fetched packs in future object traversal Johan Herland @ 2008-06-15 14:05 ` Johan Herland 2008-06-15 17:52 ` Jeff King 2008-06-15 14:05 ` [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs Johan Herland 2008-06-15 14:06 ` [PATCH 4/4] Teach "git clone" to pack refs Johan Herland 3 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-06-15 14:05 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Daniel Barkalow This moves pack_refs() and underlying functionality into the library, to make pack-refs functionality easily available to all git programs. Most of builtin-pack-refs.c has been moved verbatim into a new file pack-refs.c that is compiled into libgit.a. A corresponding header file, pack-refs.h, has also been added, declaring pack_refs() and the #defines associated with the flags parameter to pack_refs(). This patch introduces no other changes in functionality. Signed-off-by: Johan Herland <johan@herland.net> --- Makefile | 2 + builtin-pack-refs.c | 121 +-------------------------------------------------- pack-refs.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++++ pack-refs.h | 18 ++++++++ 4 files changed, 138 insertions(+), 120 deletions(-) create mode 100644 pack-refs.c create mode 100644 pack-refs.h diff --git a/Makefile b/Makefile index 1937507..4a78388 100644 --- a/Makefile +++ b/Makefile @@ -354,6 +354,7 @@ LIB_H += log-tree.h LIB_H += mailmap.h LIB_H += object.h LIB_H += pack.h +LIB_H += pack-refs.h LIB_H += pack-revindex.h LIB_H += parse-options.h LIB_H += patch-ids.h @@ -429,6 +430,7 @@ LIB_OBJS += merge-file.o LIB_OBJS += name-hash.o LIB_OBJS += object.o LIB_OBJS += pack-check.o +LIB_OBJS += pack-refs.o LIB_OBJS += pack-revindex.o LIB_OBJS += pack-write.o LIB_OBJS += pager.o diff --git a/builtin-pack-refs.c b/builtin-pack-refs.c index 1aaa76d..ff90aef 100644 --- a/builtin-pack-refs.c +++ b/builtin-pack-refs.c @@ -1,125 +1,6 @@ -#include "builtin.h" #include "cache.h" -#include "refs.h" -#include "object.h" -#include "tag.h" #include "parse-options.h" - -struct ref_to_prune { - struct ref_to_prune *next; - unsigned char sha1[20]; - char name[FLEX_ARRAY]; -}; - -#define PACK_REFS_PRUNE 0x0001 -#define PACK_REFS_ALL 0x0002 - -struct pack_refs_cb_data { - unsigned int flags; - struct ref_to_prune *ref_to_prune; - FILE *refs_file; -}; - -static int do_not_prune(int flags) -{ - /* If it is already packed or if it is a symref, - * do not prune it. - */ - return (flags & (REF_ISSYMREF|REF_ISPACKED)); -} - -static int handle_one_ref(const char *path, const unsigned char *sha1, - int flags, void *cb_data) -{ - struct pack_refs_cb_data *cb = cb_data; - int is_tag_ref; - - /* Do not pack the symbolic refs */ - if ((flags & REF_ISSYMREF)) - return 0; - is_tag_ref = !prefixcmp(path, "refs/tags/"); - - /* ALWAYS pack refs that were already packed or are tags */ - if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref && !(flags & REF_ISPACKED)) - return 0; - - fprintf(cb->refs_file, "%s %s\n", sha1_to_hex(sha1), path); - if (is_tag_ref) { - struct object *o = parse_object(sha1); - if (o->type == OBJ_TAG) { - o = deref_tag(o, path, 0); - if (o) - fprintf(cb->refs_file, "^%s\n", - sha1_to_hex(o->sha1)); - } - } - - if ((cb->flags & PACK_REFS_PRUNE) && !do_not_prune(flags)) { - int namelen = strlen(path) + 1; - struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen); - hashcpy(n->sha1, sha1); - strcpy(n->name, path); - n->next = cb->ref_to_prune; - cb->ref_to_prune = n; - } - return 0; -} - -/* make sure nobody touched the ref, and unlink */ -static void prune_ref(struct ref_to_prune *r) -{ - struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1); - - if (lock) { - unlink(git_path("%s", r->name)); - unlock_ref(lock); - } -} - -static void prune_refs(struct ref_to_prune *r) -{ - while (r) { - prune_ref(r); - r = r->next; - } -} - -static struct lock_file packed; - -static int pack_refs(unsigned int flags) -{ - int fd; - struct pack_refs_cb_data cbdata; - - memset(&cbdata, 0, sizeof(cbdata)); - cbdata.flags = flags; - - fd = hold_lock_file_for_update(&packed, git_path("packed-refs"), 1); - cbdata.refs_file = fdopen(fd, "w"); - if (!cbdata.refs_file) - die("unable to create ref-pack file structure (%s)", - strerror(errno)); - - /* perhaps other traits later as well */ - fprintf(cbdata.refs_file, "# pack-refs with: peeled \n"); - - for_each_ref(handle_one_ref, &cbdata); - if (ferror(cbdata.refs_file)) - die("failed to write ref-pack file"); - if (fflush(cbdata.refs_file) || fsync(fd) || fclose(cbdata.refs_file)) - die("failed to write ref-pack file (%s)", strerror(errno)); - /* - * Since the lock file was fdopen()'ed and then fclose()'ed above, - * assign -1 to the lock file descriptor so that commit_lock_file() - * won't try to close() it. - */ - packed.fd = -1; - if (commit_lock_file(&packed) < 0) - die("unable to overwrite old ref-pack file (%s)", strerror(errno)); - if (cbdata.flags & PACK_REFS_PRUNE) - prune_refs(cbdata.ref_to_prune); - return 0; -} +#include "pack-refs.h" static char const * const pack_refs_usage[] = { "git-pack-refs [options]", diff --git a/pack-refs.c b/pack-refs.c new file mode 100644 index 0000000..848d311 --- /dev/null +++ b/pack-refs.c @@ -0,0 +1,117 @@ +#include "cache.h" +#include "refs.h" +#include "tag.h" +#include "pack-refs.h" + +struct ref_to_prune { + struct ref_to_prune *next; + unsigned char sha1[20]; + char name[FLEX_ARRAY]; +}; + +struct pack_refs_cb_data { + unsigned int flags; + struct ref_to_prune *ref_to_prune; + FILE *refs_file; +}; + +static int do_not_prune(int flags) +{ + /* If it is already packed or if it is a symref, + * do not prune it. + */ + return (flags & (REF_ISSYMREF|REF_ISPACKED)); +} + +static int handle_one_ref(const char *path, const unsigned char *sha1, + int flags, void *cb_data) +{ + struct pack_refs_cb_data *cb = cb_data; + int is_tag_ref; + + /* Do not pack the symbolic refs */ + if ((flags & REF_ISSYMREF)) + return 0; + is_tag_ref = !prefixcmp(path, "refs/tags/"); + + /* ALWAYS pack refs that were already packed or are tags */ + if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref && !(flags & REF_ISPACKED)) + return 0; + + fprintf(cb->refs_file, "%s %s\n", sha1_to_hex(sha1), path); + if (is_tag_ref) { + struct object *o = parse_object(sha1); + if (o->type == OBJ_TAG) { + o = deref_tag(o, path, 0); + if (o) + fprintf(cb->refs_file, "^%s\n", + sha1_to_hex(o->sha1)); + } + } + + if ((cb->flags & PACK_REFS_PRUNE) && !do_not_prune(flags)) { + int namelen = strlen(path) + 1; + struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen); + hashcpy(n->sha1, sha1); + strcpy(n->name, path); + n->next = cb->ref_to_prune; + cb->ref_to_prune = n; + } + return 0; +} + +/* make sure nobody touched the ref, and unlink */ +static void prune_ref(struct ref_to_prune *r) +{ + struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1); + + if (lock) { + unlink(git_path("%s", r->name)); + unlock_ref(lock); + } +} + +static void prune_refs(struct ref_to_prune *r) +{ + while (r) { + prune_ref(r); + r = r->next; + } +} + +static struct lock_file packed; + +int pack_refs(unsigned int flags) +{ + int fd; + struct pack_refs_cb_data cbdata; + + memset(&cbdata, 0, sizeof(cbdata)); + cbdata.flags = flags; + + fd = hold_lock_file_for_update(&packed, git_path("packed-refs"), 1); + cbdata.refs_file = fdopen(fd, "w"); + if (!cbdata.refs_file) + die("unable to create ref-pack file structure (%s)", + strerror(errno)); + + /* perhaps other traits later as well */ + fprintf(cbdata.refs_file, "# pack-refs with: peeled \n"); + + for_each_ref(handle_one_ref, &cbdata); + if (ferror(cbdata.refs_file)) + die("failed to write ref-pack file"); + if (fflush(cbdata.refs_file) || fsync(fd) || fclose(cbdata.refs_file)) + die("failed to write ref-pack file (%s)", strerror(errno)); + /* + * Since the lock file was fdopen()'ed and then fclose()'ed above, + * assign -1 to the lock file descriptor so that commit_lock_file() + * won't try to close() it. + */ + packed.fd = -1; + if (commit_lock_file(&packed) < 0) + die("unable to overwrite old ref-pack file (%s)", strerror(errno)); + if (cbdata.flags & PACK_REFS_PRUNE) + prune_refs(cbdata.ref_to_prune); + return 0; +} diff --git a/pack-refs.h b/pack-refs.h new file mode 100644 index 0000000..518acfb --- /dev/null +++ b/pack-refs.h @@ -0,0 +1,18 @@ +#ifndef PACK_REFS_H +#define PACK_REFS_H + +/* + * Flags for controlling behaviour of pack_refs() + * PACK_REFS_PRUNE: Prune loose refs after packing + * PACK_REFS_ALL: Pack _all_ refs, not just tags and already packed refs + */ +#define PACK_REFS_PRUNE 0x0001 +#define PACK_REFS_ALL 0x0002 + +/* + * Write a packed-refs file for the current repository. + * flags: Combination of the above PACK_REFS_* flags. + */ +int pack_refs(unsigned int flags); + +#endif /* PACK_REFS_H */ -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] Move pack_refs() and friends into libgit 2008-06-15 14:05 ` [PATCH 2/4] Move pack_refs() and friends into libgit Johan Herland @ 2008-06-15 17:52 ` Jeff King 2008-06-15 21:27 ` [PATCH 2/4 v2] " Johan Herland 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2008-06-15 17:52 UTC (permalink / raw) To: Johan Herland; +Cc: git, Junio C Hamano, Daniel Barkalow On Sun, Jun 15, 2008 at 04:05:06PM +0200, Johan Herland wrote: > --- > Makefile | 2 + > builtin-pack-refs.c | 121 +----------------------------------------------- > pack-refs.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++ > pack-refs.h | 18 ++++++++ > 4 files changed, 138 insertions(+), 120 deletions(-) > create mode 100644 pack-refs.c > create mode 100644 pack-refs.h This patch would have been a lot easier to read, btw, if it had been generated with '-C' (then patch to pack-refs.c is based on builtin-pack-refs instead of /dev/null). -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/4 v2] Move pack_refs() and friends into libgit 2008-06-15 17:52 ` Jeff King @ 2008-06-15 21:27 ` Johan Herland 0 siblings, 0 replies; 14+ messages in thread From: Johan Herland @ 2008-06-15 21:27 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano, Daniel Barkalow This moves pack_refs() and underlying functionality into the library, to make pack-refs functionality easily available to all git programs. Most of builtin-pack-refs.c has been moved verbatim into a new file pack-refs.c that is compiled into libgit.a. A corresponding header file, pack-refs.h, has also been added, declaring pack_refs() and the #defines associated with the flags parameter to pack_refs(). This patch introduces no other changes in functionality. Signed-off-by: Johan Herland <johan@herland.net> --- On Sunday 15 June 2008, Jeff King wrote: > This patch would have been a lot easier to read, btw, if it had been > generated with '-C' (then patch to pack-refs.c is based on > builtin-pack-refs instead of /dev/null). Sorry about that. Here you go. ...Johan Makefile | 2 + builtin-pack-refs.c | 121 +----------------------------------- builtin-pack-refs.c => pack-refs.c | 27 +-------- pack-refs.h | 18 +++++ 4 files changed, 23 insertions(+), 145 deletions(-) copy builtin-pack-refs.c => pack-refs.c (80%) create mode 100644 pack-refs.h diff --git a/Makefile b/Makefile index 1937507..4a78388 100644 --- a/Makefile +++ b/Makefile @@ -354,6 +354,7 @@ LIB_H += log-tree.h LIB_H += mailmap.h LIB_H += object.h LIB_H += pack.h +LIB_H += pack-refs.h LIB_H += pack-revindex.h LIB_H += parse-options.h LIB_H += patch-ids.h @@ -429,6 +430,7 @@ LIB_OBJS += merge-file.o LIB_OBJS += name-hash.o LIB_OBJS += object.o LIB_OBJS += pack-check.o +LIB_OBJS += pack-refs.o LIB_OBJS += pack-revindex.o LIB_OBJS += pack-write.o LIB_OBJS += pager.o diff --git a/builtin-pack-refs.c b/builtin-pack-refs.c index 1aaa76d..ff90aef 100644 --- a/builtin-pack-refs.c +++ b/builtin-pack-refs.c @@ -1,125 +1,6 @@ -#include "builtin.h" #include "cache.h" -#include "refs.h" -#include "object.h" -#include "tag.h" #include "parse-options.h" - -struct ref_to_prune { - struct ref_to_prune *next; - unsigned char sha1[20]; - char name[FLEX_ARRAY]; -}; - -#define PACK_REFS_PRUNE 0x0001 -#define PACK_REFS_ALL 0x0002 - -struct pack_refs_cb_data { - unsigned int flags; - struct ref_to_prune *ref_to_prune; - FILE *refs_file; -}; - -static int do_not_prune(int flags) -{ - /* If it is already packed or if it is a symref, - * do not prune it. - */ - return (flags & (REF_ISSYMREF|REF_ISPACKED)); -} - -static int handle_one_ref(const char *path, const unsigned char *sha1, - int flags, void *cb_data) -{ - struct pack_refs_cb_data *cb = cb_data; - int is_tag_ref; - - /* Do not pack the symbolic refs */ - if ((flags & REF_ISSYMREF)) - return 0; - is_tag_ref = !prefixcmp(path, "refs/tags/"); - - /* ALWAYS pack refs that were already packed or are tags */ - if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref && !(flags & REF_ISPACKED)) - return 0; - - fprintf(cb->refs_file, "%s %s\n", sha1_to_hex(sha1), path); - if (is_tag_ref) { - struct object *o = parse_object(sha1); - if (o->type == OBJ_TAG) { - o = deref_tag(o, path, 0); - if (o) - fprintf(cb->refs_file, "^%s\n", - sha1_to_hex(o->sha1)); - } - } - - if ((cb->flags & PACK_REFS_PRUNE) && !do_not_prune(flags)) { - int namelen = strlen(path) + 1; - struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen); - hashcpy(n->sha1, sha1); - strcpy(n->name, path); - n->next = cb->ref_to_prune; - cb->ref_to_prune = n; - } - return 0; -} - -/* make sure nobody touched the ref, and unlink */ -static void prune_ref(struct ref_to_prune *r) -{ - struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1); - - if (lock) { - unlink(git_path("%s", r->name)); - unlock_ref(lock); - } -} - -static void prune_refs(struct ref_to_prune *r) -{ - while (r) { - prune_ref(r); - r = r->next; - } -} - -static struct lock_file packed; - -static int pack_refs(unsigned int flags) -{ - int fd; - struct pack_refs_cb_data cbdata; - - memset(&cbdata, 0, sizeof(cbdata)); - cbdata.flags = flags; - - fd = hold_lock_file_for_update(&packed, git_path("packed-refs"), 1); - cbdata.refs_file = fdopen(fd, "w"); - if (!cbdata.refs_file) - die("unable to create ref-pack file structure (%s)", - strerror(errno)); - - /* perhaps other traits later as well */ - fprintf(cbdata.refs_file, "# pack-refs with: peeled \n"); - - for_each_ref(handle_one_ref, &cbdata); - if (ferror(cbdata.refs_file)) - die("failed to write ref-pack file"); - if (fflush(cbdata.refs_file) || fsync(fd) || fclose(cbdata.refs_file)) - die("failed to write ref-pack file (%s)", strerror(errno)); - /* - * Since the lock file was fdopen()'ed and then fclose()'ed above, - * assign -1 to the lock file descriptor so that commit_lock_file() - * won't try to close() it. - */ - packed.fd = -1; - if (commit_lock_file(&packed) < 0) - die("unable to overwrite old ref-pack file (%s)", strerror(errno)); - if (cbdata.flags & PACK_REFS_PRUNE) - prune_refs(cbdata.ref_to_prune); - return 0; -} +#include "pack-refs.h" static char const * const pack_refs_usage[] = { "git-pack-refs [options]", diff --git a/builtin-pack-refs.c b/pack-refs.c similarity index 80% copy from builtin-pack-refs.c copy to pack-refs.c index 1aaa76d..848d311 100644 --- a/builtin-pack-refs.c +++ b/pack-refs.c @@ -1,9 +1,7 @@ -#include "builtin.h" #include "cache.h" #include "refs.h" -#include "object.h" #include "tag.h" -#include "parse-options.h" +#include "pack-refs.h" struct ref_to_prune { struct ref_to_prune *next; @@ -11,9 +9,6 @@ struct ref_to_prune { char name[FLEX_ARRAY]; }; -#define PACK_REFS_PRUNE 0x0001 -#define PACK_REFS_ALL 0x0002 - struct pack_refs_cb_data { unsigned int flags; struct ref_to_prune *ref_to_prune; @@ -86,7 +81,7 @@ static void prune_refs(struct ref_to_prune *r) static struct lock_file packed; -static int pack_refs(unsigned int flags) +int pack_refs(unsigned int flags) { int fd; struct pack_refs_cb_data cbdata; @@ -120,21 +115,3 @@ static int pack_refs(unsigned int flags) prune_refs(cbdata.ref_to_prune); return 0; } - -static char const * const pack_refs_usage[] = { - "git-pack-refs [options]", - NULL -}; - -int cmd_pack_refs(int argc, const char **argv, const char *prefix) -{ - unsigned int flags = PACK_REFS_PRUNE; - struct option opts[] = { - OPT_BIT(0, "all", &flags, "pack everything", PACK_REFS_ALL), - OPT_BIT(0, "prune", &flags, "prune loose refs (default)", PACK_REFS_PRUNE), - OPT_END(), - }; - if (parse_options(argc, argv, opts, pack_refs_usage, 0)) - usage_with_options(pack_refs_usage, opts); - return pack_refs(flags); -} diff --git a/pack-refs.h b/pack-refs.h new file mode 100644 index 0000000..518acfb --- /dev/null +++ b/pack-refs.h @@ -0,0 +1,18 @@ +#ifndef PACK_REFS_H +#define PACK_REFS_H + +/* + * Flags for controlling behaviour of pack_refs() + * PACK_REFS_PRUNE: Prune loose refs after packing + * PACK_REFS_ALL: Pack _all_ refs, not just tags and already packed refs + */ +#define PACK_REFS_PRUNE 0x0001 +#define PACK_REFS_ALL 0x0002 + +/* + * Write a packed-refs file for the current repository. + * flags: Combination of the above PACK_REFS_* flags. + */ +int pack_refs(unsigned int flags); + +#endif /* PACK_REFS_H */ -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs 2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland 2008-06-15 14:04 ` [PATCH 1/4] Incorporate fetched packs in future object traversal Johan Herland 2008-06-15 14:05 ` [PATCH 2/4] Move pack_refs() and friends into libgit Johan Herland @ 2008-06-15 14:05 ` Johan Herland 2008-06-15 17:54 ` Jeff King 2008-06-15 14:06 ` [PATCH 4/4] Teach "git clone" to pack refs Johan Herland 3 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-06-15 14:05 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Daniel Barkalow t5515-fetch-merge-logic removes many, but not all, refs between each test. This is done by removing the corresponding refs/foo/* files in the .git/refs hierarchy. However, once "git clone" starts producing packed refs, these refs will no longer be in the .git/refs hierarchy, but rather listed in .git/packed-refs. This patch therefore teaches t5515-fetch-merge-logic to also remove the refs in question from the packed-refs file. Signed-off-by: Johan Herland <johan@herland.net> --- t/t5515-fetch-merge-logic.sh | 19 +++++++++++++++++++ 1 files changed, 19 insertions(+), 0 deletions(-) diff --git a/t/t5515-fetch-merge-logic.sh b/t/t5515-fetch-merge-logic.sh index 3def75e..b3e1b97 100755 --- a/t/t5515-fetch-merge-logic.sh +++ b/t/t5515-fetch-merge-logic.sh @@ -145,6 +145,25 @@ do rm -f .git/refs/heads/* rm -f .git/refs/remotes/rem/* rm -f .git/refs/tags/* + cat .git/packed-refs | \ + while read sha1 refname + do + case "$sha1" in + ^*) # remove peeled tags + ;; + *) + case "$refname" in + refs/heads/*|\ + refs/remotes/rem/*|\ + refs/tags/*) # remove same as above + ;; + *) # keep everything else + echo "$sha1 $refname" + ;; + esac + esac + done > .git/packed-refs.new + mv .git/packed-refs.new .git/packed-refs git fetch "$@" >/dev/null cat .git/FETCH_HEAD } >"$actual_f" && -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs 2008-06-15 14:05 ` [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs Johan Herland @ 2008-06-15 17:54 ` Jeff King 2008-06-15 18:04 ` Jakub Narebski 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2008-06-15 17:54 UTC (permalink / raw) To: Johan Herland; +Cc: git, Junio C Hamano, Daniel Barkalow On Sun, Jun 15, 2008 at 04:05:47PM +0200, Johan Herland wrote: > .git/packed-refs. This patch therefore teaches t5515-fetch-merge-logic > to also remove the refs in question from the packed-refs file. > [...] > + cat .git/packed-refs | \ > + while read sha1 refname > + do > + case "$sha1" in > + ^*) # remove peeled tags > + ;; > + *) > + case "$refname" in > + refs/heads/*|\ > + refs/remotes/rem/*|\ > + refs/tags/*) # remove same as above > + ;; > + *) # keep everything else > + echo "$sha1 $refname" > + ;; > + esac > + esac > + done > .git/packed-refs.new > + mv .git/packed-refs.new .git/packed-refs Might it not be simpler to just convert it to use plumbing to delete the refs? Something like piping for-each-ref into update-ref -d? -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs 2008-06-15 17:54 ` Jeff King @ 2008-06-15 18:04 ` Jakub Narebski 2008-06-15 23:16 ` [PATCH 3/4 v2] " Johan Herland 0 siblings, 1 reply; 14+ messages in thread From: Jakub Narebski @ 2008-06-15 18:04 UTC (permalink / raw) To: Jeff King; +Cc: Johan Herland, git, Junio C Hamano, Daniel Barkalow Jeff King <peff@peff.net> writes: > On Sun, Jun 15, 2008 at 04:05:47PM +0200, Johan Herland wrote: > > > .git/packed-refs. This patch therefore teaches t5515-fetch-merge-logic > > to also remove the refs in question from the packed-refs file. > > [...] > > + cat .git/packed-refs | \ > > + while read sha1 refname > > + do > > + case "$sha1" in > > + ^*) # remove peeled tags > > + ;; > > + *) > > + case "$refname" in > > + refs/heads/*|\ > > + refs/remotes/rem/*|\ > > + refs/tags/*) # remove same as above > > + ;; > > + *) # keep everything else > > + echo "$sha1 $refname" > > + ;; > > + esac > > + esac > > + done > .git/packed-refs.new > > + mv .git/packed-refs.new .git/packed-refs > > Might it not be simpler to just convert it to use plumbing to delete the > refs? Something like piping for-each-ref into update-ref -d? Or use git-for-each-ref with --shell option to generate code for deleting refs? -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/4 v2] Prepare testsuite for a "git clone" that packs refs 2008-06-15 18:04 ` Jakub Narebski @ 2008-06-15 23:16 ` Johan Herland 0 siblings, 0 replies; 14+ messages in thread From: Johan Herland @ 2008-06-15 23:16 UTC (permalink / raw) To: Jakub Narebski; +Cc: Jeff King, git, Junio C Hamano, Daniel Barkalow t5515-fetch-merge-logic removes many, but not all, refs between each test. This is done by removing the corresponding refs/foo/* files in the .git/refs hierarchy. However, once "git clone" starts producing packed refs, these refs will no longer be in the .git/refs hierarchy, but rather listed in .git/packed-refs. This patch teaches t5515-fetch-merge-logic to remove the refs using "git update-ref -d" which properly handles packed refs. Signed-off-by: Johan Herland <johan@herland.net> --- On Sunday 15 June 2008, Jakub Narebski wrote: > Jeff King <peff@peff.net> writes: > > Might it not be simpler to just convert it to use plumbing to delete > > the refs? Something like piping for-each-ref into update-ref -d? > > Or use git-for-each-ref with --shell option to generate code for > deleting refs? Thanks for the tip. The following seems to do the job, and looks much better. Have fun! :) ...Johan t/t5515-fetch-merge-logic.sh | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/t/t5515-fetch-merge-logic.sh b/t/t5515-fetch-merge-logic.sh index 3def75e..9de8408 100755 --- a/t/t5515-fetch-merge-logic.sh +++ b/t/t5515-fetch-merge-logic.sh @@ -142,9 +142,12 @@ do set x $cmd; shift git symbolic-ref HEAD refs/heads/$1 ; shift rm -f .git/FETCH_HEAD - rm -f .git/refs/heads/* - rm -f .git/refs/remotes/rem/* - rm -f .git/refs/tags/* + git for-each-ref --format="%(refname)" \ + refs/heads refs/remotes/rem refs/tags | \ + while read refname + do + git update-ref -d "$refname" + done git fetch "$@" >/dev/null cat .git/FETCH_HEAD } >"$actual_f" && -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] Teach "git clone" to pack refs 2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland ` (2 preceding siblings ...) 2008-06-15 14:05 ` [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs Johan Herland @ 2008-06-15 14:06 ` Johan Herland 2008-06-15 17:56 ` Jeff King 3 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-06-15 14:06 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Daniel Barkalow In repos with many refs, it is unlikely that most refs will ever change. This fact is already exploited by "git gc" by executing "git pack-refs" to consolidate all refs into a single file. When cloning a repo with many refs, it does not make sense to create the loose refs in the first place, just to have the next "git gc" consolidate them into one file. Instead, make "git clone" create the packed refs file immediately, and forego the loose refs completely. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Daniel Barkalow <barkalow@iabervon.org> --- builtin-clone.c | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/builtin-clone.c b/builtin-clone.c index 7190952..5c5acb4 100644 --- a/builtin-clone.c +++ b/builtin-clone.c @@ -18,6 +18,7 @@ #include "transport.h" #include "strbuf.h" #include "dir.h" +#include "pack-refs.h" /* * Overall FIXMEs: @@ -321,8 +322,11 @@ static struct ref *write_remote_refs(const struct ref *refs, get_fetch_map(refs, tag_refspec, &tail, 0); for (r = local_refs; r; r = r->next) - update_ref(reflog, - r->peer_ref->name, r->old_sha1, NULL, 0, DIE_ON_ERR); + add_extra_ref(r->peer_ref->name, r->old_sha1, 0); + + pack_refs(PACK_REFS_ALL); + clear_extra_refs(); + return local_refs; } -- 1.5.6.rc2.128.gf64ae ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] Teach "git clone" to pack refs 2008-06-15 14:06 ` [PATCH 4/4] Teach "git clone" to pack refs Johan Herland @ 2008-06-15 17:56 ` Jeff King 2008-06-15 22:03 ` Johan Herland 0 siblings, 1 reply; 14+ messages in thread From: Jeff King @ 2008-06-15 17:56 UTC (permalink / raw) To: Johan Herland; +Cc: git, Junio C Hamano, Daniel Barkalow On Sun, Jun 15, 2008 at 04:06:16PM +0200, Johan Herland wrote: > + pack_refs(PACK_REFS_ALL); I haven't looked carefully at the pack_refs code, but my understanding was that this would pack _all_ refs, including branches. Don't we generally try to leave branches unpacked, since they change a lot? IOW, shouldn't this just be "pack_refs(0)"? -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] Teach "git clone" to pack refs 2008-06-15 17:56 ` Jeff King @ 2008-06-15 22:03 ` Johan Herland 2008-06-16 9:57 ` Jeff King 0 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-06-15 22:03 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano, Daniel Barkalow On Sunday 15 June 2008, Jeff King wrote: > On Sun, Jun 15, 2008 at 04:06:16PM +0200, Johan Herland wrote: > > + pack_refs(PACK_REFS_ALL); > > I haven't looked carefully at the pack_refs code, but my understanding > was that this would pack _all_ refs, including branches. Don't we > generally try to leave branches unpacked, since they change a lot? IOW, > shouldn't this just be "pack_refs(0)"? Yes, for many repos it does not make much sense to pack branches. But in the case where the repo has many inactive branches (I have repos with 1000 branches where at most 5-10 are still active), I'd much rather pack all branches and then later "unpack" the active ones, than write all those "loose" refs as separate files onto the filesystem (e.g. in CygWin *shudder*). In any case, the user normally does not work actively on hundreds of branches, so the overhead of "unpacking" active branches should be fairly negligible in any case. ...Johan -- Johan Herland, <johan@herland.net> www.herland.net ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] Teach "git clone" to pack refs 2008-06-15 22:03 ` Johan Herland @ 2008-06-16 9:57 ` Jeff King 0 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2008-06-16 9:57 UTC (permalink / raw) To: Johan Herland; +Cc: git, Junio C Hamano, Daniel Barkalow On Mon, Jun 16, 2008 at 12:03:06AM +0200, Johan Herland wrote: > Yes, for many repos it does not make much sense to pack branches. But in the > case where the repo has many inactive branches (I have repos with 1000 > branches where at most 5-10 are still active), I'd much rather pack all > branches and then later "unpack" the active ones, than write all > those "loose" refs as separate files onto the filesystem (e.g. in CygWin > *shudder*). In any case, the user normally does not work actively on > hundreds of branches, so the overhead of "unpacking" active branches should > be fairly negligible in any case. What I was concerned about was that pack-refs would continue to pack active branches forever, because by default it repacks branches that are already packed. However, it seems that git-gc just passes --all anyway, so it presumably is not a problem. And at the very least, clone is then consistent with the auto-gc behavior, which makes sense. So I withdraw my complaint. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* [RFC/PATCH 0/3] Teach builtin-clone to pack refs @ 2008-03-22 1:10 Johan Herland 2008-04-14 6:10 ` [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs Daniel Barkalow 0 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-03-22 1:10 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git The following series builds on top of Barkalow's existing builtin-clone work, available in the "builtin-clone" branch at: git://iabervon.org/~barkalow/git.git This patch series teaches builtin-clone to create packed refs. Creating packed refs directly in clone (instead of creating loose refs and have the next "git gc" (re)pack them), makes cloning considerably faster on repos with many refs. In a test repo with 11000 refs (1000 branches, 10000 tags) I get the following numbers (Core 2 Quad, 4GB RAM running Gentoo Linux): - Current "next": ~24.8 seconds - Barkalow's "builtin-clone" branch: 1.47 seconds - "builtin-clone" plus this series: 0.31 seconds Although most of the speedup from current "next" is achieved by the builtin-clone work, there is still a considerable additional improvement from writing all refs to a single file instead of writing one file per ref. I expect the performance improvement to be much bigger on platforms with slower filesystem (aka. Windows). A side-effect of this series is that the cloned refs will not get reflog entries. I don't know how important these "clone: from $URL" entries are to people; I, for one, wouldn't miss them at all. Have fun! :) ...Johan -- Johan Herland, <johan@herland.net> www.herland.net ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs @ 2008-04-14 6:10 ` Daniel Barkalow 2008-04-14 8:00 ` Johan Herland 0 siblings, 1 reply; 14+ messages in thread From: Daniel Barkalow @ 2008-04-14 6:10 UTC (permalink / raw) To: Johan Herland; +Cc: git On Sat, 22 Mar 2008, Johan Herland wrote: > diff --git a/t/t5515-fetch-merge-logic.sh b/t/t5515-fetch-merge-logic.sh > index 31c1081..4569a13 100755 > --- a/t/t5515-fetch-merge-logic.sh > +++ b/t/t5515-fetch-merge-logic.sh > @@ -143,6 +143,7 @@ do > rm -f .git/refs/heads/* > rm -f .git/refs/remotes/rem/* > rm -f .git/refs/tags/* > + rm -f .git/packed-refs > git fetch "$@" >/dev/null > cat .git/FETCH_HEAD > } >"$actual" && I was just looking over this again, and it doesn't actually work, because it removes the copies of refs/remotes/origin/*, which the tests are expecting to find. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs 2008-04-14 6:10 ` [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs Daniel Barkalow @ 2008-04-14 8:00 ` Johan Herland 2008-04-14 8:03 ` [PATCH 3/4] " Johan Herland 0 siblings, 1 reply; 14+ messages in thread From: Johan Herland @ 2008-04-14 8:00 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git On Monday 14 April 2008, Daniel Barkalow wrote: > On Sat, 22 Mar 2008, Johan Herland wrote: > > > diff --git a/t/t5515-fetch-merge-logic.sh b/t/t5515-fetch-merge-logic.sh > > index 31c1081..4569a13 100755 > > --- a/t/t5515-fetch-merge-logic.sh > > +++ b/t/t5515-fetch-merge-logic.sh > > @@ -143,6 +143,7 @@ do > > rm -f .git/refs/heads/* > > rm -f .git/refs/remotes/rem/* > > rm -f .git/refs/tags/* > > + rm -f .git/packed-refs > > git fetch "$@" >/dev/null > > cat .git/FETCH_HEAD > > } >"$actual" && > > I was just looking over this again, and it doesn't actually work, because > it removes the copies of refs/remotes/origin/*, which the tests are > expecting to find. Yes, I also found that (after rebasing on current 'next'). :) Will shortly post a new series making builtin-clone do packed refs, suitable for current 'next'. Have fun! :) ...Johan -- Johan Herland, <johan@herland.net> www.herland.net ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs 2008-04-14 8:00 ` Johan Herland @ 2008-04-14 8:03 ` Johan Herland 0 siblings, 0 replies; 14+ messages in thread From: Johan Herland @ 2008-04-14 8:03 UTC (permalink / raw) To: Daniel Barkalow; +Cc: git t5515-fetch-merge-logic removes many, but not all, refs between each test. This is done by removing the corresponding refs/foo/* files in the .git/refs hierarchy. However, once "git clone" starts producing packed refs, these refs will no longer be in the .git/refs hierarchy, but rather listed in .git/packed-refs. This patch therefore teaches t5515-fetch-merge-logic to also remove the refs in question from the packed-refs file. Signed-off-by: Johan Herland <johan@herland.net> --- t/t5515-fetch-merge-logic.sh | 19 +++++++++++++++++++ 1 files changed, 19 insertions(+), 0 deletions(-) diff --git a/t/t5515-fetch-merge-logic.sh b/t/t5515-fetch-merge-logic.sh index 65c3774..8a8c35c 100755 --- a/t/t5515-fetch-merge-logic.sh +++ b/t/t5515-fetch-merge-logic.sh @@ -145,6 +145,25 @@ do rm -f .git/refs/heads/* rm -f .git/refs/remotes/rem/* rm -f .git/refs/tags/* + cat .git/packed-refs | \ + while read sha1 refname + do + case "$sha1" in + ^*) # remove peeled tags + ;; + *) + case "$refname" in + refs/heads/*|\ + refs/remotes/rem/*|\ + refs/tags/*) # remove same as above + ;; + *) # keep everything else + echo "$sha1 $refname" + ;; + esac + esac + done > .git/packed-refs.new + mv .git/packed-refs.new .git/packed-refs git fetch "$@" >/dev/null cat .git/FETCH_HEAD } >"$actual_f" && -- 1.5.5.159.g8c84b ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2008-06-16 9:58 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-06-15 14:02 [PATCH 0/4] Teach "git clone" to pack refs Johan Herland 2008-06-15 14:04 ` [PATCH 1/4] Incorporate fetched packs in future object traversal Johan Herland 2008-06-15 14:05 ` [PATCH 2/4] Move pack_refs() and friends into libgit Johan Herland 2008-06-15 17:52 ` Jeff King 2008-06-15 21:27 ` [PATCH 2/4 v2] " Johan Herland 2008-06-15 14:05 ` [PATCH 3/4] Prepare testsuite for a "git clone" that packs refs Johan Herland 2008-06-15 17:54 ` Jeff King 2008-06-15 18:04 ` Jakub Narebski 2008-06-15 23:16 ` [PATCH 3/4 v2] " Johan Herland 2008-06-15 14:06 ` [PATCH 4/4] Teach "git clone" to pack refs Johan Herland 2008-06-15 17:56 ` Jeff King 2008-06-15 22:03 ` Johan Herland 2008-06-16 9:57 ` Jeff King -- strict thread matches above, loose matches on Subject: below -- 2008-03-22 1:10 [RFC/PATCH 0/3] Teach builtin-clone " Johan Herland 2008-04-14 6:10 ` [RFC/PATCH 2/3] Prepare testsuite for a "git clone" that packs refs Daniel Barkalow 2008-04-14 8:00 ` Johan Herland 2008-04-14 8:03 ` [PATCH 3/4] " Johan Herland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).