* [PATCH] diff: mode bits fixes [not found] <7vy89ums2l.fsf@assigned-by-dhcp.cox.net> @ 2005-06-01 18:38 ` Junio C Hamano 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-01 18:38 UTC (permalink / raw) To: Linus Torvalds; +Cc: git The core GIT repository has trees that record regular file mode in 0664 instead of normalized 0644 pattern. Comparing such a tree with another tree that records the same file in 0644 pattern without content changes with git-diff-tree causes it to feed otherwise unmodified pairs to the diff_change() routine, which triggers a sanity check routine and barfs. This patch fixes the problem, along with the fix to another caller that uses unnormalized mode bits to call diff_change() routine in a similar way. Without this patch, you will see "fatal error" from diff-tree when you run git-deltafy-script on the core GIT repository itself. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Linus, I decided to bite the bullet and audited the callers. *** There was only one that was not quite right, other than the *** diff-tree one. Please disregard the one I sent last night *** which incorrectly said "S_ISDIR() || S_ISREG()". Please *** also disregard the other one that silently ignores *** unmodified filepairs in the output routine. Warning about *** callers should be the most appropriate action as this *** version does. diff.h | 4 ++++ diffcore.h | 4 ---- diff-files.c | 8 +++----- diff-tree.c | 4 +++- diff.c | 12 +++++++----- 5 files changed, 17 insertions(+), 15 deletions(-) diff --git a/diff.h b/diff.h --- a/diff.h +++ b/diff.h @@ -4,6 +4,10 @@ #ifndef DIFF_H #define DIFF_H +#define DIFF_FILE_CANON_MODE(mode) \ + (S_ISREG(mode) ? (S_IFREG | ce_permissions(mode)) : \ + S_ISLNK(mode) ? S_IFLNK : S_IFDIR) + extern void diff_addremove(int addremove, unsigned mode, const unsigned char *sha1, diff --git a/diffcore.h b/diffcore.h --- a/diffcore.h +++ b/diffcore.h @@ -59,10 +59,6 @@ struct diff_filepair { #define DIFF_PAIR_MODE_CHANGED(p) ((p)->one->mode != (p)->two->mode) -#define DIFF_FILE_CANON_MODE(mode) \ - (S_ISREG(mode) ? (S_IFREG | ce_permissions(mode)) : \ - S_ISLNK(mode) ? S_IFLNK : S_IFDIR) - extern void diff_free_filepair(struct diff_filepair *); extern int diff_unmodified_pair(struct diff_filepair *); diff --git a/diff-files.c b/diff-files.c --- a/diff-files.c +++ b/diff-files.c @@ -88,7 +88,7 @@ int main(int argc, const char **argv) for (i = 0; i < entries; i++) { struct stat st; - unsigned int oldmode, mode; + unsigned int oldmode; struct cache_entry *ce = active_cache[i]; int changed; @@ -116,10 +116,8 @@ int main(int argc, const char **argv) continue; oldmode = ntohl(ce->ce_mode); - mode = (S_ISLNK(st.st_mode) ? S_IFLNK : - S_IFREG | ce_permissions(st.st_mode)); - - show_modified(oldmode, mode, ce->sha1, null_sha1, + show_modified(oldmode, DIFF_FILE_CANON_MODE(st.st_mode), + ce->sha1, null_sha1, ce->name); } diffcore_std((1 < argc) ? argv + 1 : NULL, diff --git a/diff-tree.c b/diff-tree.c --- a/diff-tree.c +++ b/diff-tree.c @@ -44,10 +44,12 @@ static const unsigned char *extract(void int len = strlen(tree)+1; const unsigned char *sha1 = tree + len; const char *path = strchr(tree, ' '); + unsigned int mode; - if (!path || size < len + 20 || sscanf(tree, "%o", modep) != 1) + if (!path || size < len + 20 || sscanf(tree, "%o", &mode) != 1) die("corrupt tree file"); *pathp = path+1; + *modep = DIFF_FILE_CANON_MODE(mode); return sha1; } diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -854,12 +854,14 @@ static void diff_resolve_rename_copy(voi else if (memcmp(p->one->sha1, p->two->sha1, 20) || p->one->mode != p->two->mode) p->status = 'M'; - else - /* this is a "no-change" entry. - * should not happen anymore. - * p->status = 'X'; + else { + /* This is a "no-change" entry and should not + * happen anymore, but prepare for broken callers. */ - die("internal error in diffcore: unmodified entry remains"); + error("feeding unmodified %s to diffcore", + p->one->path); + p->status = 'X'; + } } diff_debug_queue("resolve-rename-copy done", q); } ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-01 18:38 ` [PATCH] diff: mode bits fixes Junio C Hamano @ 2005-06-02 16:46 ` Junio C Hamano 2005-06-02 17:03 ` Linus Torvalds 2005-06-02 17:03 ` [PATCH] Handle deltified object correctly in git-*-pull family McMullan, Jason 2005-06-02 16:47 ` [PATCH] Use correct U*MAX Junio C Hamano 2005-06-02 16:49 ` [PATCH] Find size of SHA1 object without inflating everything Junio C Hamano 2 siblings, 2 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 16:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: git When a remote repository is deltified, we need to get the objects that a deltified object we want to obtain is based upon. The initial parts of each retrieved SHA1 file is inflated and inspected to see if it is deltified, and its base object is asked from the remote side when it is. Since this partial inflation and inspection has a small performance hit, it can optionally be skipped by giving -d flag to git-*-pull commands. This flag should be used only when the remote repository is known to have no deltified objects. Rsync transport does not have this problem since it fetches everything the remote side has. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Linus, this uses the new helper you wrote. The interface is *** much more pleasant to use. Documentation/git-http-pull.txt | 6 +++++- Documentation/git-local-pull.txt | 6 +++++- Documentation/git-rpull.txt | 6 +++++- cache.h | 3 +++ pull.h | 3 +++ http-pull.c | 4 +++- local-pull.c | 4 +++- pull.c | 6 ++++++ rpull.c | 4 +++- sha1_file.c | 40 ++++++++++++++++++++++++++++++++++++++ 10 files changed, 76 insertions(+), 6 deletions(-) diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt --- a/Documentation/git-http-pull.txt +++ b/Documentation/git-http-pull.txt @@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r SYNOPSIS -------- -'git-http-pull' [-c] [-t] [-a] [-v] commit-id url +'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url DESCRIPTION ----------- @@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt --- a/Documentation/git-local-pull.txt +++ b/Documentation/git-local-pull.txt @@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT SYNOPSIS -------- -'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path +'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path DESCRIPTION ----------- @@ -23,6 +23,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt --- a/Documentation/git-rpull.txt +++ b/Documentation/git-rpull.txt @@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito SYNOPSIS -------- -'git-rpull' [-c] [-t] [-a] [-v] commit-id url +'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url DESCRIPTION ----------- @@ -25,6 +25,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -158,6 +158,9 @@ extern int write_sha1_file(void *buf, un extern int check_sha1_signature(unsigned char *sha1, void *buf, unsigned long size, const char *type); +extern int sha1_delta_base(const unsigned char *, unsigned char *); + + /* Read a tree into the cache */ extern int read_tree(void *buffer, unsigned long size, int stage); diff --git a/pull.h b/pull.h --- a/pull.h +++ b/pull.h @@ -13,6 +13,9 @@ extern int get_history; /** Set to fetch the trees in the commit history. **/ extern int get_all; +/* Set to zero to skip the check for delta object base. */ +extern int get_delta; + /* Set to be verbose */ extern int get_verbosely; diff --git a/http-pull.c b/http-pull.c --- a/http-pull.c +++ b/http-pull.c @@ -103,6 +103,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -113,7 +115,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/local-pull.c b/local-pull.c --- a/local-pull.c +++ b/local-pull.c @@ -74,7 +74,7 @@ int fetch(unsigned char *sha1) } static const char *local_pull_usage = -"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path"; +"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path"; /* * By default we only use file copy. @@ -92,6 +92,8 @@ int main(int argc, char **argv) get_tree = 1; else if (argv[arg][1] == 'c') get_history = 1; + else if (argv[arg][1] == 'd') + get_delta = 0; else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; diff --git a/pull.c b/pull.c --- a/pull.c +++ b/pull.c @@ -6,6 +6,7 @@ int get_tree = 0; int get_history = 0; +int get_delta = 1; int get_all = 0; int get_verbosely = 0; static unsigned char current_commit_sha1[20]; @@ -37,6 +38,11 @@ static int make_sure_we_have_it(const ch status = fetch(sha1); if (status && what) report_missing(what, sha1); + if (get_delta) { + char delta_sha1[20]; + if (sha1_delta_base(sha1, delta_sha1)) + status = make_sure_we_have_it(what, delta_sha1); + } return status; } diff --git a/rpull.c b/rpull.c --- a/rpull.c +++ b/rpull.c @@ -27,6 +27,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -37,7 +39,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-rpull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -347,6 +347,46 @@ void * unpack_sha1_file(void *map, unsig return buf; } +int sha1_delta_base(const unsigned char *sha1, unsigned char *delta_sha1) +{ + unsigned long mapsize, size; + void *map; + char type[20]; + char buffer[200]; + z_stream stream; + int ret, bytes, status; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return 0; + ret = unpack_sha1_header(&stream, map, mapsize, buffer, + sizeof(buffer)); + status = 0; + + if (ret < Z_OK || + sscanf(buffer, "%10s %lu", type, &size) != 2 || + strcmp(type, "delta")) + goto out; + bytes = strlen(buffer) + 1; + if (size - bytes < 20) + goto out; + + memmove(buffer, buffer + bytes, stream.total_out - bytes); + bytes = stream.total_out - bytes; + if (bytes < 20 && ret == Z_OK) { + stream.next_out = buffer + bytes; + stream.avail_out = sizeof(buffer) - bytes; + while (inflate(&stream, Z_FINISH) == Z_OK) + ; /* nothing */ + } + status = 1; + memcpy(delta_sha1, buffer, 20); + out: + inflateEnd(&stream); + munmap(map, mapsize); + return status; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano @ 2005-06-02 17:03 ` Linus Torvalds 2005-06-02 18:55 ` Junio C Hamano 2005-06-02 18:57 ` [PATCH] " Junio C Hamano 2005-06-02 17:03 ` [PATCH] Handle deltified object correctly in git-*-pull family McMullan, Jason 1 sibling, 2 replies; 23+ messages in thread From: Linus Torvalds @ 2005-06-02 17:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, 2 Jun 2005, Junio C Hamano wrote: > > *** Linus, this uses the new helper you wrote. The interface is > *** much more pleasant to use. Can you update it for the fact that I split things up even more? In particular, see commit 5180cacc202bb20b15981469487eb8d6b0509997: "Split up unpack_sha1_file() some more". You should be able to unpack a delta with just something like this: int ret; z_stream stream; char hdr[100], type[10]; unsigned long size; ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); if (ret < Z_OK) return NULL; if (parse_sha1_header(hdr, type, &size) < 0) return NULL; if (strcmp("delta", type)) return NULL; buffer = unpack_sha1_rest(&stream, hdr, size); if (!buffer) return NULL; .. we now have the delta of size "size" in "buffer" .. ie it now has proper helper functions for every single stage in unpacking an object. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 17:03 ` Linus Torvalds @ 2005-06-02 18:55 ` Junio C Hamano 2005-06-02 21:31 ` Nicolas Pitre 2005-06-02 18:57 ` [PATCH] " Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 18:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: git When a remote repository is deltified, we need to get the objects that a deltified object we want to obtain is based upon. The initial parts of each retrieved SHA1 file is inflated and inspected to see if it is deltified, and its base object is asked from the remote side when it is. Since this partial inflation and inspection has a small performance hit, it can optionally be skipped by giving -d flag to git-*-pull commands. This flag should be used only when the remote repository is known to have no deltified objects. Rsync transport does not have this problem since it fetches everything the remote side has. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Now uses parse_sha1_header() and unpack_sha1_rest(). I *** decided not to make it the callers responsibility to check *** what we have already got and fixed unpack_sha1_rest() to *** avoid copying more than size bytes. Documentation/git-http-pull.txt | 6 ++++- Documentation/git-local-pull.txt | 6 ++++- Documentation/git-rpull.txt | 6 ++++- cache.h | 1 + pull.h | 3 +++ http-pull.c | 4 +++- local-pull.c | 4 +++- pull.c | 7 ++++++ rpull.c | 4 +++- sha1_file.c | 43 +++++++++++++++++++++++++++++++++++++- 10 files changed, 77 insertions(+), 7 deletions(-) diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt --- a/Documentation/git-http-pull.txt +++ b/Documentation/git-http-pull.txt @@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r SYNOPSIS -------- -'git-http-pull' [-c] [-t] [-a] [-v] commit-id url +'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url DESCRIPTION ----------- @@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt --- a/Documentation/git-local-pull.txt +++ b/Documentation/git-local-pull.txt @@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT SYNOPSIS -------- -'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path +'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path DESCRIPTION ----------- @@ -23,6 +23,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt --- a/Documentation/git-rpull.txt +++ b/Documentation/git-rpull.txt @@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito SYNOPSIS -------- -'git-rpull' [-c] [-t] [-a] [-v] commit-id url +'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url DESCRIPTION ----------- @@ -25,6 +25,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -153,6 +153,7 @@ extern char *sha1_file_name(const unsign extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size); extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size); extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep); +extern int sha1_delta_base(const unsigned char *, unsigned char *); extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size); extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size); extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1); diff --git a/pull.h b/pull.h --- a/pull.h +++ b/pull.h @@ -13,6 +13,9 @@ extern int get_history; /** Set to fetch the trees in the commit history. **/ extern int get_all; +/* Set to zero to skip the check for delta object base. */ +extern int get_delta; + /* Set to be verbose */ extern int get_verbosely; diff --git a/http-pull.c b/http-pull.c --- a/http-pull.c +++ b/http-pull.c @@ -103,6 +103,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -113,7 +115,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/local-pull.c b/local-pull.c --- a/local-pull.c +++ b/local-pull.c @@ -74,7 +74,7 @@ int fetch(unsigned char *sha1) } static const char *local_pull_usage = -"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path"; +"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path"; /* * By default we only use file copy. @@ -92,6 +92,8 @@ int main(int argc, char **argv) get_tree = 1; else if (argv[arg][1] == 'c') get_history = 1; + else if (argv[arg][1] == 'd') + get_delta = 0; else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; diff --git a/pull.c b/pull.c --- a/pull.c +++ b/pull.c @@ -6,6 +6,7 @@ int get_tree = 0; int get_history = 0; +int get_delta = 1; int get_all = 0; int get_verbosely = 0; static unsigned char current_commit_sha1[20]; @@ -37,6 +38,12 @@ static int make_sure_we_have_it(const ch status = fetch(sha1); if (status && what) report_missing(what, sha1); + if (get_delta) { + char delta_sha1[20]; + status = sha1_delta_base(sha1, delta_sha1); + if (0 < status) + status = make_sure_we_have_it(what, delta_sha1); + } return status; } diff --git a/rpull.c b/rpull.c --- a/rpull.c +++ b/rpull.c @@ -27,6 +27,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -37,7 +39,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-rpull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -325,7 +325,13 @@ void *unpack_sha1_rest(z_stream *stream, int bytes = strlen(buffer) + 1; char *buf = xmalloc(1+size); - memcpy(buf, buffer + bytes, stream->total_out - bytes); + /* (stream->total_out - bytes) is what we already have. The + * caller could be asking for something smaller than that. + */ + if (size < stream->total_out - bytes) + memcpy(buf, buffer + bytes, size); + else + memcpy(buf, buffer + bytes, stream->total_out - bytes); bytes = stream->total_out - bytes; if (bytes < size) { stream->next_out = buf + bytes; @@ -401,6 +407,41 @@ void * unpack_sha1_file(void *map, unsig return unpack_sha1_rest(&stream, hdr, *size); } +int sha1_delta_base(const unsigned char *sha1, unsigned char *base_sha1) +{ + int ret; + unsigned long mapsize, size; + void *map; + z_stream stream; + char hdr[1024], type[20]; + void *delta_data_head; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return -1; + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); + if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) { + ret = -1; + goto out; + } + if (strcmp(type, "delta")) { + ret = 0; + goto out; + } + delta_data_head = unpack_sha1_rest(&stream, hdr, 20); + if (!delta_data_head) { + ret = -1; + goto out; + } + ret = 1; + memcpy(base_sha1, delta_data_head, 20); + free(delta_data_head); + out: + inflateEnd(&stream); + munmap(map, mapsize); + return ret; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 18:55 ` Junio C Hamano @ 2005-06-02 21:31 ` Nicolas Pitre 2005-06-02 21:36 ` Nicolas Pitre 0 siblings, 1 reply; 23+ messages in thread From: Nicolas Pitre @ 2005-06-02 21:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, git On Thu, 2 Jun 2005, Junio C Hamano wrote: > The initial parts of each retrieved SHA1 file is inflated and > inspected to see if it is deltified, and its base object is > asked from the remote side when it is. Since this partial > inflation and inspection has a small performance hit, it can > optionally be skipped by giving -d flag to git-*-pull commands. It is still way more expensive than it could. > diff --git a/sha1_file.c b/sha1_file.c > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -325,7 +325,13 @@ void *unpack_sha1_rest(z_stream *stream, > int bytes = strlen(buffer) + 1; > char *buf = xmalloc(1+size); > > - memcpy(buf, buffer + bytes, stream->total_out - bytes); > + /* (stream->total_out - bytes) is what we already have. The > + * caller could be asking for something smaller than that. > + */ > + if (size < stream->total_out - bytes) > + memcpy(buf, buffer + bytes, size); > + else > + memcpy(buf, buffer + bytes, stream->total_out - bytes); > bytes = stream->total_out - bytes; > if (bytes < size) { > stream->next_out = buf + bytes; This hunk is completely unneeded. > @@ -401,6 +407,41 @@ void * unpack_sha1_file(void *map, unsig > return unpack_sha1_rest(&stream, hdr, *size); > } > > +int sha1_delta_base(const unsigned char *sha1, unsigned char *base_sha1) > +{ > + int ret; > + unsigned long mapsize, size; > + void *map; > + z_stream stream; > + char hdr[1024], type[20]; Don't make hdr 1024 bytes long. If you do so unpack_sha1_header() will uncompress up to 1024 bytes which is way above required. Instead, consider a value of say 64 which is plenty sufficient (10 for the type string, another 10 for the size, 20 for the reference sha1 and another 10 for the beginning of the delta data that must include the size, and the rest for good measure). > + void *delta_data_head; > + > + map = map_sha1_file(sha1, &mapsize); > + if (!map) > + return -1; > + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); > + if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) { > + ret = -1; > + goto out; > + } > + if (strcmp(type, "delta")) { > + ret = 0; > + goto out; > + } > + delta_data_head = unpack_sha1_rest(&stream, hdr, 20); Here you don't need to call unpack_sha1_rest() at all which would call xmalloc and another memcpy needlessly. Instead, just use: memcpy(base_sha1, hdr + strlen(hdr) + 1, 20); and you're done. No need to call an extra free() either. And maybe this function should live in delta.c instead? Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 21:31 ` Nicolas Pitre @ 2005-06-02 21:36 ` Nicolas Pitre 2005-06-02 22:19 ` [PATCH 1/2] " Junio C Hamano 2005-06-02 22:20 ` [PATCH 2/2] Find size of SHA1 object without inflating everything Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Nicolas Pitre @ 2005-06-02 21:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, git On Thu, 2 Jun 2005, Nicolas Pitre wrote: > Here you don't need to call unpack_sha1_rest() at all which would call > xmalloc and another memcpy needlessly. Instead, just use: > > memcpy(base_sha1, hdr + strlen(hdr) + 1, 20); > > and you're done. No need to call an extra free() either. > > And maybe this function should live in delta.c instead? Forget about my suggestion of moving it to delta.c. Since it assumes knowledge of the object header format it better stay close to the other functions in sha1_file.c. Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/2] Handle deltified object correctly in git-*-pull family. 2005-06-02 21:36 ` Nicolas Pitre @ 2005-06-02 22:19 ` Junio C Hamano 2005-06-02 22:48 ` Linus Torvalds 2005-06-02 22:20 ` [PATCH 2/2] Find size of SHA1 object without inflating everything Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 22:19 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, git >>>>> "NP" == Nicolas Pitre <nico@cam.org> writes: >> Here you don't need to call unpack_sha1_rest() at all which would call >> xmalloc and another memcpy needlessly. Instead,... Like this... ------------ When a remote repository is deltified, we need to get the objects that a deltified object we want to obtain is based upon. The initial parts of each retrieved SHA1 file is inflated and inspected to see if it is deltified, and its base object is asked from the remote side when it is. Since this partial inflation and inspection has a small performance hit, it can optionally be skipped by giving -d flag to git-*-pull commands. This flag should be used only when the remote repository is known to have no deltified objects. Rsync transport does not have this problem since it fetches everything the remote side has. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Thanks and credits goes to Nico for suggesting not to *** use unpack_sha1_rest(). Documentation/git-http-pull.txt | 6 +++++- Documentation/git-local-pull.txt | 6 +++++- Documentation/git-rpull.txt | 6 +++++- cache.h | 1 + pull.h | 3 +++ http-pull.c | 4 +++- local-pull.c | 4 +++- pull.c | 7 +++++++ rpull.c | 4 +++- sha1_file.c | 31 +++++++++++++++++++++++++++++++ 10 files changed, 66 insertions(+), 6 deletions(-) diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt --- a/Documentation/git-http-pull.txt +++ b/Documentation/git-http-pull.txt @@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r SYNOPSIS -------- -'git-http-pull' [-c] [-t] [-a] [-v] commit-id url +'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url DESCRIPTION ----------- @@ -21,6 +21,10 @@ Downloads a remote GIT repository via HT Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt --- a/Documentation/git-local-pull.txt +++ b/Documentation/git-local-pull.txt @@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT SYNOPSIS -------- -'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path +'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path DESCRIPTION ----------- @@ -23,6 +23,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt --- a/Documentation/git-rpull.txt +++ b/Documentation/git-rpull.txt @@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito SYNOPSIS -------- -'git-rpull' [-c] [-t] [-a] [-v] commit-id url +'git-rpull' [-c] [-t] [-a] [-d] [-v] commit-id url DESCRIPTION ----------- @@ -25,6 +25,10 @@ OPTIONS Get trees associated with the commit objects. -a:: Get all the objects. +-d:: + Do not check for delta base objects (use this option + only when you know the remote repository is not + deltified). -v:: Report what is downloaded. diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -153,6 +153,7 @@ extern char *sha1_file_name(const unsign extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size); extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size); extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep); +extern int sha1_delta_base(const unsigned char *, unsigned char *); extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size); extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size); extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1); diff --git a/pull.h b/pull.h --- a/pull.h +++ b/pull.h @@ -13,6 +13,9 @@ extern int get_history; /** Set to fetch the trees in the commit history. **/ extern int get_all; +/* Set to zero to skip the check for delta object base. */ +extern int get_delta; + /* Set to be verbose */ extern int get_verbosely; diff --git a/http-pull.c b/http-pull.c --- a/http-pull.c +++ b/http-pull.c @@ -103,6 +103,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -113,7 +115,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/local-pull.c b/local-pull.c --- a/local-pull.c +++ b/local-pull.c @@ -74,7 +74,7 @@ int fetch(unsigned char *sha1) } static const char *local_pull_usage = -"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path"; +"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path"; /* * By default we only use file copy. @@ -92,6 +92,8 @@ int main(int argc, char **argv) get_tree = 1; else if (argv[arg][1] == 'c') get_history = 1; + else if (argv[arg][1] == 'd') + get_delta = 0; else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; diff --git a/pull.c b/pull.c --- a/pull.c +++ b/pull.c @@ -6,6 +6,7 @@ int get_tree = 0; int get_history = 0; +int get_delta = 1; int get_all = 0; int get_verbosely = 0; static unsigned char current_commit_sha1[20]; @@ -37,6 +38,12 @@ static int make_sure_we_have_it(const ch status = fetch(sha1); if (status && what) report_missing(what, sha1); + if (get_delta) { + char delta_sha1[20]; + status = sha1_delta_base(sha1, delta_sha1); + if (0 < status) + status = make_sure_we_have_it(what, delta_sha1); + } return status; } diff --git a/rpull.c b/rpull.c --- a/rpull.c +++ b/rpull.c @@ -27,6 +27,8 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 0; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; @@ -37,7 +39,7 @@ int main(int argc, char **argv) arg++; } if (argc < arg + 2) { - usage("git-rpull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -401,6 +401,37 @@ void * unpack_sha1_file(void *map, unsig return unpack_sha1_rest(&stream, hdr, *size); } +int sha1_delta_base(const unsigned char *sha1, unsigned char *base_sha1) +{ + int ret; + unsigned long mapsize, size; + void *map; + z_stream stream; + char hdr[64], type[20]; + void *delta_data_head; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return -1; + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); + if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) { + ret = -1; + goto out; + } + if (strcmp(type, "delta")) { + ret = 0; + goto out; + } + + delta_data_head = hdr + strlen(hdr) + 1; + ret = 1; + memcpy(base_sha1, delta_data_head, 20); + out: + inflateEnd(&stream); + munmap(map, mapsize); + return ret; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] Handle deltified object correctly in git-*-pull family. 2005-06-02 22:19 ` [PATCH 1/2] " Junio C Hamano @ 2005-06-02 22:48 ` Linus Torvalds 0 siblings, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2005-06-02 22:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nicolas Pitre, git On Thu, 2 Jun 2005, Junio C Hamano wrote: > > Like this... Yup. Applied, thanks, Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/2] Find size of SHA1 object without inflating everything. 2005-06-02 21:36 ` Nicolas Pitre 2005-06-02 22:19 ` [PATCH 1/2] " Junio C Hamano @ 2005-06-02 22:20 ` Junio C Hamano 1 sibling, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 22:20 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, git This adds sha1_file_size() helper function and uses it in the rename/copy similarity estimator. The helper function handles deltified object as well. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Thanks and credits goes to Nico for suggesting not to *** use unpack_sha1_rest(). cache.h | 1 + diff.c | 11 ++++++----- sha1_file.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+), 5 deletions(-) diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -154,6 +154,7 @@ extern void * map_sha1_file(const unsign extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size); extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep); extern int sha1_delta_base(const unsigned char *, unsigned char *); +extern int sha1_file_size(const unsigned char *, unsigned long *); extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size); extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size); extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1); diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -333,7 +333,6 @@ int diff_populate_filespec(struct diff_f close(fd); } else { - /* We cannot do size only for SHA1 blobs */ char type[20]; struct sha1_size_cache *e; @@ -343,11 +342,13 @@ int diff_populate_filespec(struct diff_f s->size = e->size; return 0; } + if (!sha1_file_size(s->sha1, &s->size)) + locate_size_cache(s->sha1, s->size); + } + else { + s->data = read_sha1_file(s->sha1, type, &s->size); + s->should_free = 1; } - s->data = read_sha1_file(s->sha1, type, &s->size); - s->should_free = 1; - if (s->data && size_only) - locate_size_cache(s->sha1, s->size); } return 0; } diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -432,6 +432,66 @@ int sha1_delta_base(const unsigned char return ret; } +int sha1_file_size(const unsigned char *sha1, unsigned long *sizep) +{ + int ret, status; + unsigned long mapsize, size; + void *map; + z_stream stream; + char hdr[64], type[20]; + const unsigned char *data; + unsigned char cmd; + int i; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return -1; + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); + status = -1; + if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) + goto out; + if (strcmp(type, "delta")) { + *sizep = size; + status = 0; + goto out; + } + + /* We are dealing with a delta object. Inflated, the first + * 20 bytes hold the base object SHA1, and delta data follows + * immediately after it. + * + * The initial part of the delta starts at delta_data_head + + * 20. Borrow code from patch-delta to read the result size. + */ + data = hdr + strlen(hdr) + 1 + 20; + + /* Skip over the source size; we are not interested in + * it and we cannot verify it because we do not want + * to read the base object. + */ + cmd = *data++; + while (cmd) { + if (cmd & 1) + data++; + cmd >>= 1; + } + /* Read the result size */ + size = i = 0; + cmd = *data++; + while (cmd) { + if (cmd & 1) + size |= *data++ << i; + i += 8; + cmd >>= 1; + } + *sizep = size; + status = 0; + out: + inflateEnd(&stream); + munmap(map, mapsize); + return status; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Find size of SHA1 object without inflating everything. 2005-06-02 17:03 ` Linus Torvalds 2005-06-02 18:55 ` Junio C Hamano @ 2005-06-02 18:57 ` Junio C Hamano 2005-06-02 22:10 ` Linus Torvalds 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 18:57 UTC (permalink / raw) To: Linus Torvalds; +Cc: git This adds sha1_file_size() helper function and uses it in the rename/copy similarity estimator. The helper function handles deltified object as well. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** The U*MAX fix patch from previous round still applies, so I am *** resending it to you. This probably would depend on it, so *** please apply U*MAX fix patch before this one. cache.h | 1 + diff.c | 11 ++++++---- sha1_file.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 71 insertions(+), 5 deletions(-) diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -154,6 +154,7 @@ extern void * map_sha1_file(const unsign extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size); extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep); extern int sha1_delta_base(const unsigned char *, unsigned char *); +extern int sha1_file_size(const unsigned char *, unsigned long *); extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size); extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size); extern int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned char *return_sha1); diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -333,7 +333,6 @@ int diff_populate_filespec(struct diff_f close(fd); } else { - /* We cannot do size only for SHA1 blobs */ char type[20]; struct sha1_size_cache *e; @@ -343,11 +342,13 @@ int diff_populate_filespec(struct diff_f s->size = e->size; return 0; } + if (!sha1_file_size(s->sha1, &s->size)) + locate_size_cache(s->sha1, s->size); + } + else { + s->data = read_sha1_file(s->sha1, type, &s->size); + s->should_free = 1; } - s->data = read_sha1_file(s->sha1, type, &s->size); - s->should_free = 1; - if (s->data && size_only) - locate_size_cache(s->sha1, s->size); } return 0; } diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -442,6 +442,70 @@ int sha1_delta_base(const unsigned char return ret; } +int sha1_file_size(const unsigned char *sha1, unsigned long *sizep) +{ + int ret, status; + unsigned long mapsize, size; + void *map; + z_stream stream; + char hdr[1024], type[20]; + void *delta_data_head; + const unsigned char *data; + unsigned char cmd; + int i; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return -1; + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); + status = -1; + if (ret < Z_OK || parse_sha1_header(hdr, type, &size) < 0) + goto out; + if (strcmp(type, "delta")) { + *sizep = size; + status = 0; + goto out; + } + + /* We are dealing with delta object. Inflated, the first 20 + * bytes hold the base object SHA1, and delta data follows + * immediately after it. + */ + delta_data_head = unpack_sha1_rest(&stream, hdr, 200); + + /* The initial part of the delta starts at delta_data_head + + * 20. Borrow code from patch-delta to read the result size. + */ + + data = delta_data_head + 20; + + /* Skip over the source size; we are not interested in + * it and we cannot verify it because we do not want + * to read the base object. + */ + cmd = *data++; + while (cmd) { + if (cmd & 1) + data++; + cmd >>= 1; + } + /* Read the result size */ + size = i = 0; + cmd = *data++; + while (cmd) { + if (cmd & 1) + size |= *data++ << i; + i += 8; + cmd >>= 1; + } + *sizep = size; + status = 0; + out: + inflateEnd(&stream); + munmap(map, mapsize); + return status; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Find size of SHA1 object without inflating everything. 2005-06-02 18:57 ` [PATCH] " Junio C Hamano @ 2005-06-02 22:10 ` Linus Torvalds 2005-06-02 22:47 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-06-02 22:10 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, 2 Jun 2005, Junio C Hamano wrote: > > +int sha1_file_size(const unsigned char *sha1, unsigned long *sizep) ... > + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); ... > + delta_data_head = unpack_sha1_rest(&stream, hdr, 200); Why do you do this? You've already unpacked 1024 bytes (including the header), now you want to unpack at least 200 bytes past the header (which is less than what you already did. So here "unpack_sha1_rest()" just ends up being a "xmalloc + memcpy", but since you don't actually want the malloc (indeed, you're leaking it, as far as I can tell), it seems to be all bad.. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Find size of SHA1 object without inflating everything. 2005-06-02 22:10 ` Linus Torvalds @ 2005-06-02 22:47 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 22:47 UTC (permalink / raw) To: Linus Torvalds; +Cc: git >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: LT> On Thu, 2 Jun 2005, Junio C Hamano wrote: >> >> +int sha1_file_size(const unsigned char *sha1, unsigned long *sizep) LT> ... >> + ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)); LT> ... >> + delta_data_head = unpack_sha1_rest(&stream, hdr, 200); LT> Why do you do this? Because I was not thinking when I wrote "char hdr[1024]". Nico pointed out the same problem and you have a fixed version of both in your mailbox. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano 2005-06-02 17:03 ` Linus Torvalds @ 2005-06-02 17:03 ` McMullan, Jason 2005-06-02 18:02 ` Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: McMullan, Jason @ 2005-06-02 17:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, GIT Mailling list [-- Attachment #1: Type: text/plain, Size: 1276 bytes --] On Thu, 2005-06-02 at 09:46 -0700, Junio C Hamano wrote: > When a remote repository is deltified, we need to get the > objects that a deltified object we want to obtain is based upon. > The initial parts of each retrieved SHA1 file is inflated and > inspected to see if it is deltified, and its base object is > asked from the remote side when it is. Since this partial > inflation and inspection has a small performance hit, it can > optionally be skipped by giving -d flag to git-*-pull commands. > This flag should be used only when the remote repository is > known to have no deltified objects. Eww. Don't you want to attempt to get the referenced sha1 *before* you stick the delta blob into the repository? Otherwise, if a transfer fails, there's no good way to recover your database through the git-*pull interfaces, ie: Try to pull session 1: pull tree a pull delta b (references blob c) pull blob c -> fails! Try to pull session 2: pull tree a delta b - Found in database! ... Or do I not understand your code properly? (I'm working on a similar piece of code for my git-daemon protocol, which is why I'm worried about this issue) -- Jason McMullan <jason.mcmullan@timesys.com> TimeSys Corporation [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 17:03 ` [PATCH] Handle deltified object correctly in git-*-pull family McMullan, Jason @ 2005-06-02 18:02 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 18:02 UTC (permalink / raw) To: McMullan, Jason; +Cc: Linus Torvalds, GIT Mailling list >>>>> "JM" == McMullan, Jason <jason.mcmullan@timesys.com> writes: JM> Eww. Don't you want to attempt to get the referenced sha1 *before* JM> you stick the delta blob into the repository? That issue crossed my mind, and I admit I haven't looked at the issues closely enough, but I suspect that it may not worth it with the current pull.c structure. The current pull code fetches and stores a commit object before it retrieves the tree object associate with it, and similarly a tree object before its subtree and blobs, which has the same issue. Adding -r (recover) option to the pull family to not check for the existence of required object but its dependents would be necessary if my suspition turns out to be correct, and delta dependency should be handled the same way commit and tree dependencies are handled there. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Use correct U*MAX. 2005-06-01 18:38 ` [PATCH] diff: mode bits fixes Junio C Hamano 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano @ 2005-06-02 16:47 ` Junio C Hamano 2005-06-03 23:02 ` Petr Baudis 2005-06-02 16:49 ` [PATCH] Find size of SHA1 object without inflating everything Junio C Hamano 2 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 16:47 UTC (permalink / raw) To: Linus Torvalds; +Cc: git The largest "unsigned long" value is ULONG_MAX, not UINT_MAX. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Linus, you may be unable to spell, but I cannot type ;-). count-delta.c | 4 ++-- diff.c | 4 ++-- diffcore-break.c | 2 +- diffcore-rename.c | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/count-delta.c b/count-delta.c --- a/count-delta.c +++ b/count-delta.c @@ -46,7 +46,7 @@ unsigned long count_delta(void *delta_bu /* the smallest delta size possible is 6 bytes */ if (delta_size < 6) - return UINT_MAX; + return ULONG_MAX; data = delta_buf; top = delta_buf + delta_size; @@ -83,7 +83,7 @@ unsigned long count_delta(void *delta_bu /* sanity check */ if (data != top || out != dst_size) - return UINT_MAX; + return ULONG_MAX; /* delete size is what was _not_ copied from source. * edit size is that and literal additions. diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -256,7 +256,7 @@ static struct sha1_size_cache *locate_si first = next+1; } /* not found */ - if (size == UINT_MAX) + if (size == ULONG_MAX) return NULL; /* insert to make it at "first" */ if (sha1_size_cache_alloc <= sha1_size_cache_nr) { @@ -338,7 +338,7 @@ int diff_populate_filespec(struct diff_f struct sha1_size_cache *e; if (size_only) { - e = locate_size_cache(s->sha1, UINT_MAX); + e = locate_size_cache(s->sha1, ULONG_MAX); if (e) { s->size = e->size; return 0; diff --git a/diffcore-break.c b/diffcore-break.c --- a/diffcore-break.c +++ b/diffcore-break.c @@ -63,7 +63,7 @@ static int very_different(struct diff_fi /* Estimate the edit size by interpreting delta. */ delta_size = count_delta(delta, delta_size); free(delta); - if (delta_size == UINT_MAX) + if (delta_size == ULONG_MAX) return 0; /* error in delta computation */ if (base_size < delta_size) diff --git a/diffcore-rename.c b/diffcore-rename.c --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -176,7 +176,7 @@ static int estimate_similarity(struct di /* Estimate the edit size by interpreting delta. */ delta_size = count_delta(delta, delta_size); free(delta); - if (delta_size == UINT_MAX) + if (delta_size == ULONG_MAX) return 0; /* ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Use correct U*MAX. 2005-06-02 16:47 ` [PATCH] Use correct U*MAX Junio C Hamano @ 2005-06-03 23:02 ` Petr Baudis 2005-06-03 23:40 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Petr Baudis @ 2005-06-03 23:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, git Dear diary, on Thu, Jun 02, 2005 at 06:47:33PM CEST, I got a letter where Junio C Hamano <junkio@cox.net> told me that... > The largest "unsigned long" value is ULONG_MAX, not UINT_MAX. > > Signed-off-by: Junio C Hamano <junkio@cox.net> > diff --git a/diff.c b/diff.c > --- a/diff.c > +++ b/diff.c > @@ -256,7 +256,7 @@ static struct sha1_size_cache *locate_si > first = next+1; > } > /* not found */ > - if (size == UINT_MAX) > + if (size == ULONG_MAX) > return NULL; > /* insert to make it at "first" */ > if (sha1_size_cache_alloc <= sha1_size_cache_nr) { > @@ -338,7 +338,7 @@ int diff_populate_filespec(struct diff_f > struct sha1_size_cache *e; > > if (size_only) { > - e = locate_size_cache(s->sha1, UINT_MAX); > + e = locate_size_cache(s->sha1, ULONG_MAX); > if (e) { > s->size = e->size; > return 0; This one still applies, but it might be better to get rid of it altogether, like... [PATCH] Kill UINT_MAX usage in locate_size_cache() Use -1 instead of UINT_MAX to indicate that locate_size_cache() should do only the lookup and not create new stuff in case the looked up hash was not found. Signed-off-by: Petr Baudis <pasky@ucw.cz> diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -256,7 +256,7 @@ static struct sha1_size_cache *locate_si first = next+1; } /* not found */ - if (size == UINT_MAX) + if (size == -1) return NULL; /* insert to make it at "first" */ if (sha1_size_cache_alloc <= sha1_size_cache_nr) { @@ -337,7 +337,7 @@ int diff_populate_filespec(struct diff_f struct sha1_size_cache *e; if (size_only) { - e = locate_size_cache(s->sha1, UINT_MAX); + e = locate_size_cache(s->sha1, -1); if (e) { s->size = e->size; return 0; | -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Use correct U*MAX. 2005-06-03 23:02 ` Petr Baudis @ 2005-06-03 23:40 ` Junio C Hamano 2005-06-04 0:00 ` Petr Baudis 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-03 23:40 UTC (permalink / raw) To: Petr Baudis; +Cc: Linus Torvalds, git I'd rather see it use the correct U*MAX. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Use correct U*MAX. 2005-06-03 23:40 ` Junio C Hamano @ 2005-06-04 0:00 ` Petr Baudis 2005-06-04 0:09 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Petr Baudis @ 2005-06-04 0:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, git Dear diary, on Sat, Jun 04, 2005 at 01:40:21AM CEST, I got a letter where Junio C Hamano <junkio@cox.net> told me that... > I'd rather see it use the correct U*MAX. Care to elaborate? It doesn't make sense to me to use any U*MAX stuff there whatsoever. (And how do you define the "correct" U*MAX anyway?) P.S.: Could you please always keep at least some context in the mail? Thanks. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Use correct U*MAX. 2005-06-04 0:00 ` Petr Baudis @ 2005-06-04 0:09 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-04 0:09 UTC (permalink / raw) To: Petr Baudis; +Cc: Linus Torvalds, git >>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes: >> I'd rather see it use the correct U*MAX. PB> Care to elaborate? It doesn't make sense to me to use any U*MAX stuff PB> there whatsoever. (And how do you define the "correct" U*MAX anyway?) It just feels wrong to spell a parameter to the function locate_size_cache() "-1" when I know the argument it expects is of type unsigned long. And correct U*MAX for that case is obviously ULONG_MAX, _assuming_ that you agree to the function's (unwritten) calling convention of "passing the largest possible value to me means 'do not create', not 'you are telling me that sha1 is such a large file'". If you feel strongly about that calling convention, you could rewrite it to take the third argument "int do_not_create" and pass that information separately, which is conceptually cleaner. I just did not think that was worth it for such an internal helper when I wrote it. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Find size of SHA1 object without inflating everything. 2005-06-01 18:38 ` [PATCH] diff: mode bits fixes Junio C Hamano 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano 2005-06-02 16:47 ` [PATCH] Use correct U*MAX Junio C Hamano @ 2005-06-02 16:49 ` Junio C Hamano 2 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 16:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: git This adds sha1_file_size() helper function and uses it in the rename/copy similarity estimator. The helper function handles deltified object as well. Signed-off-by: Junio C Hamano <junkio@cox.net> --- *** Linus, this also uses the new helper you wrote. cache.h | 2 +- diff.c | 11 +++++--- sha1_file.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 85 insertions(+), 6 deletions(-) diff --git a/cache.h b/cache.h --- a/cache.h +++ b/cache.h @@ -159,7 +159,7 @@ extern int write_sha1_file(void *buf, un extern int check_sha1_signature(unsigned char *sha1, void *buf, unsigned long size, const char *type); extern int sha1_delta_base(const unsigned char *, unsigned char *); - +extern int sha1_file_size(const unsigned char *, unsigned long *); /* Read a tree into the cache */ extern int read_tree(void *buffer, unsigned long size, int stage); diff --git a/diff.c b/diff.c --- a/diff.c +++ b/diff.c @@ -333,7 +333,6 @@ int diff_populate_filespec(struct diff_f close(fd); } else { - /* We cannot do size only for SHA1 blobs */ char type[20]; struct sha1_size_cache *e; @@ -343,11 +342,13 @@ int diff_populate_filespec(struct diff_f s->size = e->size; return 0; } + if (!sha1_file_size(s->sha1, &s->size)) + locate_size_cache(s->sha1, s->size); + } + else { + s->data = read_sha1_file(s->sha1, type, &s->size); + s->should_free = 1; } - s->data = read_sha1_file(s->sha1, type, &s->size); - s->should_free = 1; - if (s->data && size_only) - locate_size_cache(s->sha1, s->size); } return 0; } diff --git a/sha1_file.c b/sha1_file.c --- a/sha1_file.c +++ b/sha1_file.c @@ -387,6 +387,84 @@ int sha1_delta_base(const unsigned char return status; } +int sha1_file_size(const unsigned char *sha1, unsigned long *sizep) +{ + unsigned long mapsize, size; + void *map; + char type[20]; + char buffer[200]; + z_stream stream; + int ret, bytes, status; + + map = map_sha1_file(sha1, &mapsize); + if (!map) + return 0; + ret = unpack_sha1_header(&stream, map, mapsize, buffer, + sizeof(buffer)); + status = -1; + + if (ret < Z_OK || sscanf(buffer, "%10s %lu", type, &size) != 2) + goto out; + if (strcmp(type, "delta")) { + *sizep = size; + status = 0; + goto out; + } + + /* We are dealing with delta object. Inflated, the first 20 + * bytes hold the base object SHA1, and delta data follows + * immediately after it. + */ + bytes = strlen(buffer) + 1; + if (size < bytes + 6 + 20) + goto out; /* the smallest delta size is 6 bytes */ + + memmove(buffer, buffer + bytes, stream.total_out - bytes); + bytes = stream.total_out - bytes; + if (bytes < sizeof(buffer) && ret == Z_OK) { + stream.next_out = buffer + bytes; + stream.avail_out = sizeof(buffer) - bytes; + while (inflate(&stream, Z_FINISH) == Z_OK) + ; /* nothing */ + } + + /* We have read initial part of the delta, which starts at + * buffer+20. Borrow code from patch-delta to read the + * result size. + */ + { + const unsigned char *data = buffer + 20; + unsigned char cmd; + int i; + + /* Skip over the source size; we are not interested in + * it and we cannot verify it because we do not want + * to read the base object. + */ + cmd = *data++; + while (cmd) { + if (cmd & 1) + data++; + cmd >>= 1; + } + /* Read the result size */ + size = i = 0; + cmd = *data++; + while (cmd) { + if (cmd & 1) + size |= *data++ << i; + i += 8; + cmd >>= 1; + } + *sizep = size; + } + status = 0; + out: + inflateEnd(&stream); + munmap(map, mapsize); + return status; +} + void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size) { unsigned long mapsize; ------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* I want to release a "git-1.0" @ 2005-05-30 20:00 Linus Torvalds 2005-05-30 20:49 ` Nicolas Pitre 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-05-30 20:00 UTC (permalink / raw) To: Git Mailing List Ok, I'm at the point where I really think it's getting close to a 1.0, and make another tar-ball etc. I obviously feel that it's already way superior to CVS, but I also realize that somebody who is used to CVS may not actually realize that very easily. So before I do a 1.0 release, I want to write some stupid git tutorial for a complete beginner that has only used CVS before, with a real example of how to use raw git, and along those lines I actually want the thing to show how to do something useful. So before I do that, is there something people think is just too hard for somebody coming from the CVS world to understand? I already realized that the "git-write-tree" + "git-commit-tree" interfaces were just _too_ hard to put into a sane tutorial. I was showing off raw git to Steve Chamberlain yesterday, and showing it to him made some things pretty obvious - one of them being that "git-init-db" really needed to set up the initial refs etc). So I wrote this silly "git-commit-script" to make it at least half-way palatable, but what else do people feel is "too hard"? I think I'll move the "cvs2git" script thing to git proper before the 1.0 release (again, in order to have the tutorial able to show what to do if you already have an existing CVS tree), what else? Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: I want to release a "git-1.0" 2005-05-30 20:00 I want to release a "git-1.0" Linus Torvalds @ 2005-05-30 20:49 ` Nicolas Pitre 2005-06-01 6:52 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Nicolas Pitre @ 2005-05-30 20:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: Git Mailing List On Mon, 30 May 2005, Linus Torvalds wrote: > > Ok, I'm at the point where I really think it's getting close to a 1.0, and > make another tar-ball etc. Any chance you could merge my latest mkdelta patch _please_ ??? I just posted it twice in the last 4 days and it still didn't appear in your repository. Again, the current version of mkdelta in your tree has a bug that can screw things up, and it is fixed in the latest patch of course. Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: I want to release a "git-1.0" 2005-05-30 20:49 ` Nicolas Pitre @ 2005-06-01 6:52 ` Junio C Hamano 2005-06-01 8:24 ` [PATCH] Add -d flag to git-pull-* family Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-01 6:52 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, Git Mailing List I just remembered that I mentioned potential problems with non rsync pulls with delta objects, especially when the git-*-pull commands are used in "things only close to the tip" mode, i.e. without "-a" option. Do you think we should do something about it before GIT 1.0 happens? It may be enough if we just tell people not to deltify their public non-rsync repositories in the documentation. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] Add -d flag to git-pull-* family. 2005-06-01 6:52 ` Junio C Hamano @ 2005-06-01 8:24 ` Junio C Hamano 2005-06-01 14:39 ` Nicolas Pitre 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-01 8:24 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, Git Mailing List When a remote repository is deltified, we need to get the objects that a deltified object we want to obtain is based upon. Since checking representation type of all objects we retreive from remote side may be costly, this is made into a separate option -d; -a implies it for convenience and safety. Rsync transport does not have this problem since it fetches everything the remote side has. Signed-off-by: Junio C Hamano <junkio@cox.net> --- Documentation/git-http-pull.txt | 4 +++- Documentation/git-local-pull.txt | 4 +++- Documentation/git-rpull.txt | 4 +++- http-pull.c | 5 ++++- local-pull.c | 5 ++++- pull.c | 15 +++++++++++++++ pull.h | 3 +++ rpull.c | 5 ++++- 8 files changed, 39 insertions(+), 6 deletions(-) diff --git a/Documentation/git-http-pull.txt b/Documentation/git-http-pull.txt --- a/Documentation/git-http-pull.txt +++ b/Documentation/git-http-pull.txt @@ -9,7 +9,7 @@ git-http-pull - Downloads a remote GIT r SYNOPSIS -------- -'git-http-pull' [-c] [-t] [-a] [-v] commit-id url +'git-http-pull' [-c] [-t] [-a] [-v] [-d] commit-id url DESCRIPTION ----------- @@ -17,6 +17,8 @@ Downloads a remote GIT repository via HT -c:: Get the commit objects. +-d:: + Get objects that deltified objects are based upon. -t:: Get trees associated with the commit objects. -a:: diff --git a/Documentation/git-local-pull.txt b/Documentation/git-local-pull.txt --- a/Documentation/git-local-pull.txt +++ b/Documentation/git-local-pull.txt @@ -9,7 +9,7 @@ git-local-pull - Duplicates another GIT SYNOPSIS -------- -'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path +'git-local-pull' [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path DESCRIPTION ----------- @@ -19,6 +19,8 @@ OPTIONS ------- -c:: Get the commit objects. +-d:: + Get objects that deltified objects are based upon. -t:: Get trees associated with the commit objects. -a:: diff --git a/Documentation/git-rpull.txt b/Documentation/git-rpull.txt --- a/Documentation/git-rpull.txt +++ b/Documentation/git-rpull.txt @@ -10,7 +10,7 @@ git-rpull - Pulls from a remote reposito SYNOPSIS -------- -'git-rpull' [-c] [-t] [-a] [-v] commit-id url +'git-rpull' [-c] [-t] [-a] [-v] [-d] commit-id url DESCRIPTION ----------- @@ -21,6 +21,8 @@ OPTIONS ------- -c:: Get the commit objects. +-d:: + Get objects that deltified objects are based upon. -t:: Get trees associated with the commit objects. -a:: diff --git a/http-pull.c b/http-pull.c --- a/http-pull.c +++ b/http-pull.c @@ -103,17 +103,20 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 1; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; get_history = 1; + get_delta = 1; } else if (argv[arg][1] == 'v') { get_verbosely = 1; } arg++; } if (argc < arg + 2) { - usage("git-http-pull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-http-pull [-c] [-t] [-a] [-d] [-v] commit-id url"); return 1; } commit_id = argv[arg]; diff --git a/local-pull.c b/local-pull.c --- a/local-pull.c +++ b/local-pull.c @@ -74,7 +74,7 @@ int fetch(unsigned char *sha1) } static const char *local_pull_usage = -"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] commit-id path"; +"git-local-pull [-c] [-t] [-a] [-l] [-s] [-n] [-v] [-d] commit-id path"; /* * By default we only use file copy. @@ -92,10 +92,13 @@ int main(int argc, char **argv) get_tree = 1; else if (argv[arg][1] == 'c') get_history = 1; + else if (argv[arg][1] == 'd') + get_delta = 1; else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; get_history = 1; + get_delta = 1; } else if (argv[arg][1] == 'l') use_link = 1; diff --git a/pull.c b/pull.c --- a/pull.c +++ b/pull.c @@ -6,6 +6,7 @@ int get_tree = 0; int get_history = 0; +int get_delta = 0; int get_all = 0; int get_verbosely = 0; static unsigned char current_commit_sha1[20]; @@ -37,6 +38,20 @@ static int make_sure_we_have_it(const ch status = fetch(sha1); if (status && what) report_missing(what, sha1); + if (get_delta) { + unsigned long mapsize, size; + void *map, *buf; + char type[20]; + + map = map_sha1_file(sha1, &mapsize); + if (map) { + buf = unpack_sha1_file(map, mapsize, type, &size); + munmap(map, mapsize); + if (buf && !strcmp(type, "delta")) + status = make_sure_we_have_it(what, buf); + free(buf); + } + } return status; } diff --git a/pull.h b/pull.h --- a/pull.h +++ b/pull.h @@ -13,6 +13,9 @@ extern int get_history; /** Set to fetch the trees in the commit history. **/ extern int get_all; +/* Set to fetch the base of delta objects.*/ +extern int get_delta; + /* Set to be verbose */ extern int get_verbosely; diff --git a/rpull.c b/rpull.c --- a/rpull.c +++ b/rpull.c @@ -27,17 +27,20 @@ int main(int argc, char **argv) get_tree = 1; } else if (argv[arg][1] == 'c') { get_history = 1; + } else if (argv[arg][1] == 'd') { + get_delta = 1; } else if (argv[arg][1] == 'a') { get_all = 1; get_tree = 1; get_history = 1; + get_delta = 1; } else if (argv[arg][1] == 'v') { get_verbosely = 1; } arg++; } if (argc < arg + 2) { - usage("git-rpull [-c] [-t] [-a] [-v] commit-id url"); + usage("git-rpull [-c] [-t] [-a] [-v] [-d] commit-id url"); return 1; } commit_id = argv[arg]; ------------------------------------------------ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Add -d flag to git-pull-* family. 2005-06-01 8:24 ` [PATCH] Add -d flag to git-pull-* family Junio C Hamano @ 2005-06-01 14:39 ` Nicolas Pitre 2005-06-01 16:00 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Nicolas Pitre @ 2005-06-01 14:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List On Wed, 1 Jun 2005, Junio C Hamano wrote: > When a remote repository is deltified, we need to get the > objects that a deltified object we want to obtain is based upon. > Since checking representation type of all objects we retreive > from remote side may be costly, this is made into a separate > option -d; -a implies it for convenience and safety. I wonder if making this optional makes sense. In fact, if you believe having the option is useful then it should probably be the other way around i.e. to _not_ look at deltas when it is specified. Otherwise you'll end up with an incoherent repository. To minimize the cost a lot it could be possible to uncompress just the first 40 bytes or so which is enough to determine if the object is a delta and if so what object it is against. What do you think? Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Add -d flag to git-pull-* family. 2005-06-01 14:39 ` Nicolas Pitre @ 2005-06-01 16:00 ` Junio C Hamano [not found] ` <7v1x7lk8fl.fsf_-_@assigned-by-dhcp.cox.net> 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-06-01 16:00 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, Git Mailing List >>>>> "NP" == Nicolas Pitre <nico@cam.org> writes: NP> What do you think? What you say makes a lot more sense than my quick hack on both counts. ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <7v1x7lk8fl.fsf_-_@assigned-by-dhcp.cox.net>]
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. [not found] ` <7v1x7lk8fl.fsf_-_@assigned-by-dhcp.cox.net> @ 2005-06-02 0:47 ` Nicolas Pitre 2005-06-02 0:58 ` Linus Torvalds 1 sibling, 0 replies; 23+ messages in thread From: Nicolas Pitre @ 2005-06-02 0:47 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, Daniel Barkalow <barkalow@iabervon.org> Git Mailing List On Wed, 1 Jun 2005, Junio C Hamano wrote: > *** Dan and Nico, could you check this for correctness? I've > *** tested it with a deltified core GIT repository and pulling > *** with local-pull from there. I have verified that a pull > *** that fails with -d flag retrieves the right base-object to > *** complete a deltified ones. The delta part looks fine to me. Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. [not found] ` <7v1x7lk8fl.fsf_-_@assigned-by-dhcp.cox.net> 2005-06-02 0:47 ` [PATCH] Handle deltified object correctly in git-*-pull family Nicolas Pitre @ 2005-06-02 0:58 ` Linus Torvalds 2005-06-02 1:43 ` Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-06-02 0:58 UTC (permalink / raw) To: Junio C Hamano Cc: Nicolas Pitre, Daniel Barkalow <barkalow@iabervon.org> Git Mailing List On Wed, 1 Jun 2005, Junio C Hamano wrote: > > *** Linus, I have a hook in sha1_file.c to let me figure out the > *** size of the SHA1 file without fully expanding it. This > *** patch does not use it, but you already know where I am > *** heading, so please leave it there ;-). Argh. This is just adding conceptual complexity without any real advantage. Why not just split out the current "unpack_sha1_file()" into two stages: "unpack_sha1_header()" and the rest. Then you can just decide to call "unpack_sha1_header()" when you want the header information. Hmm. I just committed something like that. If you want to just see the type of an object, you can map the object in memory, and just do z_stream stream; char buffer[100]; if (unpack_sha1_header(&stream, map, mapsize, buffer, sizeof(buffer) < 0) return NULL; if (sscanf(buffer, %10s %lu", type, size) != 0) return NULL; .. there you have it .. which is a lot simpler than worrying about callbacks etc. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Handle deltified object correctly in git-*-pull family. 2005-06-02 0:58 ` Linus Torvalds @ 2005-06-02 1:43 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-06-02 1:43 UTC (permalink / raw) To: Linus Torvalds Cc: Nicolas Pitre, Daniel Barkalow <barkalow@iabervon.org> Git Mailing List >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: LT> Why not just split out the current "unpack_sha1_file()" into two stages: LT> "unpack_sha1_header()" and the rest. LT> which is a lot simpler than worrying about callbacks etc. Alright. ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2005-06-04 0:09 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <7vy89ums2l.fsf@assigned-by-dhcp.cox.net> 2005-06-01 18:38 ` [PATCH] diff: mode bits fixes Junio C Hamano 2005-06-02 16:46 ` [PATCH] Handle deltified object correctly in git-*-pull family Junio C Hamano 2005-06-02 17:03 ` Linus Torvalds 2005-06-02 18:55 ` Junio C Hamano 2005-06-02 21:31 ` Nicolas Pitre 2005-06-02 21:36 ` Nicolas Pitre 2005-06-02 22:19 ` [PATCH 1/2] " Junio C Hamano 2005-06-02 22:48 ` Linus Torvalds 2005-06-02 22:20 ` [PATCH 2/2] Find size of SHA1 object without inflating everything Junio C Hamano 2005-06-02 18:57 ` [PATCH] " Junio C Hamano 2005-06-02 22:10 ` Linus Torvalds 2005-06-02 22:47 ` Junio C Hamano 2005-06-02 17:03 ` [PATCH] Handle deltified object correctly in git-*-pull family McMullan, Jason 2005-06-02 18:02 ` Junio C Hamano 2005-06-02 16:47 ` [PATCH] Use correct U*MAX Junio C Hamano 2005-06-03 23:02 ` Petr Baudis 2005-06-03 23:40 ` Junio C Hamano 2005-06-04 0:00 ` Petr Baudis 2005-06-04 0:09 ` Junio C Hamano 2005-06-02 16:49 ` [PATCH] Find size of SHA1 object without inflating everything Junio C Hamano 2005-05-30 20:00 I want to release a "git-1.0" Linus Torvalds 2005-05-30 20:49 ` Nicolas Pitre 2005-06-01 6:52 ` Junio C Hamano 2005-06-01 8:24 ` [PATCH] Add -d flag to git-pull-* family Junio C Hamano 2005-06-01 14:39 ` Nicolas Pitre 2005-06-01 16:00 ` Junio C Hamano [not found] ` <7v1x7lk8fl.fsf_-_@assigned-by-dhcp.cox.net> 2005-06-02 0:47 ` [PATCH] Handle deltified object correctly in git-*-pull family Nicolas Pitre 2005-06-02 0:58 ` Linus Torvalds 2005-06-02 1:43 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).