* [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file
@ 2007-06-10 9:00 Sam Vilain
  2007-06-10 21:24 ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Sam Vilain @ 2007-06-10 9:00 UTC
  To: Eric Wong; +Cc: git

This saves a bit of time when rebuilding the git-svn index.

Signed-off-by: Sam Vilain <sam@vilain.net>
---
 git-svn.perl | 30 +++++++++++++++++++-----------
 1 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index e350061..610563c 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -802,10 +802,15 @@ sub cmt_metadata {
 
 sub working_head_info {
 	my ($head, $refs) = @_;
-	my ($fh, $ctx) = command_output_pipe('rev-list', $head);
-	while (my $hash = <$fh>) {
-		chomp($hash);
-		my ($url, $rev, $uuid) = cmt_metadata($hash);
+	my ($fh, $ctx) = command_output_pipe('log', $head);
+	my $hash;
+	while (<$fh>) {
+		if ( m{^commit ($::sha1)$} ) {
+			$hash = $1;
+			next;
+		}
+		next unless s{^\s+(git-svn-id:)}{$1};
+		my ($url, $rev, $uuid) = extract_metadata($_);
 		if (defined $url && defined $rev) {
 			if (my $gs = Git::SVN->find_by_url($url)) {
 				my $c = $gs->rev_db_get($rev);
@@ -1964,16 +1969,19 @@ sub rebuild {
 		return;
 	}
 	print "Rebuilding $db_path ...\n";
-	my ($rev_list, $ctx) = command_output_pipe("rev-list", $self->refname);
+	my ($log, $ctx) = command_output_pipe("log", $self->refname);
 	my $latest;
 	my $full_url = $self->full_url;
 	remove_username($full_url);
 	my $svn_uuid;
-	while (<$rev_list>) {
-		chomp;
-		my $c = $_;
-		die "Non-SHA1: $c\n" unless $c =~ /^$::sha1$/o;
-		my ($url, $rev, $uuid) = ::cmt_metadata($c);
+	my $c;
+	while (<$log>) {
+		if ( m{^commit ($::sha1)$} ) {
+			$c = $1;
+			next;
+		}
+		next unless s{^\s*(git-svn-id:)}{$1};
+		my ($url, $rev, $uuid) = ::extract_metadata($_);
 		remove_username($url);
 
 		# ignore merges (from set-tree)
@@ -1991,7 +1999,7 @@ sub rebuild {
 		$self->rev_db_set($rev, $c);
 		print "r$rev = $c\n";
 	}
-	command_close_pipe($rev_list, $ctx);
+	command_close_pipe($log, $ctx);
 	print "Done rebuilding $db_path\n";
 }
 
-- 
1.5.0.4.210.gf8a7c-dirty
* Re: [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file
  2007-06-10 9:00 [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file Sam Vilain
@ 2007-06-10 21:24 ` Eric Wong
  2007-06-11 7:34 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2007-06-10 21:24 UTC
  To: Sam Vilain; +Cc: git

Sam Vilain <sam@vilain.net> wrote:
> This saves a bit of time when rebuilding the git-svn index.

Does git-log still have the 16k buffer limit?  If so then we can't use
it because commit messages over 16k will be truncated and the git-svn-id
line will not show up.  Also, if that limit is removed I'd prefer to
just add --pretty=raw to rev-list because git-log is still porcelain and
more likely to change.

> Signed-off-by: Sam Vilain <sam@vilain.net>
> ---
>  git-svn.perl | 30 +++++++++++++++++++-----------
>  1 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index e350061..610563c 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -802,10 +802,15 @@ sub cmt_metadata {
>
>  sub working_head_info {
>  	my ($head, $refs) = @_;
> -	my ($fh, $ctx) = command_output_pipe('rev-list', $head);
> -	while (my $hash = <$fh>) {
> -		chomp($hash);
> -		my ($url, $rev, $uuid) = cmt_metadata($hash);
> +	my ($fh, $ctx) = command_output_pipe('log', $head);
> +	my $hash;
> +	while (<$fh>) {
> +		if ( m{^commit ($::sha1)$} ) {
> +			$hash = $1;
> +			next;
> +		}
> +		next unless s{^\s+(git-svn-id:)}{$1};
> +		my ($url, $rev, $uuid) = extract_metadata($_);
>  		if (defined $url && defined $rev) {
>  			if (my $gs = Git::SVN->find_by_url($url)) {
>  				my $c = $gs->rev_db_get($rev);
> @@ -1964,16 +1969,19 @@ sub rebuild {
>  		return;
>  	}
>  	print "Rebuilding $db_path ...\n";
> -	my ($rev_list, $ctx) = command_output_pipe("rev-list", $self->refname);
> +	my ($log, $ctx) = command_output_pipe("log", $self->refname);
>  	my $latest;
>  	my $full_url = $self->full_url;
>  	remove_username($full_url);
>  	my $svn_uuid;
> -	while (<$rev_list>) {
> -		chomp;
> -		my $c = $_;
> -		die "Non-SHA1: $c\n" unless $c =~ /^$::sha1$/o;
> -		my ($url, $rev, $uuid) = ::cmt_metadata($c);
> +	my $c;
> +	while (<$log>) {
> +		if ( m{^commit ($::sha1)$} ) {
> +			$c = $1;
> +			next;
> +		}
> +		next unless s{^\s*(git-svn-id:)}{$1};
> +		my ($url, $rev, $uuid) = ::extract_metadata($_);
>  		remove_username($url);
>
>  		# ignore merges (from set-tree)
> @@ -1991,7 +1999,7 @@ sub rebuild {
>  		$self->rev_db_set($rev, $c);
>  		print "r$rev = $c\n";
>  	}
> -	command_close_pipe($rev_list, $ctx);
> +	command_close_pipe($log, $ctx);
>  	print "Done rebuilding $db_path\n";
>  }
>
> -- 
> 1.5.0.4.210.gf8a7c-dirty
>

-- 
Eric Wong
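For readers who do not read Perl, the scan in the patch under discussion can be sketched in C roughly as follows. This is only an illustration, not code from git or git-svn: the names parse_commit_line, match_svn_id and find_first_svn_id are made up here, and it assumes default "git log" output in which message lines are indented and the whole message reaches the reader, which is exactly what the 16k question above puts in doubt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHA1_HEX_LEN 40

/* True if c is a lowercase hex digit, as in full SHA-1 output. */
static int is_lower_hex(char c)
{
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/* If line[0..len) is exactly "commit <40 hex>", copy the hash, return 1. */
static int parse_commit_line(const char *line, size_t len, char *hash_out)
{
	size_t i;

	if (len != 7 + SHA1_HEX_LEN || memcmp(line, "commit ", 7) != 0)
		return 0;
	for (i = 0; i < SHA1_HEX_LEN; i++)
		if (!is_lower_hex(line[7 + i]))
			return 0;
	memcpy(hash_out, line + 7, SHA1_HEX_LEN);
	hash_out[SHA1_HEX_LEN] = '\0';
	return 1;
}

/*
 * If the line is an indented "git-svn-id:" trailer, return a pointer to
 * the text after the key and store its length in *rest_len; else NULL.
 */
static const char *match_svn_id(const char *line, size_t len, size_t *rest_len)
{
	static const char key[] = "git-svn-id:";
	const size_t keylen = sizeof(key) - 1;
	size_t i = 0;

	while (i < len && (line[i] == ' ' || line[i] == '\t'))
		i++;
	if (i == 0 || len - i < keylen || memcmp(line + i, key, keylen) != 0)
		return NULL;
	i += keylen;
	while (i < len && line[i] == ' ')
		i++;
	*rest_len = len - i;
	return line + i;
}

/*
 * Walk the log text line by line, remembering the most recent commit
 * hash, and stop at the first git-svn-id trailer.  hash_out needs room
 * for SHA1_HEX_LEN + 1 bytes.  Returns 1 if a trailer was found.
 */
static int find_first_svn_id(const char *log, char *hash_out,
			     char *id_out, size_t id_sz)
{
	char hash[SHA1_HEX_LEN + 1] = "";
	const char *line = log;

	assert(log && hash_out && id_out && id_sz > 0);
	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (!parse_commit_line(line, len, hash) && hash[0]) {
			size_t rest_len;
			const char *rest = match_svn_id(line, len, &rest_len);

			if (rest) {
				if (rest_len >= id_sz)
					rest_len = id_sz - 1;
				memcpy(id_out, rest, rest_len);
				id_out[rest_len] = '\0';
				memcpy(hash_out, hash, sizeof(hash));
				return 1;
			}
		}
		line = eol ? eol + 1 : line + len;
	}
	return 0;
}
```

The point of the shape is that one pass over a single stream replaces one cat-file per commit; the cost is that the reader now depends on the layout of the log output.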
* Re: [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file
  2007-06-10 21:24 ` Eric Wong
@ 2007-06-11 7:34 ` Junio C Hamano
  2007-06-12 5:34 ` [PATCH] Extend --pretty=oneline to cover the first paragraph, so that an ugly commit message like this can be handled sanely Junio C Hamano
  2007-06-12 6:17 ` [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file Eric Wong
  0 siblings, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-06-11 7:34 UTC
  To: Eric Wong; +Cc: Sam Vilain, git

Eric Wong <normalperson@yhbt.net> writes:

> Sam Vilain <sam@vilain.net> wrote:
>> This saves a bit of time when rebuilding the git-svn index.
>
> Does git-log still have the 16k buffer limit?  If so then we can't use
> it because commit messages over 16k will be truncated and the git-svn-id
> line will not show up.  Also, if that limit is removed I'd prefer to
> just add --pretty=raw to rev-list because git-log is still porcelain and
> more likely to change.

How about this?  It passes the test suite, but other than that
hasn't seen much testing yet.  I tried to be careful, but sanity
checking by extra sets of eyeballs would be needed.

It changes length from int to unsigned long in several places, and
you would need to look out for a boolean test like this:

	if (current_length < limit_length - slop)
		... do something ...

as it now should be written like this:

	if (current_length + slop < limit_length)
		... do something ...

-- >8 --
Subject: Lift 16kB limit of log message output

Traditionally we had a 16kB limit when formatting log messages for
output, because it was easier to arrange for the caller to have
a reasonably big buffer and pass it down without ever worrying
about reallocating.

This changes the calling convention of pretty_print_commit() to
lift this limit.
Instead of the buffer and remaining length, it now takes a pointer to the pointer that points at the allocated buffer, and another pointer to the location that stores the allocated length, and reallocates the buffer as necessary. To support the user format, the error return of interpolate() needed to be changed. It used to return a bool telling "Ok the result fits", or "Sorry, I had to truncate it". Now it returns 0 on success, and returns the size of the buffer it wants in order to fit the whole result. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- builtin-branch.c | 17 +++++++++----- builtin-log.c | 6 +++- builtin-rev-list.c | 8 ++++-- builtin-show-branch.c | 23 +++++++++++--------- commit.c | 55 ++++++++++++++++++++++++++++++++++++++---------- commit.h | 2 +- interpolate.c | 46 ++++++++++++++++++---------------------- interpolate.h | 6 ++-- log-tree.c | 35 ++++++++++++++++++++---------- 9 files changed, 124 insertions(+), 74 deletions(-) diff --git a/builtin-branch.c b/builtin-branch.c index da48051..d7c321a 100644 --- a/builtin-branch.c +++ b/builtin-branch.c @@ -242,7 +242,6 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose, char c; int color; struct commit *commit; - char subject[256]; switch (item->kind) { case REF_LOCAL_BRANCH: @@ -263,17 +262,23 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose, } if (verbose) { + char *subject = NULL; + unsigned long subject_len = 0; + const char *sub = " **** invalid ref ****"; + commit = lookup_commit(item->sha1); - if (commit && !parse_commit(commit)) + if (commit && !parse_commit(commit)) { pretty_print_commit(CMIT_FMT_ONELINE, commit, ~0, - subject, sizeof(subject), 0, + &subject, &subject_len, 0, NULL, NULL, 0); - else - strcpy(subject, " **** invalid ref ****"); + sub = subject; + } printf("%c %s%-*s%s %s %s\n", c, branch_get_color(color), maxwidth, item->name, branch_get_color(COLOR_BRANCH_RESET), - find_unique_abbrev(item->sha1, abbrev), subject); 
+ find_unique_abbrev(item->sha1, abbrev), sub); + if (subject) + free(subject); } else { printf("%c %s%s%s\n", c, branch_get_color(color), item->name, branch_get_color(COLOR_BRANCH_RESET)); diff --git a/builtin-log.c b/builtin-log.c index 0aede76..b9035ab 100644 --- a/builtin-log.c +++ b/builtin-log.c @@ -742,11 +742,13 @@ int cmd_cherry(int argc, const char **argv, const char *prefix) sign = '-'; if (verbose) { - static char buf[16384]; + char *buf = NULL; + unsigned long buflen = 0; pretty_print_commit(CMIT_FMT_ONELINE, commit, ~0, - buf, sizeof(buf), 0, NULL, NULL, 0); + &buf, &buflen, 0, NULL, NULL, 0); printf("%c %s %s\n", sign, sha1_to_hex(commit->object.sha1), buf); + free(buf); } else { printf("%c %s\n", sign, diff --git a/builtin-rev-list.c b/builtin-rev-list.c index ebf53f5..813aadf 100644 --- a/builtin-rev-list.c +++ b/builtin-rev-list.c @@ -92,11 +92,13 @@ static void show_commit(struct commit *commit) putchar('\n'); if (revs.verbose_header) { - static char pretty_header[16384]; + char *buf = NULL; + unsigned long buflen = 0; pretty_print_commit(revs.commit_format, commit, ~0, - pretty_header, sizeof(pretty_header), + &buf, &buflen, revs.abbrev, NULL, NULL, revs.date_mode); - printf("%s%c", pretty_header, hdr_termination); + printf("%s%c", buf, hdr_termination); + free(buf); } fflush(stdout); if (commit->parents) { diff --git a/builtin-show-branch.c b/builtin-show-branch.c index c892f1f..4fa87f6 100644 --- a/builtin-show-branch.c +++ b/builtin-show-branch.c @@ -259,17 +259,19 @@ static void join_revs(struct commit_list **list_p, static void show_one_commit(struct commit *commit, int no_name) { - char pretty[256], *cp; + char *pretty = NULL; + const char *pretty_str = "(unavailable)"; + unsigned long pretty_len = 0; struct commit_name *name = commit->util; - if (commit->object.parsed) + + if (commit->object.parsed) { pretty_print_commit(CMIT_FMT_ONELINE, commit, ~0, - pretty, sizeof(pretty), 0, NULL, NULL, 0); - else - strcpy(pretty, "(unavailable)"); - 
if (!prefixcmp(pretty, "[PATCH] ")) - cp = pretty + 8; - else - cp = pretty; + &pretty, &pretty_len, + 0, NULL, NULL, 0); + pretty_str = pretty; + } + if (!prefixcmp(pretty_str, "[PATCH] ")) + pretty_str += 8; if (!no_name) { if (name && name->head_name) { @@ -286,7 +288,8 @@ static void show_one_commit(struct commit *commit, int no_name) printf("[%s] ", find_unique_abbrev(commit->object.sha1, 7)); } - puts(cp); + puts(pretty_str); + free(pretty); } static char *ref_name[MAX_REVS + 1]; diff --git a/commit.c b/commit.c index 4ca4d44..d43a68e 100644 --- a/commit.c +++ b/commit.c @@ -776,7 +776,7 @@ static void fill_person(struct interp *table, const char *msg, int len) } static long format_commit_message(const struct commit *commit, - const char *msg, char *buf, unsigned long space) + const char *msg, char **buf_p, unsigned long *space_p) { struct interp table[] = { { "%H" }, /* commit hash */ @@ -905,16 +905,27 @@ static long format_commit_message(const struct commit *commit, if (!table[i].value) interp_set_entry(table, i, "<unknown>"); - interpolate(buf, space, user_format, table, ARRAY_SIZE(table)); + do { + char *buf = *buf_p; + unsigned long space = *space_p; + + space = interpolate(buf, space, user_format, + table, ARRAY_SIZE(table)); + if (!space) + break; + buf = xrealloc(buf, space); + *buf_p = buf; + *space_p = space; + } while (1); interp_clear_table(table, ARRAY_SIZE(table)); - return strlen(buf); + return strlen(*buf_p); } unsigned long pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit, unsigned long len, - char *buf, unsigned long space, + char **buf_p, unsigned long *space_p, int abbrev, const char *subject, const char *after_subject, enum date_mode dmode) @@ -927,9 +938,11 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, int plain_non_ascii = 0; char *reencoded; const char *encoding; + char *buf; + unsigned long space, slop; if (fmt == CMIT_FMT_USERFORMAT) - return format_commit_message(commit, msg, buf, space); + return 
format_commit_message(commit, msg, buf_p, space_p); encoding = (git_log_output_encoding ? git_log_output_encoding @@ -969,6 +982,26 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, } } + space = *space_p; + buf = *buf_p; + + /* + * We do not want to repeatedly realloc below, so + * preallocate with enough slop to hold MIME headers, + * "Subject: " prefix, etc. + */ + slop = 1000; + if (subject) + slop += strlen(subject); + if (after_subject) + slop += strlen(after_subject); + if (space < strlen(msg) + slop) { + space = strlen(msg) + slop; + buf = xrealloc(buf, space); + *space_p = space; + *buf_p = buf; + } + for (;;) { const char *line = msg; int linelen = get_one_line(msg, len); @@ -976,14 +1009,12 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, if (!linelen) break; - /* - * We want some slop for indentation and a possible - * final "...". Thus the "+ 20". - */ + /* 20 would cover indent and leave us some slop */ if (offset + linelen + 20 > space) { - memcpy(buf + offset, " ...\n", 8); - offset += 8; - break; + space = offset + linelen + 20; + buf = xrealloc(buf, space); + *buf_p = buf; + *space_p = space; } msg += linelen; diff --git a/commit.h b/commit.h index a313b53..467872e 100644 --- a/commit.h +++ b/commit.h @@ -61,7 +61,7 @@ enum cmit_fmt { }; extern enum cmit_fmt get_commit_format(const char *arg); -extern unsigned long pretty_print_commit(enum cmit_fmt fmt, const struct commit *, unsigned long len, char *buf, unsigned long space, int abbrev, const char *subject, const char *after_subject, enum date_mode dmode); +extern unsigned long pretty_print_commit(enum cmit_fmt fmt, const struct commit *, unsigned long len, char **buf_p, unsigned long *space_p, int abbrev, const char *subject, const char *after_subject, enum date_mode dmode); /** Removes the first commit from a list sorted by date, and adds all * of its parents. 
diff --git a/interpolate.c b/interpolate.c index fb30694..0082677 100644 --- a/interpolate.c +++ b/interpolate.c @@ -44,33 +44,33 @@ void interp_clear_table(struct interp *table, int ninterps) * { "%%", "%"}, * } * - * Returns 1 on a successful substitution pass that fits in result, - * Returns 0 on a failed or overflowing substitution pass. + * Returns 0 on a successful substitution pass that fits in result, + * Returns a number of bytes needed to hold the full substituted + * string otherwise. */ -int interpolate(char *result, int reslen, +unsigned long interpolate(char *result, unsigned long reslen, const char *orig, const struct interp *interps, int ninterps) { const char *src = orig; char *dest = result; - int newlen = 0; + unsigned long newlen = 0; const char *name, *value; - int namelen, valuelen; + unsigned long namelen, valuelen; int i; char c; memset(result, 0, reslen); - while ((c = *src) && newlen < reslen - 1) { + while ((c = *src)) { if (c == '%') { /* Try to match an interpolation string. */ for (i = 0; i < ninterps; i++) { name = interps[i].name; namelen = strlen(name); - if (strncmp(src, name, namelen) == 0) { + if (strncmp(src, name, namelen) == 0) break; - } } /* Check for valid interpolation. */ @@ -78,29 +78,25 @@ int interpolate(char *result, int reslen, value = interps[i].value; valuelen = strlen(value); - if (newlen + valuelen < reslen - 1) { + if (newlen + valuelen + 1 < reslen) { /* Substitute. */ strncpy(dest, value, valuelen); - newlen += valuelen; dest += valuelen; - src += namelen; - } else { - /* Something's not fitting. */ - return 0; } - - } else { - /* Skip bogus interpolation. */ - *dest++ = *src++; - newlen++; + newlen += valuelen; + src += namelen; + continue; } - - } else { - /* Straight copy one non-interpolation character. */ - *dest++ = *src++; - newlen++; } + /* Straight copy one non-interpolation character. 
*/ + if (newlen + 1 < reslen) + *dest++ = *src; + src++; + newlen++; } - return newlen < reslen - 1; + if (newlen + 1 < reslen) + return 0; + else + return newlen + 2; } diff --git a/interpolate.h b/interpolate.h index 16a26b9..77407e6 100644 --- a/interpolate.h +++ b/interpolate.h @@ -19,8 +19,8 @@ struct interp { extern void interp_set_entry(struct interp *table, int slot, const char *value); extern void interp_clear_table(struct interp *table, int ninterps); -extern int interpolate(char *result, int reslen, - const char *orig, - const struct interp *interps, int ninterps); +extern unsigned long interpolate(char *result, unsigned long reslen, + const char *orig, + const struct interp *interps, int ninterps); #endif /* INTERPOLATE_H */ diff --git a/log-tree.c b/log-tree.c index 4bef909..0cf21bc 100644 --- a/log-tree.c +++ b/log-tree.c @@ -79,16 +79,25 @@ static int detect_any_signoff(char *letter, int size) return seen_head && seen_name; } -static int append_signoff(char *buf, int buf_sz, int at, const char *signoff) +static unsigned long append_signoff(char **buf_p, unsigned long *buf_sz_p, + unsigned long at, const char *signoff) { static const char signed_off_by[] = "Signed-off-by: "; - int signoff_len = strlen(signoff); + size_t signoff_len = strlen(signoff); int has_signoff = 0; - char *cp = buf; - - /* Do we have enough space to add it? 
*/ - if (buf_sz - at <= strlen(signed_off_by) + signoff_len + 3) - return at; + char *cp; + char *buf; + unsigned long buf_sz; + + buf = *buf_p; + buf_sz = *buf_sz_p; + if (buf_sz <= at + strlen(signed_off_by) + signoff_len + 3) { + buf_sz += strlen(signed_off_by) + signoff_len + 3; + buf = xrealloc(buf, buf_sz); + *buf_p = buf; + *buf_sz_p = buf_sz; + } + cp = buf; /* First see if we already have the sign-off by the signer */ while ((cp = strstr(cp, signed_off_by))) { @@ -133,7 +142,8 @@ static unsigned int digits_in_number(unsigned int number) void show_log(struct rev_info *opt, const char *sep) { - static char this_header[16384]; + char *msgbuf = NULL; + unsigned long msgbuf_len = 0; struct log_info *log = opt->loginfo; struct commit *commit = log->commit, *parent = log->parent; int abbrev = opt->diffopt.abbrev; @@ -278,14 +288,15 @@ void show_log(struct rev_info *opt, const char *sep) /* * And then the pretty-printed message itself */ - len = pretty_print_commit(opt->commit_format, commit, ~0u, this_header, - sizeof(this_header), abbrev, subject, + len = pretty_print_commit(opt->commit_format, commit, ~0u, + &msgbuf, &msgbuf_len, abbrev, subject, extra_headers, opt->date_mode); if (opt->add_signoff) - len = append_signoff(this_header, sizeof(this_header), len, + len = append_signoff(&msgbuf, &msgbuf_len, len, opt->add_signoff); - printf("%s%s%s", this_header, extra, sep); + printf("%s%s%s", msgbuf, extra, sep); + free(msgbuf); } int log_tree_diff_flush(struct rev_info *opt) ^ permalink raw reply related [flat|nested] 6+ messages in thread
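Two points from the message above can be tried in isolation with a small C sketch. It is not the patch's code: try_format, format_grow, has_room and has_room_wrapping are hypothetical names, try_format merely stands in for a formatter that follows the "return 0 on success, else the size wanted" convention described for interpolate(), and plain realloc() with abort() stands in for git's xrealloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a formatter with the new convention:
 * returns 0 when text (plus its NUL) fits in buf, otherwise the
 * number of bytes the buffer would need.
 */
static unsigned long try_format(char *buf, unsigned long space,
				const char *text)
{
	unsigned long need = strlen(text) + 1;

	if (need > space)
		return need;
	memcpy(buf, text, need);
	return 0;
}

/*
 * Caller hands in a pointer to its buffer pointer and a pointer to the
 * allocated size; both are updated when the buffer has to grow.
 * Returns the length of the formatted string.
 */
static unsigned long format_grow(char **buf_p, unsigned long *space_p,
				 const char *text)
{
	assert(buf_p && space_p && text);
	for (;;) {
		unsigned long need = try_format(*buf_p, *space_p, text);
		char *grown;

		if (!need)
			break;
		grown = realloc(*buf_p, need);
		if (!grown)
			abort(); /* assumption: xrealloc()-like "die on failure" */
		*buf_p = grown;
		*space_p = need;
	}
	return strlen(*buf_p);
}

/* With unsigned operands, limit - slop wraps around when slop > limit. */
static int has_room_wrapping(unsigned long cur, unsigned long limit,
			     unsigned long slop)
{
	return cur < limit - slop;
}

/* Moving slop to the other side avoids the subtraction altogether. */
static int has_room(unsigned long cur, unsigned long limit,
		    unsigned long slop)
{
	return cur + slop < limit;
}
```

A caller starts with a NULL buffer and zero size, as the converted call sites in the patch do, and frees the buffer when done. With cur = 5, limit = 10 and slop = 20, the subtracting form wrongly reports room because 10 - 20 wraps to a huge unsigned value, while the adding form reports none.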
* [PATCH] Extend --pretty=oneline to cover the first paragraph, so that an ugly commit message like this can be handled sanely.
  2007-06-11 7:34 ` Junio C Hamano
@ 2007-06-12 5:34 ` Junio C Hamano
  2007-06-12 6:17 ` [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file Eric Wong
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-06-12 5:34 UTC
  To: git

Currently, --pretty=oneline and --pretty=email (hence
format-patch) take and use only the first line of the commit log
message.  This changes them to:

 - Take the first paragraph, where the definition of the first
   paragraph is "skip all blank lines from the beginning, and
   then grab everything up to the next empty line".

 - Replace all line breaks with a whitespace.

This change would not affect a well-behaved commit message that
adheres to the convention of "single line summary, a blank line,
and then body of message", as its first paragraph always
consists of a single line.  Commit messages from a different
culture, such as the ones imported from CVS/SVN, can however get
chomped with the existing behaviour at the first linebreak in
the middle of a sentence right now, which would become much
easier to see with this change.

The Subject: and --pretty=oneline output would become very long
and unsightly for non-conforming commits, but their messages are
already ugly anyway, and this change at least avoids the loss of
information.

The Subject: line from a multi-line paragraph is folded using
RFC2822 line folding rules at the places where line breaks were
in the original.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is on top of the previous "Lift 16kB limit" clean-up.

   I haven't checked what mailinfo does when it unfolds a folded
   Subject: line yet, but it may have to be updated to match
   this, so that "format-patch --stdout | am" behaves sanely on
   commits with multi-line first paragraph.

commit.c | 396 +++++++++++++++++++++++++++++++++++++++++--------------------- 1 files changed, 262 insertions(+), 134 deletions(-) diff --git a/commit.c b/commit.c index d43a68e..e2fd9ba 100644 --- a/commit.c +++ b/commit.c @@ -529,6 +529,14 @@ static int add_rfc2047(char *buf, const char *line, int len, return bp - buf; } +static unsigned long bound_rfc2047(unsigned long len, const char *encoding) +{ + /* upper bound of q encoded string of length 'len' */ + unsigned long elen = strlen(encoding); + + return len * 3 + elen + 100; +} + static int add_user_info(const char *what, enum cmit_fmt fmt, char *buf, const char *line, enum date_mode dmode, const char *encoding) @@ -922,6 +930,224 @@ static long format_commit_message(const struct commit *commit, return strlen(*buf_p); } +#define ALLOC_GROW(buf, space, need) \ + do { \ + if ((space) < (need)) { \ + buf = xrealloc(buf, need); \ + space = need; \ + } \ + } while (0) + +static void pp_header(enum cmit_fmt fmt, + int abbrev, + enum date_mode dmode, + const char *encoding, + const struct commit *commit, + const char **msg_p, + unsigned long *len_p, + unsigned long *ofs_p, + char **buf_p, + unsigned long *space_p) +{ + int parents_shown = 0; + + for (;;) { + const char *line = *msg_p; + char *dst; + int linelen = get_one_line(*msg_p, *len_p); + unsigned long len; + + if (!linelen) + return; + *msg_p += linelen; + *len_p -= linelen; + + if (linelen == 1) + /* End of header */ + return; + + ALLOC_GROW(*buf_p, *space_p, linelen + *ofs_p + 20); + dst = *buf_p + *ofs_p; + + if (fmt == CMIT_FMT_RAW) { + memcpy(dst, line, linelen); + *ofs_p += linelen; + continue; + } + + if (!memcmp(line, "parent ", 7)) { + if (linelen != 48) + die("bad parent line in commit"); + continue; + } + + if (!parents_shown) { + struct commit_list *parent; + int num; + for (parent = commit->parents, num = 0; + parent; + parent = parent->next, num++) + ; + /* with enough slop */ + num = *ofs_p + num * 50 + 20; + ALLOC_GROW(*buf_p, *space_p, num); + 
dst = *buf_p + *ofs_p; + *ofs_p += add_merge_info(fmt, dst, commit, abbrev); + parents_shown = 1; + } + + /* + * MEDIUM == DEFAULT shows only author with dates. + * FULL shows both authors but not dates. + * FULLER shows both authors and dates. + */ + if (!memcmp(line, "author ", 7)) { + len = linelen; + if (fmt == CMIT_FMT_EMAIL) + len = bound_rfc2047(linelen, encoding); + ALLOC_GROW(*buf_p, *space_p, *ofs_p + len); + dst = *buf_p + *ofs_p; + *ofs_p += add_user_info("Author", fmt, dst, + line + 7, dmode, encoding); + } + + if (!memcmp(line, "committer ", 10) && + (fmt == CMIT_FMT_FULL || fmt == CMIT_FMT_FULLER)) { + len = linelen; + if (fmt == CMIT_FMT_EMAIL) + len = bound_rfc2047(linelen, encoding); + ALLOC_GROW(*buf_p, *space_p, *ofs_p + len); + dst = *buf_p + *ofs_p; + *ofs_p += add_user_info("Commit", fmt, dst, + line + 10, dmode, encoding); + } + } +} + +static void pp_title_line(enum cmit_fmt fmt, + const char **msg_p, + unsigned long *len_p, + unsigned long *ofs_p, + char **buf_p, + unsigned long *space_p, + int indent, + const char *subject, + const char *after_subject, + const char *encoding, + int plain_non_ascii) +{ + char *title; + unsigned long title_alloc, title_len; + unsigned long len; + + title_len = 0; + title_alloc = 80; + title = xmalloc(title_alloc); + for (;;) { + const char *line = *msg_p; + int linelen = get_one_line(line, *len_p); + *msg_p += linelen; + *len_p -= linelen; + + if (!linelen || is_empty_line(line, &linelen)) + break; + + if (title_alloc <= title_len + linelen + 2) { + title_alloc = title_len + linelen + 80; + title = xrealloc(title, title_alloc); + } + len = 0; + if (title_len) { + if (fmt == CMIT_FMT_EMAIL) { + len++; + title[title_len++] = '\n'; + } + len++; + title[title_len++] = ' '; + } + memcpy(title + title_len, line, linelen); + title_len += linelen; + } + + /* Enough slop for the MIME header and rfc2047 */ + len = bound_rfc2047(title_len, encoding)+ 1000; + if (subject) + len += strlen(subject); + if (after_subject) 
+ len += strlen(after_subject); + if (encoding) + len += strlen(encoding); + ALLOC_GROW(*buf_p, *space_p, title_len + *ofs_p + len); + + if (subject) { + len = strlen(subject); + memcpy(*buf_p + *ofs_p, subject, len); + *ofs_p += len; + *ofs_p += add_rfc2047(*buf_p + *ofs_p, + title, title_len, encoding); + } else { + memcpy(*buf_p + *ofs_p, title, title_len); + *ofs_p += title_len; + } + (*buf_p)[(*ofs_p)++] = '\n'; + if (plain_non_ascii) { + const char *header_fmt = + "MIME-Version: 1.0\n" + "Content-Type: text/plain; charset=%s\n" + "Content-Transfer-Encoding: 8bit\n"; + *ofs_p += snprintf(*buf_p + *ofs_p, + *space_p - *ofs_p, + header_fmt, encoding); + } + if (after_subject) { + len = strlen(after_subject); + memcpy(*buf_p + *ofs_p, after_subject, len); + *ofs_p += len; + } + free(title); + if (fmt == CMIT_FMT_EMAIL) { + ALLOC_GROW(*buf_p, *space_p, *ofs_p + 20); + (*buf_p)[(*ofs_p)++] = '\n'; + } +} + +static void pp_remainder(enum cmit_fmt fmt, + const char **msg_p, + unsigned long *len_p, + unsigned long *ofs_p, + char **buf_p, + unsigned long *space_p, + int indent) +{ + int first = 1; + for (;;) { + const char *line = *msg_p; + int linelen = get_one_line(line, *len_p); + *msg_p += linelen; + *len_p -= linelen; + + if (!linelen) + break; + + if (is_empty_line(line, &linelen)) { + if (first) + continue; + if (fmt == CMIT_FMT_SHORT) + break; + } + first = 0; + + ALLOC_GROW(*buf_p, *space_p, *ofs_p + linelen + indent + 20); + if (indent) { + memset(*buf_p + *ofs_p, ' ', indent); + *ofs_p += indent; + } + memcpy(*buf_p + *ofs_p, line, linelen); + *ofs_p += linelen; + (*buf_p)[(*ofs_p)++] = '\n'; + } +} + unsigned long pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit, unsigned long len, @@ -930,16 +1156,14 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, const char *after_subject, enum date_mode dmode) { - int hdr = 1, body = 0, seen_title = 0; unsigned long offset = 0; + unsigned long beginning_of_body; int indent = 4; - int 
parents_shown = 0; const char *msg = commit->buffer; int plain_non_ascii = 0; char *reencoded; const char *encoding; char *buf; - unsigned long space, slop; if (fmt == CMIT_FMT_USERFORMAT) return format_commit_message(commit, msg, buf_p, space_p); @@ -950,8 +1174,10 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, if (!encoding) encoding = "utf-8"; reencoded = logmsg_reencode(commit, encoding); - if (reencoded) + if (reencoded) { msg = reencoded; + len = strlen(reencoded); + } if (fmt == CMIT_FMT_ONELINE || fmt == CMIT_FMT_EMAIL) indent = 0; @@ -982,155 +1208,57 @@ unsigned long pretty_print_commit(enum cmit_fmt fmt, } } - space = *space_p; - buf = *buf_p; - - /* - * We do not want to repeatedly realloc below, so - * preallocate with enough slop to hold MIME headers, - * "Subject: " prefix, etc. - */ - slop = 1000; - if (subject) - slop += strlen(subject); - if (after_subject) - slop += strlen(after_subject); - if (space < strlen(msg) + slop) { - space = strlen(msg) + slop; - buf = xrealloc(buf, space); - *space_p = space; - *buf_p = buf; + pp_header(fmt, abbrev, dmode, encoding, + commit, &msg, &len, + &offset, buf_p, space_p); + if (fmt != CMIT_FMT_ONELINE && !subject) { + ALLOC_GROW(*buf_p, *space_p, offset + 20); + (*buf_p)[offset++] = '\n'; } + /* Skip excess blank lines at the beginning of body, if any... 
*/ for (;;) { - const char *line = msg; int linelen = get_one_line(msg, len); - + int ll = linelen; if (!linelen) break; - - /* 20 would cover indent and leave us some slop */ - if (offset + linelen + 20 > space) { - space = offset + linelen + 20; - buf = xrealloc(buf, space); - *buf_p = buf; - *space_p = space; - } - + if (!is_empty_line(msg, &ll)) + break; msg += linelen; len -= linelen; - if (hdr) { - if (linelen == 1) { - hdr = 0; - if ((fmt != CMIT_FMT_ONELINE) && !subject) - buf[offset++] = '\n'; - continue; - } - if (fmt == CMIT_FMT_RAW) { - memcpy(buf + offset, line, linelen); - offset += linelen; - continue; - } - if (!memcmp(line, "parent ", 7)) { - if (linelen != 48) - die("bad parent line in commit"); - continue; - } - - if (!parents_shown) { - offset += add_merge_info(fmt, buf + offset, - commit, abbrev); - parents_shown = 1; - continue; - } - /* - * MEDIUM == DEFAULT shows only author with dates. - * FULL shows both authors but not dates. - * FULLER shows both authors and dates. - */ - if (!memcmp(line, "author ", 7)) - offset += add_user_info("Author", fmt, - buf + offset, - line + 7, - dmode, - encoding); - if (!memcmp(line, "committer ", 10) && - (fmt == CMIT_FMT_FULL || fmt == CMIT_FMT_FULLER)) - offset += add_user_info("Commit", fmt, - buf + offset, - line + 10, - dmode, - encoding); - continue; - } + } - if (!subject) - body = 1; + /* These formats treat the title line specially. 
*/ + if (fmt == CMIT_FMT_ONELINE + || fmt == CMIT_FMT_EMAIL) + pp_title_line(fmt, &msg, &len, &offset, + buf_p, space_p, indent, + subject, after_subject, encoding, + plain_non_ascii); - if (is_empty_line(line, &linelen)) { - if (!seen_title) - continue; - if (!body) - continue; - if (subject) - continue; - if (fmt == CMIT_FMT_SHORT) - break; - } + beginning_of_body = offset; + if (fmt != CMIT_FMT_ONELINE) + pp_remainder(fmt, &msg, &len, &offset, + buf_p, space_p, indent); - seen_title = 1; - if (subject) { - int slen = strlen(subject); - memcpy(buf + offset, subject, slen); - offset += slen; - offset += add_rfc2047(buf + offset, line, linelen, - encoding); - } - else { - memset(buf + offset, ' ', indent); - memcpy(buf + offset + indent, line, linelen); - offset += linelen + indent; - } - buf[offset++] = '\n'; - if (fmt == CMIT_FMT_ONELINE) - break; - if (subject && plain_non_ascii) { - int sz; - char header[512]; - const char *header_fmt = - "MIME-Version: 1.0\n" - "Content-Type: text/plain; charset=%s\n" - "Content-Transfer-Encoding: 8bit\n"; - sz = snprintf(header, sizeof(header), header_fmt, - encoding); - if (sizeof(header) < sz) - die("Encoding name %s too long", encoding); - memcpy(buf + offset, header, sz); - offset += sz; - } - if (after_subject) { - int slen = strlen(after_subject); - if (slen > space - offset - 1) - slen = space - offset - 1; - memcpy(buf + offset, after_subject, slen); - offset += slen; - after_subject = NULL; - } - subject = NULL; - } - while (offset && isspace(buf[offset-1])) + while (offset && isspace((*buf_p)[offset-1])) offset--; + + ALLOC_GROW(*buf_p, *space_p, offset + 20); + buf = *buf_p; + /* Make sure there is an EOLN for the non-oneline case */ if (fmt != CMIT_FMT_ONELINE) buf[offset++] = '\n'; + /* - * make sure there is another EOLN to separate the headers from whatever - * body the caller appends if we haven't already written a body + * The caller may append additional body text in e-mail + * format. 
Make sure we did not strip the blank line + * between the header and the body. */ - if (fmt == CMIT_FMT_EMAIL && !body) + if (fmt == CMIT_FMT_EMAIL && offset <= beginning_of_body) buf[offset++] = '\n'; buf[offset] = '\0'; - free(reencoded); return offset; } -- 1.5.2.1.1021.g1c0b0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
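The first-paragraph rule from the commit message above ("skip all blank lines from the beginning, and then grab everything up to the next empty line", with line breaks replaced by a space) can be sketched on its own in C. This is a simplified illustration of the --pretty=oneline case only, not the patch's pp_title_line(): first_paragraph_oneline and trimmed_len are hypothetical names, and the RFC2822 folding used for --pretty=email is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Length of line[0..len) once trailing whitespace (incl. newline) is cut. */
static size_t trimmed_len(const char *line, size_t len)
{
	while (len && isspace((unsigned char)line[len - 1]))
		len--;
	return len;
}

/*
 * Return a malloc'd copy of the first paragraph of msg joined into a
 * single line, or NULL if allocation fails.  The caller frees it.
 */
static char *first_paragraph_oneline(const char *msg)
{
	char *out;
	size_t out_len = 0;
	int in_para = 0;

	assert(msg != NULL);
	/* The joined title is never longer than the input itself. */
	out = malloc(strlen(msg) + 1);
	if (!out)
		return NULL;

	while (*msg) {
		const char *line = msg;
		const char *eol = strchr(line, '\n');
		size_t raw = eol ? (size_t)(eol - line) + 1 : strlen(line);
		size_t len = trimmed_len(line, raw);

		msg += raw;
		if (!len) {
			if (in_para)
				break;		/* blank line ends the paragraph */
			continue;		/* skip leading blank lines */
		}
		if (in_para)
			out[out_len++] = ' ';	/* line break becomes a space */
		memcpy(out + out_len, line, len);
		out_len += len;
		in_para = 1;
	}
	out[out_len] = '\0';
	return out;
}
```

For a conventional message with a one-line summary the result is the same as taking the first line; only messages whose first paragraph wraps are affected.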
* Re: [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file
  2007-06-11  7:34   ` Junio C Hamano
  2007-06-12  5:34     ` [PATCH] Extend --pretty=oneline to cover the first paragraph, so that an ugly commit message like this can be handled sanely Junio C Hamano
@ 2007-06-12  6:17     ` Eric Wong
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Wong @ 2007-06-12  6:17 UTC
To: Junio C Hamano; +Cc: Sam Vilain, git

Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
>
> > Sam Vilain <sam@vilain.net> wrote:
> >> This saves a bit of time when rebuilding the git-svn index.
> >
> > Does git-log still have the 16k buffer limit?  If so then we can't use
> > it because commit messages over 16k will be truncated and the git-svn-id
> > line will not show up.  Also, if that limit is removed I'd prefer to
> > just add --pretty=raw to rev-list because git-log is still porcelain and
> > more likely to change.
>
> How about this?  It passes the test suite, but other than that
> hasn't seen much test yet.  I tried to be careful, but sanity
> checking by extra sets of eyeballs would be needed.

The patch looks and runs alright to me, but then again I haven't looked
at the C portions of git in a while :x

I expected the malloc/free overhead to be much greater, but it's hardly
noticeable (nor measurable with /usr/bin/time or bash built-in time).
There are just a handful more pagefaults measured with /usr/bin/time,
but the runtime performance is neck-and-neck with/without the patch.

Maybe glibc (2.3.6 on x86 Debian Etch) and Linux (2.6.18) are just doing
a very good job with memory allocation...  I wonder how well it runs on
other platforms.

-- 
Eric Wong
* a bunch of outstanding updates
@ 2007-06-30  8:56 Sam Vilain
  2007-06-30  8:56 ` [PATCH] repack: improve documentation on -a option Sam Vilain
  0 siblings, 1 reply; 6+ messages in thread
From: Sam Vilain @ 2007-06-30  8:56 UTC
To: Junio C Hamano; +Cc: git

Following up to this e-mail are a whole load of outstanding feature
requests of mine.

These changes are relatively mundane:

 * repack: improve documentation on -a option
 * git-remote: document -n
 * git-remote: allow 'git-remote fetch' as a synonym for 'git fetch'
 * git-svn: use git-log rather than rev-list | xargs cat-file
 * git-svn: cache max revision in rev_db databases

This one will impact on the version displayed by "git --version", but
I think this is for the better:

 * GIT-VERSION-GEN: don't convert - delimiter to .'s

These ones are really only very minor updates based on feedback so
far:

 * git-merge-ff: fast-forward only merge
 * git-mergetool: add support for ediff

This one is just the previously posted hook script put into the
templates directory, let me know if you'd rather I reshaped it to go
into contrib/hooks:

 * contrib/hooks: add post-update hook for updating working copy

This one probably needs a bit more consideration and review, could
perhaps sit on pu.

 * git-repack: generational repacking (and example hook script)
* [PATCH] repack: improve documentation on -a option
  2007-06-30  8:56 a bunch of outstanding updates Sam Vilain
@ 2007-06-30  8:56 ` Sam Vilain
  2007-06-30  8:56   ` [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file Sam Vilain
  0 siblings, 1 reply; 6+ messages in thread
From: Sam Vilain @ 2007-06-30  8:56 UTC
To: Junio C Hamano; +Cc: git, Sam Vilain

Some minor enhancements to the git-repack manual page.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 Documentation/git-repack.txt |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index c33a512..be8e5f8 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -14,7 +14,8 @@ DESCRIPTION
 -----------
 
 This script is used to combine all objects that do not currently
-reside in a "pack", into a pack.
+reside in a "pack", into a pack.  It can also be used to re-organise
+existing packs into a single, more efficient pack.
 
 A pack is a collection of objects, individually compressed, with
 delta compression applied, stored in a single file, with an
@@ -28,11 +29,13 @@ OPTIONS
 
 -a::
 	Instead of incrementally packing the unpacked objects,
-	pack everything available into a single pack.
+	pack everything referenced into a single pack.
 	Especially useful when packing a repository that is used
-	for private development and there is no need to worry
-	about people fetching via dumb file transfer protocols
-	from it.  Use with '-d'.
+	for private development and there no need to worry
+	about people fetching via dumb protocols from it.  Use
+	with '-d'.  This will clean up the objects that `git prune`
+	leaves behind, but `git fsck-objects --full` shows as
+	dangling.
 
 -d::
 	After packing, if the newly created packs make some
-- 
1.5.2.1.1131.g3b90
* [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file
  2007-06-30  8:56 ` [PATCH] repack: improve documentation on -a option Sam Vilain
@ 2007-06-30  8:56   ` Sam Vilain
  0 siblings, 0 replies; 6+ messages in thread
From: Sam Vilain @ 2007-06-30  8:56 UTC
To: Junio C Hamano; +Cc: git, Sam Vilain, Sam Vilain

From: Sam Vilain <sam@vilain.net>

This saves a bit of time when rebuilding the git-svn index.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 git-svn.perl |   36 ++++++++++++++++++++++--------------
 1 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 3033b50..556cd7d 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -782,12 +782,12 @@ sub read_repo_config {
 
 sub extract_metadata {
 	my $id = shift or return (undef, undef, undef);
-	my ($url, $rev, $uuid) = ($id =~ /^git-svn-id:\s(\S+?)\@(\d+)
+	my ($url, $rev, $uuid) = ($id =~ /^\s*git-svn-id:\s+(.*)\@(\d+)
 							\s([a-f\d\-]+)$/x);
 	if (!defined $rev || !$uuid || !$url) {
 		# some of the original repositories I made had
 		# identifiers like this:
-		($rev, $uuid) = ($id =~/^git-svn-id:\s(\d+)\@([a-f\d\-]+)/);
+		($rev, $uuid) = ($id =~/^\s*git-svn-id:\s(\d+)\@([a-f\d\-]+)/);
 	}
 	return ($url, $rev, $uuid);
 }
@@ -799,10 +799,16 @@ sub cmt_metadata {
 
 sub working_head_info {
 	my ($head, $refs) = @_;
-	my ($fh, $ctx) = command_output_pipe('rev-list', $head);
-	while (my $hash = <$fh>) {
-		chomp($hash);
-		my ($url, $rev, $uuid) = cmt_metadata($hash);
+	my ($fh, $ctx) = command_output_pipe('log', $head);
+	my $hash;
+	while (<$fh>) {
+		if ( m{^commit ($::sha1)$} ) {
+			unshift @$refs, $hash if $hash and $refs;
+			$hash = $1;
+			next;
+		}
+		next unless s{^\s*(git-svn-id:)}{$1};
+		my ($url, $rev, $uuid) = extract_metadata($_);
 		if (defined $url && defined $rev) {
 			if (my $gs = Git::SVN->find_by_url($url)) {
 				my $c = $gs->rev_db_get($rev);
@@ -812,7 +818,6 @@ sub working_head_info {
 				}
 			}
 		}
-		unshift @$refs, $hash if $refs;
 	}
 	command_close_pipe($fh, $ctx);
 	(undef, undef, undef, undef);
@@ -2019,16 +2024,19 @@ sub rebuild {
 		return;
 	}
 	print "Rebuilding $db_path ...\n";
-	my ($rev_list, $ctx) = command_output_pipe("rev-list", $self->refname);
+	my ($log, $ctx) = command_output_pipe("log", $self->refname);
 	my $latest;
 	my $full_url = $self->full_url;
 	remove_username($full_url);
 	my $svn_uuid;
-	while (<$rev_list>) {
-		chomp;
-		my $c = $_;
-		die "Non-SHA1: $c\n" unless $c =~ /^$::sha1$/o;
-		my ($url, $rev, $uuid) = ::cmt_metadata($c);
+	my $c;
+	while (<$log>) {
+		if ( m{^commit ($::sha1)$} ) {
+			$c = $1;
+			next;
+		}
+		next unless s{^\s*(git-svn-id:)}{$1};
+		my ($url, $rev, $uuid) = ::extract_metadata($_);
 		remove_username($url);
 
 		# ignore merges (from set-tree)
@@ -2046,7 +2054,7 @@ sub rebuild {
 		$self->rev_db_set($rev, $c);
 		print "r$rev = $c\n";
 	}
-	command_close_pipe($rev_list, $ctx);
+	command_close_pipe($log, $ctx);
 	print "Done rebuilding $db_path\n";
 }
 
-- 
1.5.2.1.1131.g3b90
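The matching rule behind the new extract_metadata pattern can be sketched in C, the language of the other patch in this thread. This is an illustration only, not a port of git-svn: `struct svn_id` and `parse_git_svn_id` are hypothetical names, the fixed buffer sizes are arbitrary, and the older "rev@uuid" fallback form is left out. Because the URL part is now matched greedily, the revision follows the last '@' on the line, so a URL that itself contains '@' (a username, for instance) still parses.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed git-svn-id line. */
struct svn_id {
	char url[256];
	long rev;
	char uuid[64];
};

/*
 * Parse "<ws>git-svn-id: <url>@<rev> <uuid>" as it appears, indented,
 * in `git log` output.  Returns 1 on success, 0 if the line does not
 * match.  Sketch only: sizes and error handling are assumptions.
 */
static int parse_git_svn_id(const char *line, struct svn_id *out)
{
	static const char tag[] = "git-svn-id:";

	while (*line == ' ' || *line == '\t')
		line++;
	if (strncmp(line, tag, sizeof(tag) - 1) != 0)
		return 0;
	line += sizeof(tag) - 1;
	if (*line != ' ' && *line != '\t')
		return 0;	/* at least one blank after the tag */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Greedy URL: split at the last '@' on the line. */
	const char *at = strrchr(line, '@');
	if (!at || at == line)
		return 0;
	size_t url_len = (size_t)(at - line);
	if (url_len >= sizeof(out->url))
		return 0;

	if (!isdigit((unsigned char)at[1]))
		return 0;
	char *end;
	long rev = strtol(at + 1, &end, 10);
	if (*end != ' ')
		return 0;
	end++;

	/* The UUID is lowercase hex digits and dashes, up to end of line. */
	size_t uuid_len = strspn(end, "0123456789abcdef-");
	if (uuid_len == 0 || uuid_len >= sizeof(out->uuid))
		return 0;
	if (end[uuid_len] != '\0' && end[uuid_len] != '\n')
		return 0;

	memcpy(out->url, line, url_len);
	out->url[url_len] = '\0';
	out->rev = rev;
	memcpy(out->uuid, end, uuid_len);
	out->uuid[uuid_len] = '\0';
	return 1;
}
```

A caller would remember the hash from the most recent "commit <sha1>" line and pair it with the next line this function accepts, which is what the loops in the patch do with $hash and $c.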