* [PATCH] git-pickaxe: blame rewritten.
@ 2006-10-12 8:52 Junio C Hamano
2006-10-12 15:58 ` Linus Torvalds
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-12 8:52 UTC (permalink / raw)
To: git
Currently it does what git-blame does, but only faster.
More importantly, its internal structure is designed to support
content movement (aka cut-and-paste) more easily by allowing
more than one paths to be taken from the same commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
I really hate to do this immediately after writing obituary for
annotate, but I had a solid 24-hour to work on git, which is a
rare opportunity for me these days, so here it is.
For 160+ *.c files in git.git repository, running git-blame and
git-pickaxe with "-n -f" options for all of them takes the
following:
*** blame ***
141.43user 1.41system 2:22.94elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+438667minor)pagefaults 0swaps
*** pickaxe ***
78.99user 2.22system 1:21.22elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+397307minor)pagefaults 0swaps
I have given only cursory check to its output (some files are
blamed slightly differently from how git-blame does), but it
appears that there is no major breakage. You can see for
example try annotating builtin-apply.c starting from v1.4.0;
there are two differences, which pickaxe assigns blame to older
commits and both of them seem to be sensible. This exercise
also revealed some mistakes I made earlier (number of same
patches were cherry-picked in separate branches and then later
merged: "gitk master --not e4c9327a 15b4d577 -- pack-objects.c"
would show an example of such a breakage.
Documentation/git-pickaxe.txt | 104 +++++
Documentation/git.txt | 3 +
Makefile | 1 +
builtin-pickaxe.c | 952 +++++++++++++++++++++++++++++++++++++++++
builtin.h | 1 +
git.c | 1 +
t/t8003-pickaxe.sh | 9 +
7 files changed, 1071 insertions(+), 0 deletions(-)
diff --git a/Documentation/git-pickaxe.txt b/Documentation/git-pickaxe.txt
new file mode 100644
index 0000000..7f30cdf
--- /dev/null
+++ b/Documentation/git-pickaxe.txt
@@ -0,0 +1,104 @@
+git-pickaxe(1)
+==============
+
+NAME
+----
+git-pickaxe - Show what revision and author last modified each line of a file
+
+SYNOPSIS
+--------
+'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] <file> [<rev>]
+
+DESCRIPTION
+-----------
+
+Annotates each line in the given file with information from the revision which
+last modified the line. Optionally, start annotating from the given revision.
+
+Also it can limit the range of lines annotated.
+
+This report doesn't tell you anything about lines which have been deleted or
+replaced; you need to use a tool such as gitlink:git-diff[1] or the "pickaxe"
+interface briefly mentioned in the following paragraph.
+
+Apart from supporting file annotation, git also supports searching the
+development history for when a code snippet occured in a change. This makes it
+possible to track when a code snippet was added to a file, moved or copied
+between files, and eventually deleted or replaced. It works by searching for
+a text string in the diff. A small example:
+
+-----------------------------------------------------------------------------
+$ git log --pretty=oneline -S'blame_usage'
+5040f17eba15504bad66b14a645bddd9b015ebb7 blame -S <ancestry-file>
+ea4c7f9bf69e781dd0cd88d2bccb2bf5cc15c9a7 git-blame: Make the output
+-----------------------------------------------------------------------------
+
+OPTIONS
+-------
+-c, --compatibility::
+ Use the same output mode as gitlink:git-annotate[1] (Default: off).
+
+-L n,m::
+ Annotate only the specified line range (lines count from 1).
+
+-l, --long::
+ Show long rev (Default: off).
+
+-t, --time::
+ Show raw timestamp (Default: off).
+
+-S, --rev-file <revs-file>::
+ Use revs from revs-file instead of calling gitlink:git-rev-list[1].
+
+-f, --show-name::
+ Show filename in the original commit. By default
+ filename is shown if there is any line that came from a
+ file with different name, due to rename detection.
+
+-n, --show-number::
+ Show line number in the original commit (Default: off).
+
+-p, --porcelain::
+ Show in a format designed for machine consumption.
+
+-h, --help::
+ Show help message.
+
+
+THE PORCELAIN FORMAT
+--------------------
+
+In this format, each line is output after a header; the
+header at the minumum has the first line which has:
+
+- 40-byte SHA-1 of the commit the line is attributed to;
+- the line number of the line in the original file;
+- the line number of the line in the final file;
+- on a line that starts a group of line from a different
+ commit than the previous one, the number of lines in this
+ group. On subsequent lines this field is absent.
+
+This header line is followed by the following information
+at least once for each commit:
+
+- author name ("author"), email ("author-mail"), time
+ ("author-time"), and timezone ("author-tz"); similarly
+ for committer.
+- filename in the commit the line is attributed to.
+- the first line of the commit log message ("summary").
+
+The contents of the actual line is output after the above
+header, prefixed by a TAB. This is to allow adding more
+header elements later.
+
+SEE ALSO
+--------
+gitlink:git-blame[1]
+
+AUTHOR
+------
+Written by Junio C Hamano <junkio@cox.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 3af6fc6..7074e32 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -430,6 +430,9 @@ gitlink:git-annotate[1]::
gitlink:git-blame[1]::
Blame file lines on commits.
+gitlink:git-pickaxe[1]::
+ Find out where each line in a file came from.
+
gitlink:git-check-ref-format[1]::
Make sure ref name is well formed.
diff --git a/Makefile b/Makefile
index 2c7c338..bd99550 100644
--- a/Makefile
+++ b/Makefile
@@ -288,6 +288,7 @@ BUILTIN_OBJS = \
builtin-mv.o \
builtin-name-rev.o \
builtin-pack-objects.o \
+ builtin-pickaxe.o \
builtin-prune.o \
builtin-prune-packed.o \
builtin-push.o \
diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
new file mode 100644
index 0000000..0e26308
--- /dev/null
+++ b/builtin-pickaxe.c
@@ -0,0 +1,952 @@
+/*
+ * Pickaxe
+ *
+ * Copyright (c) 2006, Junio C Hamano
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "blob.h"
+#include "commit.h"
+#include "tree-walk.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "xdiff-interface.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+static char pickaxe_usage[] =
+"git-pickaxe [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] file [commit]\n"
+" -c, --compatibility Use the same output mode as git-annotate (Default: off)\n"
+" -l, --long Show long commit SHA1 (Default: off)\n"
+" -t, --time Show raw timestamp (Default: off)\n"
+" -f, --show-name Show original filename (Default: auto)\n"
+" -n, --show-number Show original linenumber (Default: off)\n"
+" -p, --porcelain Show in a format designed for machine consumption\n"
+" -L n,m Process only line range n,m, counting from 1\n"
+" -S revs-file Use revisions from revs-file instead of calling git-rev-list\n";
+
+static int longest_file;
+static int longest_author;
+static int max_orig_digits;
+static int max_digits;
+
+#define DEBUG 0
+/*
+ * One blob in a commit
+ */
+struct origin {
+ struct commit *commit;
+ unsigned char blob_sha1[20];
+ char metainfo_given;
+ char path[FLEX_ARRAY];
+};
+
+struct blame_entry {
+ /* the first line of this group in the final image;
+ * internally all line numbers are 0 based.
+ */
+ int lno;
+
+ /* how many lines this group has */
+ int num_lines;
+
+ /* the commit that introduced this group into the final image */
+ struct origin *suspect;
+
+ /* true if the suspect is truly guilty; false while we have not
+ * checked if the group came from one of its parents.
+ */
+ char guilty;
+
+ /* the line number of the first line of this group in the
+ * suspect's file; internally all line numbers are 0 based.
+ */
+ int s_lno;
+};
+
+struct scoreboard {
+ /* the final commit (i.e. where we started digging from) */
+ struct commit *final;
+
+ const char *path;
+
+ /* the contents in the final; pointed into by buf pointers of
+ * blame_entries
+ */
+ const char *final_buf;
+ unsigned long final_buf_size;
+
+ /* list of blames */
+ struct blame_entry **entries;
+ int num_entries;
+};
+
+static int blame_entry_sort(const void *a_, const void *b_)
+{
+ int a = (*(struct blame_entry **)a_)->lno;
+ int b = (*(struct blame_entry **)b_)->lno;
+ return a - b;
+}
+
+static void coalesce(struct scoreboard *sb)
+{
+ int i, o, num;
+ struct blame_entry **ent;
+
+ qsort(sb->entries, sb->num_entries, sizeof(struct blame_entry *),
+ blame_entry_sort);
+ ent = sb->entries;
+ num = sb->num_entries;
+
+ /*
+ * Look at i, and see if it is adjacent to the last one at (o-1);
+ * if so, extend the last one into a larger group and drop the
+ * current one at i. Copy the entry to remove gaps made by the
+ * coalescing process as we go.
+ */
+ for (o = 0, i = 0; i < num; i++) {
+ if (o &&
+ ent[o-1]->suspect == ent[i]->suspect &&
+ ent[o-1]->s_lno + ent[o-1]->num_lines == ent[i]->s_lno) {
+ ent[o-1]->num_lines += ent[i]->num_lines;
+ free(ent[i]);
+ continue;
+ }
+ if (o != i)
+ ent[o] = ent[i];
+ o++;
+ }
+ sb->num_entries = o;
+}
+
+static void free_origin(struct origin *o)
+{
+ free(o);
+}
+
+static struct origin *find_origin(struct scoreboard *sb,
+ struct commit *commit,
+ const char *path)
+{
+ int i, num;
+ struct blame_entry **ent;
+ struct origin *o;
+ unsigned mode;
+ char type[10];
+
+ ent = sb->entries;
+ num = sb->num_entries;
+ for (i = 0; i < num; i++) {
+ if (ent[i]->suspect->commit == commit &&
+ !strcmp(ent[i]->suspect->path, path))
+ return ent[i]->suspect;
+ }
+
+ o = xcalloc(1, sizeof(*o) + strlen(path) + 1);
+ o->commit = commit;
+ strcpy(o->path, path);
+ if (get_tree_entry(commit->object.sha1, path, o->blob_sha1, &mode))
+ goto err_out;
+ if (sha1_object_info(o->blob_sha1, type, NULL) ||
+ strcmp(type, blob_type))
+ goto err_out;
+ return o;
+ err_out:
+ free_origin(o);
+ return NULL;
+}
+
+static struct origin *find_rename(struct scoreboard *sb,
+ struct commit *parent,
+ struct origin *origin)
+{
+ struct origin *porigin = NULL;
+ struct diff_options diff_opts;
+ int i;
+ const char *paths[1];
+
+ diff_setup(&diff_opts);
+ diff_opts.recursive = 1;
+ diff_opts.detect_rename = DIFF_DETECT_RENAME;
+ diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+ paths[0] = NULL;
+ diff_tree_setup_paths(paths, &diff_opts);
+ if (diff_setup_done(&diff_opts) < 0)
+ die("diff-setup");
+ diff_tree_sha1(origin->commit->tree->object.sha1,
+ parent->tree->object.sha1,
+ "", &diff_opts);
+ diffcore_std(&diff_opts);
+
+ for (i = 0; i < diff_queued_diff.nr; i++) {
+ struct diff_filepair *p = diff_queued_diff.queue[i];
+ if (p->status == 'R' && !strcmp(p->one->path, origin->path)) {
+ porigin = find_origin(sb, parent, p->two->path);
+ break;
+ }
+ }
+ diff_flush(&diff_opts);
+ return porigin;
+}
+
+struct chunk {
+ /* line number in postimage; up to but not including this
+ * line is the same as preimage
+ */
+ int same;
+
+ /* preimage line number after this chunk */
+ int p_next;
+
+ /* postimage line number after this chunk */
+ int t_next;
+};
+
+struct patch {
+ struct chunk *chunks;
+ int num;
+};
+
+struct blame_diff_state {
+ struct xdiff_emit_state xm;
+ struct patch *ret;
+};
+
+static void process_u0_diff(void *state_, char *line, unsigned long len)
+{
+ struct blame_diff_state *state = state_;
+ struct chunk *chunk;
+ int off1, off2, len1, len2;
+
+if(DEBUG) fprintf(stderr, "%.*s", (int) len, line);
+
+ if (len < 4 || line[0] != '@' || line[1] != '@')
+ return;
+
+ state->ret->num++;
+ state->ret->chunks = xrealloc(state->ret->chunks,
+ sizeof(struct chunk) * state->ret->num);
+ chunk = &state->ret->chunks[state->ret->num - 1];
+
+ if (parse_hunk_header(line, len,
+ &off1, &len1,
+ &off2, &len2)) {
+ state->ret->num--;
+ return;
+ }
+
+ /* Line numbers in patch output are one based. */
+ off1--;
+ off2--;
+
+ chunk->same = len2 ? off2 : (off2 + 1);
+
+ chunk->p_next = off1 + (len1 ? len1 : 1);
+ chunk->t_next = chunk->same + len2;
+}
+
+static struct patch *get_patch(struct origin *parent,
+ struct origin *origin)
+{
+ struct blame_diff_state state;
+ xpparam_t xpp;
+ xdemitconf_t xecfg;
+ mmfile_t file_p, file_o;
+ xdemitcb_t ecb;
+ char type[10];
+
+if(DEBUG) fprintf(stderr, "get patch %.8s %.8s\n",
+ sha1_to_hex(parent->commit->object.sha1),
+ sha1_to_hex(origin->commit->object.sha1));
+
+ file_p.ptr = read_sha1_file(parent->blob_sha1, type,
+ (unsigned long *) &file_p.size);
+ file_o.ptr = read_sha1_file(origin->blob_sha1, type,
+ (unsigned long *) &file_o.size);
+ if (!file_p.ptr || !file_o.ptr)
+ return NULL;
+
+ xpp.flags = XDF_NEED_MINIMAL;
+ xecfg.ctxlen = 0;
+ xecfg.flags = 0;
+ ecb.outf = xdiff_outf;
+ ecb.priv = &state;
+ memset(&state, 0, sizeof(state));
+ state.xm.consume = process_u0_diff;
+ state.ret = xmalloc(sizeof(struct patch));
+ state.ret->chunks = NULL;
+ state.ret->num = 0;
+
+ xdl_diff(&file_p, &file_o, &xpp, &xecfg, &ecb);
+ return state.ret;
+}
+
+static void free_patch(struct patch *p)
+{
+ free(p->chunks);
+ free(p);
+}
+
+static void add_blame_entry(struct scoreboard *sb, struct blame_entry *e)
+{
+ sb->entries = xrealloc(sb->entries,
+ sizeof(struct blame_entry *) *
+ (sb->num_entries + 1));
+ sb->entries[sb->num_entries++] = e;
+}
+
+static void blame_overlap(struct scoreboard *sb, struct blame_entry *e,
+ int tlno, int plno, int same,
+ struct origin *parent)
+{
+ /* it is known that lines between tlno to same
+ * came from parent, and e has an overlap with that range.
+ * it also is known that parent's line plno corresponds to
+ * e's line tlno.
+ *
+ * <---- e ----->
+ * <------>
+ * <------------>
+ * <------------>
+ * <------------------>
+ *
+ * Potentially we need to split e into three parts; before
+ * this chunk, the chunk to be blamed for parent, and after
+ * that portion.
+ *
+ * When the function splits the blame_entry, it pushes the part
+ * that needs to be re-processed at the end of the queue.
+ */
+ int chunk_end_lno;
+ struct blame_entry split[3], *new_entry;
+ memset(split, 0, sizeof(split));
+
+ split[1].suspect = parent;
+ if (e->s_lno < tlno) {
+ /* there is a pre-chunk part not blamed on parent */
+ split[0].suspect = e->suspect;
+ split[0].lno = e->lno;
+ split[0].s_lno = e->s_lno;
+ split[0].num_lines = tlno - e->s_lno;
+ split[1].lno = e->lno + tlno - e->s_lno;
+ split[1].s_lno = plno;
+ }
+ else {
+ split[1].lno = e->lno;
+ split[1].s_lno = plno + (e->s_lno - tlno);
+ }
+
+ if (same < e->s_lno + e->num_lines) {
+ /* there is a post-chunk part not blamed on parent */
+ split[2].suspect = e->suspect;
+ split[2].lno = e->lno + (same - e->s_lno);
+ split[2].s_lno = e->s_lno + (same - e->s_lno);
+ split[2].num_lines = e->s_lno + e->num_lines - same;
+ chunk_end_lno = split[2].lno;
+ }
+ else
+ chunk_end_lno = e->lno + e->num_lines;
+ split[1].num_lines = chunk_end_lno - split[1].lno;
+
+ if (split[0].suspect && split[2].suspect) {
+ /* we need to split e into two and add another for parent */
+ memcpy(e, &(split[0]), sizeof(struct blame_entry));
+
+ new_entry = xmalloc(sizeof(*new_entry));
+ memcpy(new_entry, &(split[2]), sizeof(struct blame_entry));
+ add_blame_entry(sb, new_entry);
+
+ new_entry = xmalloc(sizeof(*new_entry));
+ memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));
+ add_blame_entry(sb, new_entry);
+ }
+ else if (!split[0].suspect && !split[2].suspect)
+ /* parent covers the entire area */
+ memcpy(e, &(split[1]), sizeof(struct blame_entry));
+ else if (split[0].suspect) {
+ memcpy(e, &(split[0]), sizeof(struct blame_entry));
+
+ new_entry = xmalloc(sizeof(*new_entry));
+ memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));
+ add_blame_entry(sb, new_entry);
+ }
+ else {
+ new_entry = xmalloc(sizeof(*new_entry));
+ memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));
+ add_blame_entry(sb, new_entry);
+
+ memcpy(e, &(split[2]), sizeof(struct blame_entry));
+ }
+}
+
+static void blame_chunk(struct scoreboard *sb,
+ int tlno, int plno, int same,
+ struct origin *target, struct origin *parent)
+{
+ int i;
+ for (i = 0; i < sb->num_entries; i++) {
+ struct blame_entry *e = sb->entries[i];
+ if (e->guilty || e->suspect != target)
+ continue;
+ if (same <= e->s_lno)
+ continue;
+ if (tlno < e->s_lno + e->num_lines)
+ blame_overlap(sb, e, tlno, plno, same, parent);
+ }
+}
+
+static int pass_blame_to_parent(struct scoreboard *sb,
+ struct origin *target,
+ struct origin *parent)
+{
+ int i;
+ struct patch *patch;
+ int plno, tlno, last_in_target = -1;
+
+ for (i = 0; i < sb->num_entries; i++) {
+ struct blame_entry *e = sb->entries[i];
+ if (e->guilty || e->suspect != target)
+ continue;
+ if (last_in_target < e->s_lno + e->num_lines)
+ last_in_target = e->s_lno + e->num_lines;
+ }
+ patch = get_patch(parent, target);
+ plno = tlno = 0;
+ for (i = 0; i < patch->num; i++) {
+ struct chunk *chunk = &patch->chunks[i];
+
+if(DEBUG) fprintf(stderr,
+ "plno = %d, tlno = %d, "
+ "same as parent up to %d, resync %d and %d\n",
+ plno, tlno,
+ chunk->same, chunk->p_next, chunk->t_next);
+ blame_chunk(sb, tlno, plno, chunk->same, target, parent);
+ plno = chunk->p_next;
+ tlno = chunk->t_next;
+ }
+ /* rest (i.e. anything above tlno) are the same as parent */
+ blame_chunk(sb, tlno, plno, last_in_target, target, parent);
+
+ free_patch(patch);
+ return 0;
+}
+
+#define MAXPARENT 16
+static void pass_blame(struct scoreboard *sb, struct origin *origin)
+{
+ int i, parent_ix;
+ struct commit *commit = origin->commit;
+ struct commit_list *parent;
+ struct origin *parent_origin[MAXPARENT], *porigin;
+
+ if (parse_commit(commit))
+ exit(1);
+
+ memset(parent_origin, 0, sizeof(parent_origin));
+ for (parent_ix = 0, parent = commit->parents;
+ parent_ix < MAXPARENT && parent;
+ parent = parent->next, parent_ix++) {
+ if (parse_commit(parent->item))
+ continue;
+ porigin = find_origin(sb, parent->item, origin->path);
+ if (!porigin)
+ porigin = find_rename(sb, parent->item, origin);
+ if (!porigin)
+ continue;
+ if (!hashcmp(porigin->blob_sha1, origin->blob_sha1)) {
+ for (i = 0; i < sb->num_entries; i++) {
+ if (sb->entries[i]->suspect == origin)
+ sb->entries[i]->suspect = porigin;
+ }
+ /* now everything blamed for origin is blamed for
+ * porigin, we do not need to keep it anymore.
+ * Do not free porigin (or the ones we got from
+ * earlier round); they may still be used elsewhere.
+ */
+ free_origin(origin);
+ return;
+ }
+ parent_origin[parent_ix] = porigin;
+ }
+
+ for (parent_ix = 0, parent = commit->parents;
+ parent_ix < MAXPARENT && parent;
+ parent = parent->next, parent_ix++) {
+ struct origin *porigin = parent_origin[parent_ix];
+ if (!porigin)
+ continue;
+ if (pass_blame_to_parent(sb, origin, porigin))
+ break;
+ }
+
+ /* NEEDSWORK:
+ * Optionally run "ciff" to find copies from parents' files here
+ */
+
+ /* Take responsibility for the remaining entries */
+ for (i = 0; i < sb->num_entries; i++)
+ if (!sb->entries[i]->guilty &&
+ sb->entries[i]->suspect == origin)
+ sb->entries[i]->guilty = 1;
+}
+
+static void assign_blame(struct scoreboard *sb)
+{
+ while (1) {
+ int i;
+ struct origin *suspect = NULL;
+
+ /* find one suspect to break down */
+ for (i = 0; !suspect && i < sb->num_entries; i++) {
+ if (!sb->entries[i]->guilty)
+ suspect = sb->entries[i]->suspect;
+ }
+ if (!suspect)
+ return; /* all done */
+
+ pass_blame(sb, suspect);
+ }
+}
+
+static const char *format_time(unsigned long time, const char *tz_str,
+ int show_raw_time)
+{
+ static char time_buf[128];
+ time_t t = time;
+ int minutes, tz;
+ struct tm *tm;
+
+ if (show_raw_time) {
+ sprintf(time_buf, "%lu %s", time, tz_str);
+ return time_buf;
+ }
+
+ tz = atoi(tz_str);
+ minutes = tz < 0 ? -tz : tz;
+ minutes = (minutes / 100)*60 + (minutes % 100);
+ minutes = tz < 0 ? -minutes : minutes;
+ t = time + minutes * 60;
+ tm = gmtime(&t);
+
+ strftime(time_buf, sizeof(time_buf), "%Y-%m-%d %H:%M:%S ", tm);
+ strcat(time_buf, tz_str);
+ return time_buf;
+}
+
+struct commit_info
+{
+ char *author;
+ char *author_mail;
+ unsigned long author_time;
+ char *author_tz;
+
+ /* filled only when asked for details */
+ char *committer;
+ char *committer_mail;
+ unsigned long committer_time;
+ char *committer_tz;
+
+ char *summary;
+};
+
+static void get_ac_line(const char *inbuf, const char *what,
+ int bufsz, char *person, char **mail,
+ unsigned long *time, char **tz)
+{
+ int len;
+ char *tmp, *endp;
+
+ tmp = strstr(inbuf, what);
+ if (!tmp)
+ goto error_out;
+ tmp += strlen(what);
+ endp = strchr(tmp, '\n');
+ if (!endp)
+ len = strlen(tmp);
+ else
+ len = endp - tmp;
+ if (bufsz <= len) {
+ error_out:
+ /* Ugh */
+ person = *mail = *tz = "(unknown)";
+ *time = 0;
+ return;
+ }
+ memcpy(person, tmp, len);
+
+ tmp = person;
+ tmp += len;
+ *tmp = 0;
+ while (*tmp != ' ')
+ tmp--;
+ *tz = tmp+1;
+
+ *tmp = 0;
+ while (*tmp != ' ')
+ tmp--;
+ *time = strtoul(tmp, NULL, 10);
+
+ *tmp = 0;
+ while (*tmp != ' ')
+ tmp--;
+ *mail = tmp + 1;
+ *tmp = 0;
+}
+
+static void get_commit_info(struct commit *commit,
+ struct commit_info *ret,
+ int detailed)
+{
+ int len;
+ char *tmp, *endp;
+ static char author_buf[1024];
+ static char committer_buf[1024];
+ static char summary_buf[1024];
+
+ ret->author = author_buf;
+ get_ac_line(commit->buffer, "\nauthor ",
+ sizeof(author_buf), author_buf, &ret->author_mail,
+ &ret->author_time, &ret->author_tz);
+
+ if (!detailed)
+ return;
+
+ ret->committer = committer_buf;
+ get_ac_line(commit->buffer, "\ncommitter ",
+ sizeof(committer_buf), committer_buf, &ret->committer_mail,
+ &ret->committer_time, &ret->committer_tz);
+
+ ret->summary = summary_buf;
+ tmp = strstr(commit->buffer, "\n\n");
+ if (!tmp) {
+ error_out:
+ sprintf(summary_buf, "(%s)", sha1_to_hex(commit->object.sha1));
+ return;
+ }
+ tmp += 2;
+ endp = strchr(tmp, '\n');
+ if (!endp)
+ goto error_out;
+ len = endp - tmp;
+ if (len >= sizeof(summary_buf))
+ goto error_out;
+ memcpy(summary_buf, tmp, len);
+ summary_buf[len] = 0;
+}
+
+static const char *nth_line(const char *buf, unsigned long len, int lno)
+{
+ while (lno) {
+ if (*buf == '\n')
+ lno--;
+ buf++;
+ if (!--len)
+ die("internal error: buffer ran out");
+ }
+ return buf;
+}
+
+#define OUTPUT_ANNOTATE_COMPAT 001
+#define OUTPUT_LONG_OBJECT_NAME 002
+#define OUTPUT_RAW_TIMESTAMP 004
+#define OUTPUT_PORCELAIN 010
+#define OUTPUT_SHOW_NAME 020
+#define OUTPUT_SHOW_NUMBER 040
+
+static void emit_porcelain(struct scoreboard *sb, struct blame_entry *ent)
+{
+ int cnt;
+ const char *cp;
+ struct origin *suspect = ent->suspect;
+ char hex[41];
+
+ strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));
+ printf("%s%c%d %d %d\n",
+ hex,
+ ent->guilty ? ' ' : '*', // purely for debugging
+ ent->s_lno + 1,
+ ent->lno + 1,
+ ent->num_lines);
+ if (!ent->suspect->metainfo_given) {
+ struct commit_info ci;
+ suspect->metainfo_given = 1;
+ get_commit_info(suspect->commit, &ci, 1);
+ printf("author %s\n", ci.author);
+ printf("author-mail %s\n", ci.author_mail);
+ printf("author-time %lu\n", ci.author_time);
+ printf("author-tz %s\n", ci.author_tz);
+ printf("committer %s\n", ci.committer);
+ printf("committer-mail %s\n", ci.committer_mail);
+ printf("committer-time %lu\n", ci.committer_time);
+ printf("committer-tz %s\n", ci.committer_tz);
+ printf("filename %s\n", suspect->path);
+ printf("summary %s\n", ci.summary);
+ }
+ cp = nth_line(sb->final_buf, sb->final_buf_size, ent->lno);
+ for (cnt = 0; cnt < ent->num_lines; cnt++) {
+ char ch;
+ if (cnt)
+ printf("%s %d %d\n", hex,
+ ent->s_lno + 1 + cnt,
+ ent->lno + 1 + cnt);
+ putchar('\t');
+ do {
+ ch = *cp++;
+ putchar(ch);
+ } while (ch != '\n' &&
+ cp < sb->final_buf + sb->final_buf_size);
+ }
+}
+
+static void emit_other(struct scoreboard *sb, struct blame_entry *ent, int opt)
+{
+ int cnt;
+ const char *cp;
+ struct origin *suspect = ent->suspect;
+ struct commit_info ci;
+ char hex[41];
+ int show_raw_time = !!(opt & OUTPUT_RAW_TIMESTAMP);
+
+ get_commit_info(suspect->commit, &ci, 1);
+ strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));
+
+ cp = nth_line(sb->final_buf, sb->final_buf_size, ent->lno);
+ for (cnt = 0; cnt < ent->num_lines; cnt++) {
+ char ch;
+
+ printf("%.*s", (opt & OUTPUT_LONG_OBJECT_NAME) ? 40 : 8, hex);
+ if (opt & OUTPUT_ANNOTATE_COMPAT)
+ printf("\t(%10s\t%10s\t%d)", ci.author,
+ format_time(ci.author_time, ci.author_tz,
+ show_raw_time),
+ ent->lno + 1 + cnt);
+ else {
+ if (opt & OUTPUT_SHOW_NAME)
+ printf(" %-*.*s", longest_file, longest_file,
+ suspect->path);
+ if (opt & OUTPUT_SHOW_NUMBER)
+ printf(" %*d", max_orig_digits,
+ ent->s_lno + 1 + cnt);
+ printf(" (%-*.*s %10s %*d) ",
+ longest_author, longest_author, ci.author,
+ format_time(ci.author_time, ci.author_tz,
+ show_raw_time),
+ max_digits, ent->lno + 1 + cnt);
+ }
+ do {
+ ch = *cp++;
+ putchar(ch);
+ } while (ch != '\n' &&
+ cp < sb->final_buf + sb->final_buf_size);
+ }
+}
+
+static void output(struct scoreboard *sb, int option)
+{
+ int i, num;
+ struct blame_entry **ent;
+ ent = sb->entries;
+ num = sb->num_entries;
+ for (i = 0; i < num; i++) {
+ if (option & OUTPUT_PORCELAIN)
+ emit_porcelain(sb, ent[i]);
+ else
+ emit_other(sb, ent[i], option);
+ }
+}
+
+static int count_lines(const char *buf, unsigned long len)
+{
+ int num = 0;
+ if (len && buf[len-1] != '\n')
+ num++; /* incomplete line at the end */
+ while (len--) {
+ if (*buf++ == '\n')
+ num++;
+ }
+ return num;
+}
+
+static int read_ancestry(const char *graft_file)
+{
+ FILE *fp = fopen(graft_file, "r");
+ char buf[1024];
+ if (!fp)
+ return -1;
+ while (fgets(buf, sizeof(buf), fp)) {
+ /* The format is just "Commit Parent1 Parent2 ...\n" */
+ int len = strlen(buf);
+ struct commit_graft *graft = read_graft_line(buf, len);
+ register_commit_graft(graft, 0);
+ }
+ fclose(fp);
+ return 0;
+}
+
+static int lineno_width(int lines)
+{
+ int i, width;
+
+ for (width = 1, i = 10; i <= lines + 1; width++)
+ i *= 10;
+ return width;
+}
+
+static void find_alignment(struct scoreboard *sb, int *option)
+{
+ int i;
+ int longest_src_lines = 0;
+ int longest_dst_lines = 0;
+
+ for (i = 0; i < sb->num_entries; i++) {
+ struct blame_entry *e = sb->entries[i];
+ struct origin *suspect = e->suspect;
+ struct commit_info ci;
+ int num;
+
+ if (!suspect->metainfo_given) {
+ suspect->metainfo_given = 1;
+ get_commit_info(suspect->commit, &ci, 1);
+ if (strcmp(suspect->path, sb->path))
+ *option |= OUTPUT_SHOW_NAME;
+ num = strlen(suspect->path);
+ if (longest_file < num)
+ longest_file = num;
+ num = strlen(ci.author);
+ if (longest_author < num)
+ longest_author = num;
+ }
+ num = e->s_lno + e->num_lines;
+ if (longest_src_lines < num)
+ longest_src_lines = num;
+ num = e->lno + e->num_lines;
+ if (longest_dst_lines < num)
+ longest_dst_lines = num;
+ }
+ max_orig_digits = lineno_width(longest_src_lines);
+ max_digits = lineno_width(longest_dst_lines);
+}
+
+int cmd_pickaxe(int argc, const char **argv, const char *prefix)
+{
+ const char *path = argv[1];
+ unsigned char sha1[20];
+ struct scoreboard sb;
+ struct origin *o;
+ struct blame_entry *ent;
+ int i, seen_dashdash;
+ long bottom, top, lno;
+ int output_option = 0;
+ const char *revs_file = NULL;
+ const char *final_commit_name = "HEAD";
+ char type[10];
+
+ bottom = top = 0;
+ seen_dashdash = 0;
+ for (i = 1; i < argc; i++) {
+ const char *arg = argv[i];
+ if (*arg != '-')
+ break;
+ else if (!strcmp("-c", arg))
+ output_option |= OUTPUT_ANNOTATE_COMPAT;
+ else if (!strcmp("-t", arg))
+ output_option |= OUTPUT_RAW_TIMESTAMP;
+ else if (!strcmp("-l", arg))
+ output_option |= OUTPUT_LONG_OBJECT_NAME;
+ else if (!strcmp("-S", arg) && ++i < argc)
+ revs_file = argv[i];
+ else if (!strcmp("-L", arg) && ++i < argc) {
+ char *term;
+ arg = argv[i];
+ if (bottom || top)
+ die("More than one '-L n,m' option given");
+ bottom = strtol(arg, &term, 10);
+ if (*term == ',') {
+ top = strtol(term + 1, &term, 10);
+ if (*term)
+ usage(pickaxe_usage);
+ }
+ if (bottom && top && top < bottom) {
+ unsigned long tmp;
+ tmp = top; top = bottom; bottom = tmp;
+ }
+ }
+ else if (!strcmp("-f", arg) ||
+ !strcmp("--show-name", arg))
+ output_option |= OUTPUT_SHOW_NAME;
+ else if (!strcmp("-n", arg) ||
+ !strcmp("--show-number", arg))
+ output_option |= OUTPUT_SHOW_NUMBER;
+ else if (!strcmp("--porcelain", arg))
+ output_option |= OUTPUT_PORCELAIN;
+ else if (!strcmp("--", arg)) {
+ seen_dashdash = 1;
+ i++;
+ break;
+ }
+ else
+ usage(pickaxe_usage);
+ }
+
+ /* argv[i] is filename, argv[i+1] if exists is the commit */
+ if (i >= argc)
+ usage(pickaxe_usage);
+ path = argv[i++];
+ if (!seen_dashdash) {
+ struct stat st;
+ if (lstat(path, &st))
+ die("cannot stat path %s: %s", path, strerror(errno));
+ }
+
+ if (i == argc - 1)
+ final_commit_name = argv[i];
+ else if (i != argc)
+ usage(pickaxe_usage);
+
+ memset(&sb, 0, sizeof(sb));
+ if (get_sha1(final_commit_name, sha1) ||
+ !(sb.final = lookup_commit_reference(sha1)))
+ die("no such commit %s", final_commit_name);
+ o = find_origin(&sb, sb.final, path);
+ if (!o)
+ die("no such path %s in %s", path, final_commit_name);
+
+ sb.final_buf = read_sha1_file(o->blob_sha1, type, &sb.final_buf_size);
+ lno = count_lines(sb.final_buf, sb.final_buf_size);
+
+ if (bottom < 1)
+ bottom = 1;
+ if (top < 1)
+ top = lno;
+ bottom--;
+ if (lno < top)
+ die("file %s has only %lu lines", path, lno);
+
+ ent = xcalloc(1, sizeof(*ent));
+ ent->lno = bottom;
+ ent->num_lines = top - bottom;
+ ent->suspect = o;
+ ent->s_lno = bottom;
+
+ sb.entries = xcalloc(10, sizeof(struct blame_entry *));
+ sb.entries[0] = ent;
+ sb.num_entries = 1;
+ sb.path = path;
+
+ if (revs_file && read_ancestry(revs_file))
+ die("reading graft file %s failed: %s",
+ revs_file, strerror(errno));
+
+ assign_blame(&sb);
+
+ coalesce(&sb);
+
+ if (!(output_option & OUTPUT_PORCELAIN))
+ find_alignment(&sb, &output_option);
+
+ output(&sb, output_option);
+
+ return 0;
+}
diff --git a/builtin.h b/builtin.h
index f9fa9ff..7451ce6 100644
--- a/builtin.h
+++ b/builtin.h
@@ -39,6 +39,7 @@ extern int cmd_mailsplit(int argc, const
extern int cmd_mv(int argc, const char **argv, const char *prefix);
extern int cmd_name_rev(int argc, const char **argv, const char *prefix);
extern int cmd_pack_objects(int argc, const char **argv, const char *prefix);
+extern int cmd_pickaxe(int argc, const char **argv, const char *prefix);
extern int cmd_prune(int argc, const char **argv, const char *prefix);
extern int cmd_prune_packed(int argc, const char **argv, const char *prefix);
extern int cmd_push(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index e089b53..6164380 100644
--- a/git.c
+++ b/git.c
@@ -245,6 +245,7 @@ static void handle_internal_command(int
{ "mv", cmd_mv, RUN_SETUP },
{ "name-rev", cmd_name_rev, RUN_SETUP },
{ "pack-objects", cmd_pack_objects, RUN_SETUP },
+ { "pickaxe", cmd_pickaxe, RUN_SETUP },
{ "prune", cmd_prune, RUN_SETUP },
{ "prune-packed", cmd_prune_packed, RUN_SETUP },
{ "push", cmd_push, RUN_SETUP },
diff --git a/t/t8003-pickaxe.sh b/t/t8003-pickaxe.sh
new file mode 100755
index 0000000..d09d1c9
--- /dev/null
+++ b/t/t8003-pickaxe.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+
+test_description='git-pickaxe'
+. ./test-lib.sh
+
+PROG='git pickaxe -c'
+. ../annotate-tests.sh
+
+test_done
--
1.4.3.rc2.gdce3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 8:52 [PATCH] git-pickaxe: blame rewritten Junio C Hamano
@ 2006-10-12 15:58 ` Linus Torvalds
2006-10-12 19:09 ` Junio C Hamano
2006-10-13 5:35 ` Junio C Hamano
2006-10-12 19:31 ` Luben Tuikov
2006-10-12 21:55 ` Jakub Narebski
2 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-10-12 15:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, 12 Oct 2006, Junio C Hamano wrote:
> +
> +SYNOPSIS
> +--------
> +'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] <file> [<rev>]
Btw, could we please get rid of this horrible command line syntax.
Pretty much _every_ other git command takes the form
git cmd [<rev>] [--] <path>
but for some reason annotate and blame (and now pickaxe) do it the wrong
way around, and do
git cmd [--] <path> [<rev>]
which is just irritating to somebody who has grown very used to being able
to specify revisions first.
(I'd actually also like to have a range-modifier, so that I could do
git annotate --since=2.weeks.ago v2.6.18.. <path>
that didn't go back beyond a certain point, so the command line syntax
actually _does_ matter - even if we don't support that now, having the
regular command line syntax means that we -could- support it some day).
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 15:58 ` Linus Torvalds
@ 2006-10-12 19:09 ` Junio C Hamano
2006-10-12 20:10 ` Luben Tuikov
2006-10-13 20:31 ` Junio C Hamano
2006-10-13 5:35 ` Junio C Hamano
1 sibling, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-12 19:09 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
Linus Torvalds <torvalds@osdl.org> writes:
> Btw, could we please get rid of this horrible command line syntax.
>
> Pretty much _every_ other git command takes the form
>
> git cmd [<rev>] [--] <path>
>
> but for some reason annotate and blame (and now pickaxe) do it the wrong
> way around, and do
>
> git cmd [--] <path> [<rev>]
>
> which is just irritating to somebody who has grown very used to being able
> to specify revisions first.
Side note: I do not intend to keep it named pickaxe -- only
while it is in "pu".
I think we could support all of them to retain backward
compatibility:
git blame [-options] <path> [<rev>] (*1*)
git blame [-options] -- <path> [<rev>] (*2*)
git blame [-options] [<rev>] [--] <path> (*3*)
(*1*) Only for path that does not start with a '-'; we should
tighten the input to make sure <path> lstat()'s fine
(which we currently do with pickaxe), <path> cannot be
interpreted as a valud rev, and when <rev> is given, <rev>
does not lstat() Ok, to avoid ambiguity. Other cases we
should require the newer format with explicit -- (*3*).
(*2*) Backward compatible canoncal format. The above comment
for validation does not apply, as (*3*) never has more
than one path for 'annotate/blame/pickaxe'
(*3*) The canonical format for ohter git commands. Without
an explicit --, <rev> should not lstat() Ok, and <path>
should, to avoid ambiguity.
> (I'd actually also like to have a range-modifier, so that I could do
>
> git annotate --since=2.weeks.ago v2.6.18.. <path>
>
> that didn't go back beyond a certain point,...
I am not sure about revision bottom (v.2.6.18..) offhand, but
the age limit (--since=2.weeks) should be trivial.
Inside pass_blame() while we iterate over the parents of the
suspect we are looking at, you can skip the parent if it is
older than the age limit, or an ancestor of revision bottom,
like this:
--- l/builtin-pickaxe.c
+++ k/builtin-pickaxe.c
@@ -450,6 +450,12 @@ static void pass_blame(struct scoreboard
parent = parent->next, parent_ix++) {
if (parse_commit(parent->item))
continue;
+
+ if (parent is older than age limit)
+ continue;
+ if (parent is an ancestor of revision bottom)
+ continue;
+
porigin = find_origin(sb, parent->item, origin->path);
if (!porigin)
porigin = find_rename(sb, parent->item, origin);
I think we can get away by checking if the parent _is_ the
revision bottom (or one of the bottoms, if you say "--not
v2.6.18 v2.6.17.13") instead of doing "is it an ancestor" check,
in practice. But that is not correct when a merge is involved.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 8:52 [PATCH] git-pickaxe: blame rewritten Junio C Hamano
2006-10-12 15:58 ` Linus Torvalds
@ 2006-10-12 19:31 ` Luben Tuikov
2006-10-12 22:20 ` Junio C Hamano
2006-10-13 0:06 ` Junio C Hamano
2006-10-12 21:55 ` Jakub Narebski
2 siblings, 2 replies; 18+ messages in thread
From: Luben Tuikov @ 2006-10-12 19:31 UTC (permalink / raw)
To: Junio C Hamano, git
--- Junio C Hamano <junkio@cox.net> wrote:
> Currently it does what git-blame does, but only faster.
>
> More importantly, its internal structure is designed to support
> content movement (aka cut-and-paste) more easily by allowing
> more than one paths to be taken from the same commit.
Good work.
> I have given only cursory check to its output (some files are
> blamed slightly differently from how git-blame does), but it
Details? How differently?
> appears that there is no major breakage. You can see for
> example try annotating builtin-apply.c starting from v1.4.0;
> there are two differences, which pickaxe assigns blame to older
> commits and both of them seem to be sensible.
Renames are naturally supported?
> diff --git a/Documentation/git-pickaxe.txt b/Documentation/git-pickaxe.txt
> new file mode 100644
> index 0000000..7f30cdf
> --- /dev/null
> +++ b/Documentation/git-pickaxe.txt
> @@ -0,0 +1,104 @@
> +git-pickaxe(1)
> +==============
> +
> +NAME
> +----
> +git-pickaxe - Show what revision and author last modified each line of a file
[...]
> +THE PORCELAIN FORMAT
> +--------------------
> +
Let's quantify the output:
---cut---
The porcelain format is as follows:
<SHA-1> <orig line> <line> [<num lines>
author <name>
author-mail <email format>
author-time <time>
author-tz <TZ>
committer <name>
committer-mail <email format>
committer-time <time>
committer-tz <TZ>
filename <string>
summary <string>]
<TAB><line data>
Where
<SHA-1> is the SHA-1 of the commit which introduces this line.
<orig line> it the line number where this line is introduced.
<line> is the line number of the final file (at SHA-1 commit)
Then, if <SHA-1> is different from the previous line's SHA-1 (if no
previous then always different), a header follows. It starts by the
number of lines that this <SHA-1> commit introduces, then on a new
line, information about about the commit is printed for the following
seveal lines, then newline, TAB, and the line data.
If, OTOH, <SHA-1> is the same as the previous line's <SHA-1> then the
header is not printed as indicated by brackets. Just newline,
TAB, and the line data.
---cut----
This kind of makes it slightly clearer.
Junio, is it possible to also print the "previous" commit?
I mean, is it tenable to print the commit such that
a "git-diff C B -- A:file" will give a diff of the block of lines
we're looking at?
Picture:
Annotate A:file
C B line 1
X D line 2
C B line 3
F G line 4
...
Currently we do not print "C, X, F", but only "B, D, G".
So in effect, <orig line> is the line of, wlg, "C:file"
where line 1 was introduced (by commit B). I.e.
the "parent" commit of that commit, which doesn't always
exist. (e.g. if the commit added the file)
Is this tenable? If it is not or if it is going to make
it slow or ambiguous, lets forget about it.
Luben
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 19:09 ` Junio C Hamano
@ 2006-10-12 20:10 ` Luben Tuikov
2006-10-13 20:31 ` Junio C Hamano
1 sibling, 0 replies; 18+ messages in thread
From: Luben Tuikov @ 2006-10-12 20:10 UTC (permalink / raw)
To: Junio C Hamano, Linus Torvalds; +Cc: git
--- Junio C Hamano <junkio@cox.net> wrote:
> Side note: I do not intend to keep it named pickaxe -- only
> while it is in "pu".
Yes, I was going to mention something about that, but hesitated
and didn't.
What name is a good name? Reuse and old one? "blame", "annotate"?
Luben
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 8:52 [PATCH] git-pickaxe: blame rewritten Junio C Hamano
2006-10-12 15:58 ` Linus Torvalds
2006-10-12 19:31 ` Luben Tuikov
@ 2006-10-12 21:55 ` Jakub Narebski
2006-10-12 22:47 ` Petr Baudis
2 siblings, 1 reply; 18+ messages in thread
From: Jakub Narebski @ 2006-10-12 21:55 UTC (permalink / raw)
To: git
Junio C Hamano wrote:
> Currently it does what git-blame does, but only faster.
>
> More importantly, its internal structure is designed to support
> content movement (aka cut-and-paste) more easily by allowing
> more than one paths to be taken from the same commit.
>
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> ---
>
> I really hate to do this immediately after writing obituary for
> annotate, but I had a solid 24-hour to work on git, which is a
> rare opportunity for me these days, so here it is.
Why not reuse git-annotate name? git-pickaxe doesn't do pickaxe...
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 19:31 ` Luben Tuikov
@ 2006-10-12 22:20 ` Junio C Hamano
2006-10-13 20:54 ` Luben Tuikov
2006-10-13 0:06 ` Junio C Hamano
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2006-10-12 22:20 UTC (permalink / raw)
To: Luben Tuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
> Junio, is it possible to also print the "previous" commit?
> I mean, is it tenable to print the commit such that
> a "git-diff C B -- A:file" will give a diff of the block of lines
> we're looking at?
There is no single "previous" in general. Which side of the
merge would you take?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 21:55 ` Jakub Narebski
@ 2006-10-12 22:47 ` Petr Baudis
2006-10-12 23:35 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: Petr Baudis @ 2006-10-12 22:47 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Dear diary, on Thu, Oct 12, 2006 at 11:55:49PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Junio C Hamano wrote:
>
> > Currently it does what git-blame does, but only faster.
> >
> > More importantly, its internal structure is designed to support
> > content movement (aka cut-and-paste) more easily by allowing
> > more than one paths to be taken from the same commit.
> >
> > Signed-off-by: Junio C Hamano <junkio@cox.net>
> > ---
> >
> > I really hate to do this immediately after writing obituary for
> > annotate, but I had a solid 24-hour to work on git, which is a
> > rare opportunity for me these days, so here it is.
>
> Why not reuse git-annotate name? git-pickaxe doesn't do pickaxe...
I agree that git-pickaxe is wrong, and luckily Junio does too,
apparently.
But please, let's not go right back to the git-annotate / git-blame
situation. It's just confusing to have two tools that do the same thing,
perhaps subtly differently. If it's gonna replace git-blame, it should
either do that right away or live as git-blame2 for some time, but not
play any confusing games with the names.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 22:47 ` Petr Baudis
@ 2006-10-12 23:35 ` Junio C Hamano
0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-12 23:35 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
Petr Baudis <pasky@suse.cz> writes:
> But please, let's not go right back to the git-annotate / git-blame
> situation. It's just confusing to have two tools that do the same thing,
> perhaps subtly differently. If it's gonna replace git-blame, it should
> either do that right away or live as git-blame2 for some time, but not
> play any confusing games with the names.
Actually the plan is to make it do _true_ pickaxe, although it
will most likely end up either in dustbin or replace blame.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 19:31 ` Luben Tuikov
2006-10-12 22:20 ` Junio C Hamano
@ 2006-10-13 0:06 ` Junio C Hamano
2006-10-13 1:04 ` Luben Tuikov
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 0:06 UTC (permalink / raw)
To: Luben Tuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
> Renames are naturally supported?
Didn't I say you can see it yourself by doing this and that?
A short answer is yes. I'll do a re-write-up of algorithm
overview and directions to enhance it in a separate message when
I have time.
One interesting difference is that it can do partial file (line
range). With the "parameter order" fix Linus asked, I can do:
$ git-pickaxe -L 551,554 master commit.c
which would run a lot faster than:
$ git-pickaxe -L 551,560 master commit.c
This is a made-up example, but you would get the idea. In
short, it does not have to dig deeper outside the range
originally you specified.
> ---cut---
> The porcelain format is as follows:
>
> <SHA-1> <orig line> <line> [<num lines>
This is misleading. <num lines> is always shown for the group
head, even if there was another group earlier from the same
commit (otherwise the Porcelain has to buffer the chunk, which
you did not like). Second and subsequent lines in the same
group do not have <num lines>.
> author <name>
> author-mail <email format>
>...
> committer-tz <TZ>
> filename <string>
> summary <string>]
> <TAB><line data>
>
> Where
> <SHA-1> is the SHA-1 of the commit which introduces this line.
> <orig line> it the line number where this line is introduced.
> <line> is the line number of the final file (at SHA-1 commit)
>
> Then, if <SHA-1> is different from the previous line's SHA-1 (if no
> previous then always different), a header follows. It starts by the
> number of lines that this <SHA-1> commit introduces,...
So this description is wrong; <num lines> is not part of the
"extra".
>...
> ---cut----
>
> This kind of makes it slightly clearer.
I deliberately left it vague so that we can add things later to
the header. Porcelains should ignore the fields that they do
not understand, and should not expect these fields listed above
come in the order the above list shows.
Also I deliberately left it vague so that Porcelains can get the
header for the same SHA-1 more than once. This is needed when
we add "ciff" to pick more than one paths from the same commit.
In such a case, most likely we are better off not to repeat
header fields from author...committer-tz and summary but we
would need filename. The expectation to the Porcelains is:
Read one line, which begins with commit object name and two
or three numbers; if it has three numbers, it is the
beginning of a group. Any such "object name" line can be
followed by a header. Read subsequent lines until you get a
line that begins with a <TAB>.
- do not get upset if you see a header field that you do not
understand; you are reading from newer version of blame.
- do not get upset if you see a header for the commit you
have already seen. Some of the header fields can be
updated from the one you have seen the last time, so use
the updated value from the new header.
Lines that begin with a <TAB> is data from the file being
annotated. The object name and headers you saw before this
line annotate it.
So Porcelains need to keep one set of these header fields per
commit object.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-13 0:06 ` Junio C Hamano
@ 2006-10-13 1:04 ` Luben Tuikov
2006-10-13 1:45 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: Luben Tuikov @ 2006-10-13 1:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
--- Junio C Hamano <junkio@cox.net> wrote:
> > ---cut---
> > The porcelain format is as follows:
> >
> > <SHA-1> <orig line> <line> [<num lines>
>
> This is misleading. <num lines> is always shown for the group
> head, even if there was another group earlier from the same
> commit (otherwise the Porcelain has to buffer the chunk, which
Completely understood and always agreed upon.
> you did not like). Second and subsequent lines in the same
> group do not have <num lines>.
Completely understood and always agreed upon.
>
> > author <name>
> > author-mail <email format>
> >...
> > committer-tz <TZ>
> > filename <string>
> > summary <string>]
> > <TAB><line data>
> >
> > Where
> > <SHA-1> is the SHA-1 of the commit which introduces this line.
> > <orig line> it the line number where this line is introduced.
> > <line> is the line number of the final file (at SHA-1 commit)
> >
> > Then, if <SHA-1> is different from the previous line's SHA-1 (if no
> > previous then always different), a header follows. It starts by the
> > number of lines that this <SHA-1> commit introduces,...
>
> So this description is wrong; <num lines> is not part of the
> "extra".
So, it is possible to print
<SHA-1> <orig lineno> <this lineno> <num lines>
TAB<data line>
?
> I deliberately left it vague so that we can add things later to
> the header. Porcelains should ignore the fields that they do
> not understand, and should not expect these fields listed above
> come in the order the above list shows.
If the format changes, then porcelains should change their parsing
algorithms.
> Also I deliberately left it vague so that Porcelains can get the
> header for the same SHA-1 more than once. This is needed when
For different blocks separated by at least one different commit, yes.
But the opposite should not be true.
That is, you start a group, only when the previous commit differs
from this commit.
It would be prudent to print a <num lines> iff a group is started.
> we add "ciff" to pick more than one paths from the same commit.
> In such a case, most likely we are better off not to repeat
> header fields from author...committer-tz and summary but we
> would need filename. The expectation to the Porcelains is:
>
> Read one line, which begins with commit object name and two
> or three numbers; if it has three numbers, it is the
> beginning of a group.
And if it doesn't have three numbers?
Luben
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-13 1:04 ` Luben Tuikov
@ 2006-10-13 1:45 ` Junio C Hamano
0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 1:45 UTC (permalink / raw)
To: Luben Tuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
> That is, you start a group, only when the previous commit differs
> from this commit.
>
> It would be prudent to print a <num lines> iff a group is started.
Yes. I think the current code does that.
>> we add "ciff" to pick more than one paths from the same commit.
>> In such a case, most likely we are better off not to repeat
>> header fields from author...committer-tz and summary but we
>> would need filename. The expectation to the Porcelains is:
>>
>> Read one line, which begins with commit object name and two
>> or three numbers; if it has three numbers, it is the
>> beginning of a group.
>
> And if it doesn't have three numbers?
It can still have header if we wanted to say some more details
about the line. What's the problem?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 15:58 ` Linus Torvalds
2006-10-12 19:09 ` Junio C Hamano
@ 2006-10-13 5:35 ` Junio C Hamano
1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 5:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
Linus Torvalds <torvalds@osdl.org> writes:
> On Thu, 12 Oct 2006, Junio C Hamano wrote:
>> +
>> +SYNOPSIS
>> +--------
>> +'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] <file> [<rev>]
>
> Btw, could we please get rid of this horrible command line syntax.
>
> Pretty much _every_ other git command takes the form
>
> git cmd [<rev>] [--] <path>
>
> but for some reason annotate and blame (and now pickaxe) do it the wrong
> way around, and do
>
> git cmd [--] <path> [<rev>]
>
> which is just irritating to somebody who has grown very used to being able
> to specify revisions first.
>
> (I'd actually also like to have a range-modifier, so that I could do
>
> git annotate --since=2.weeks.ago v2.6.18.. <path>
>
> that didn't go back beyond a certain point, so the command line syntax
> actually _does_ matter - even if we don't support that now, having the
> regular command line syntax means that we -could- support it some day).
>
> Linus
So here it is. I did not think "-n 20" would make sense for
this application so the code does not do anything about it.
This will be parked on "pu" for now.
-- >8 --
[PATCH] git-pickaxe: minimally use revision machinery.
This rewrites pickaxe argument parser to use revision machinery,
so that you can say something like:
git pickaxe --since=2.weeks master commit.c
git pickaxe -L 20,40 v1.4.0..master -- commit.c
The traditional parameter order (path first then optional rev)
is supported but cannot be mixed with revision limiting
parameters.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Documentation/git-pickaxe.txt | 2 +-
builtin-pickaxe.c | 221 ++++++++++++++++++++++++++++++++---------
2 files changed, 173 insertions(+), 50 deletions(-)
diff --git a/Documentation/git-pickaxe.txt b/Documentation/git-pickaxe.txt
index 7f30cdf..7685bd0 100644
--- a/Documentation/git-pickaxe.txt
+++ b/Documentation/git-pickaxe.txt
@@ -7,7 +7,7 @@ git-pickaxe - Show what revision and aut
SYNOPSIS
--------
-'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] <file> [<rev>]
+'git-pickaxe' [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [<rev>] [--] <file>
DESCRIPTION
-----------
diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index 0e26308..0aa4565 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -11,13 +11,14 @@ #include "commit.h"
#include "tree-walk.h"
#include "diff.h"
#include "diffcore.h"
+#include "revision.h"
#include "xdiff-interface.h"
#include <time.h>
#include <sys/time.h>
static char pickaxe_usage[] =
-"git-pickaxe [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [--] file [commit]\n"
+"git-pickaxe [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [commit] [--] file\n"
" -c, --compatibility Use the same output mode as git-annotate (Default: off)\n"
" -l, --long Show long commit SHA1 (Default: off)\n"
" -t, --time Show raw timestamp (Default: off)\n"
@@ -109,7 +110,8 @@ static void coalesce(struct scoreboard *
for (o = 0, i = 0; i < num; i++) {
if (o &&
ent[o-1]->suspect == ent[i]->suspect &&
- ent[o-1]->s_lno + ent[o-1]->num_lines == ent[i]->s_lno) {
+ ent[o-1]->s_lno + ent[o-1]->num_lines == ent[i]->s_lno &&
+ ent[o-1]->guilty == ent[i]->guilty) {
ent[o-1]->num_lines += ent[i]->num_lines;
free(ent[i]);
continue;
@@ -220,7 +222,8 @@ static void process_u0_diff(void *state_
struct chunk *chunk;
int off1, off2, len1, len2;
-if(DEBUG) fprintf(stderr, "%.*s", (int) len, line);
+ if (DEBUG)
+ fprintf(stderr, "%.*s", (int) len, line);
if (len < 4 || line[0] != '@' || line[1] != '@')
return;
@@ -256,17 +259,23 @@ static struct patch *get_patch(struct or
mmfile_t file_p, file_o;
xdemitcb_t ecb;
char type[10];
-
-if(DEBUG) fprintf(stderr, "get patch %.8s %.8s\n",
- sha1_to_hex(parent->commit->object.sha1),
- sha1_to_hex(origin->commit->object.sha1));
-
- file_p.ptr = read_sha1_file(parent->blob_sha1, type,
- (unsigned long *) &file_p.size);
- file_o.ptr = read_sha1_file(origin->blob_sha1, type,
- (unsigned long *) &file_o.size);
- if (!file_p.ptr || !file_o.ptr)
+ char *blob_p, *blob_o;
+
+ if (DEBUG) fprintf(stderr, "get patch %.8s %.8s\n",
+ sha1_to_hex(parent->commit->object.sha1),
+ sha1_to_hex(origin->commit->object.sha1));
+
+ blob_p = read_sha1_file(parent->blob_sha1, type,
+ (unsigned long *) &file_p.size);
+ blob_o = read_sha1_file(origin->blob_sha1, type,
+ (unsigned long *) &file_o.size);
+ file_p.ptr = blob_p;
+ file_o.ptr = blob_o;
+ if (!file_p.ptr || !file_o.ptr) {
+ free(blob_p);
+ free(blob_o);
return NULL;
+ }
xpp.flags = XDF_NEED_MINIMAL;
xecfg.ctxlen = 0;
@@ -280,6 +289,8 @@ if(DEBUG) fprintf(stderr, "get patch %.8
state.ret->num = 0;
xdl_diff(&file_p, &file_o, &xpp, &xecfg, &ecb);
+ free(blob_p);
+ free(blob_o);
return state.ret;
}
@@ -417,11 +428,12 @@ static int pass_blame_to_parent(struct s
for (i = 0; i < patch->num; i++) {
struct chunk *chunk = &patch->chunks[i];
-if(DEBUG) fprintf(stderr,
- "plno = %d, tlno = %d, "
- "same as parent up to %d, resync %d and %d\n",
- plno, tlno,
- chunk->same, chunk->p_next, chunk->t_next);
+ if (DEBUG)
+ fprintf(stderr,
+ "plno = %d, tlno = %d, "
+ "same as parent up to %d, resync %d and %d\n",
+ plno, tlno,
+ chunk->same, chunk->p_next, chunk->t_next);
blame_chunk(sb, tlno, plno, chunk->same, target, parent);
plno = chunk->p_next;
tlno = chunk->t_next;
@@ -434,7 +446,8 @@ if(DEBUG) fprintf(stderr,
}
#define MAXPARENT 16
-static void pass_blame(struct scoreboard *sb, struct origin *origin)
+static void pass_blame(struct scoreboard *sb, struct origin *origin,
+ struct rev_info *revs)
{
int i, parent_ix;
struct commit *commit = origin->commit;
@@ -448,8 +461,15 @@ static void pass_blame(struct scoreboard
for (parent_ix = 0, parent = commit->parents;
parent_ix < MAXPARENT && parent;
parent = parent->next, parent_ix++) {
- if (parse_commit(parent->item))
+ struct commit *p = parent->item;
+
+ if (parse_commit(p))
+ continue;
+ if (p->object.flags & UNINTERESTING)
continue;
+ if (revs->max_age != -1 && p->date < revs->max_age)
+ continue;
+
porigin = find_origin(sb, parent->item, origin->path);
if (!porigin)
porigin = find_rename(sb, parent->item, origin);
@@ -485,14 +505,9 @@ static void pass_blame(struct scoreboard
* Optionally run "ciff" to find copies from parents' files here
*/
- /* Take responsibility for the remaining entries */
- for (i = 0; i < sb->num_entries; i++)
- if (!sb->entries[i]->guilty &&
- sb->entries[i]->suspect == origin)
- sb->entries[i]->guilty = 1;
}
-static void assign_blame(struct scoreboard *sb)
+static void assign_blame(struct scoreboard *sb, struct rev_info *revs)
{
while (1) {
int i;
@@ -506,7 +521,13 @@ static void assign_blame(struct scoreboa
if (!suspect)
return; /* all done */
- pass_blame(sb, suspect);
+ pass_blame(sb, suspect, revs);
+
+ /* Take responsibility for the remaining entries */
+ for (i = 0; i < sb->num_entries; i++)
+ if (!sb->entries[i]->guilty &&
+ sb->entries[i]->suspect == suspect)
+ sb->entries[i]->guilty = 1;
}
}
@@ -829,23 +850,29 @@ static void find_alignment(struct scoreb
max_digits = lineno_width(longest_dst_lines);
}
+static int has_path_in_work_tree(const char *path)
+{
+ struct stat st;
+ return !lstat(path, &st);
+}
+
int cmd_pickaxe(int argc, const char **argv, const char *prefix)
{
- const char *path = argv[1];
- unsigned char sha1[20];
+ struct rev_info revs;
+ const char *path;
struct scoreboard sb;
struct origin *o;
struct blame_entry *ent;
- int i, seen_dashdash;
+ int i, seen_dashdash, unk;
long bottom, top, lno;
int output_option = 0;
const char *revs_file = NULL;
- const char *final_commit_name = "HEAD";
+ const char *final_commit_name = NULL;
char type[10];
bottom = top = 0;
seen_dashdash = 0;
- for (i = 1; i < argc; i++) {
+ for (unk = i = 1; i < argc; i++) {
const char *arg = argv[i];
if (*arg != '-')
break;
@@ -887,28 +914,121 @@ int cmd_pickaxe(int argc, const char **a
break;
}
else
+ argv[unk++] = arg;
+ }
+
+ /* We have collected options unknown to us in argv[1..unk]
+ * which are to be passed to revision machinery if we are
+ * going to do the "bottom" procesing.
+ *
+ * The remaining are:
+ *
+ * (1) if seen_dashdash, its either
+ * "-options -- <path>" or
+ * "-options -- <path> <rev>".
+ * but the latter is allowed only if there is no
+ * options that we passed to revision machinery.
+ *
+ * (2) otherwise, we may have "--" somewhere later and
+ * might be looking at the first one of multiple 'rev'
+ * parameters (e.g. " master ^next ^maint -- path").
+ * See if there is a dashdash first, and give the
+ * arguments before that to revision machinery.
+ * After that there must be one 'path'.
+ *
+ * (3) otherwise, its one of the three:
+ * "-options <path> <rev>"
+ * "-options <rev> <path>"
+ * "-options <path>"
+ * but again the first one is allowed only if
+ * there is no options that we passed to revision
+ * machinery.
+ */
+
+ if (seen_dashdash) {
+ /* (1) */
+ if (argc <= i)
+ usage(pickaxe_usage);
+ path = argv[i];
+ if (i + 1 == argc - 1) {
+ if (unk != 1)
+ usage(pickaxe_usage);
+ argv[unk++] = argv[i + 1];
+ }
+ else if (i + 1 != argc)
+ /* garbage at end */
usage(pickaxe_usage);
}
+ else {
+ int j;
+ for (j = i; !seen_dashdash && j < argc; j++)
+ if (!strcmp(argv[j], "--"))
+ seen_dashdash = j;
+ if (seen_dashdash) {
+ if (seen_dashdash + 1 != argc - 1)
+ usage(pickaxe_usage);
+ path = argv[seen_dashdash + 1];
+ for (j = i; j < seen_dashdash; j++)
+ argv[unk++] = argv[j];
+ }
+ else {
+ /* (3) */
+ path = argv[i];
+ if (i + 1 == argc - 1) {
+ final_commit_name = argv[i + 1];
+
+ /* if (unk == 1) we could be getting
+ * old-style
+ */
+ if (unk == 1 && !has_path_in_work_tree(path)) {
+ path = argv[i + 1];
+ final_commit_name = argv[i];
+ }
+ }
+ else if (i != argc - 1)
+ usage(pickaxe_usage); /* garbage at end */
- /* argv[i] is filename, argv[i+1] if exists is the commit */
- if (i >= argc)
- usage(pickaxe_usage);
- path = argv[i++];
- if (!seen_dashdash) {
- struct stat st;
- if (lstat(path, &st))
- die("cannot stat path %s: %s", path, strerror(errno));
+ if (!has_path_in_work_tree(path))
+ die("cannot stat path %s: %s",
+ path, strerror(errno));
+ }
}
- if (i == argc - 1)
- final_commit_name = argv[i];
- else if (i != argc)
- usage(pickaxe_usage);
+ if (final_commit_name)
+ argv[unk++] = final_commit_name;
+ /* Now we got rev and path. We do not want the path pruning
+ * but we may want "bottom" processing.
+ */
+ argv[unk] = NULL;
+
+ init_revisions(&revs, NULL);
+ setup_revisions(unk, argv, &revs, "HEAD");
memset(&sb, 0, sizeof(sb));
- if (get_sha1(final_commit_name, sha1) ||
- !(sb.final = lookup_commit_reference(sha1)))
- die("no such commit %s", final_commit_name);
+
+ /* There must be one and only one positive commit in the
+ * revs->pending array.
+ */
+ for (i = 0; i < revs.pending.nr; i++) {
+ struct object *obj = revs.pending.objects[i].item;
+ if (obj->flags & UNINTERESTING)
+ continue;
+ if (obj->type != OBJ_COMMIT)
+ die("Non commit %s?",
+ revs.pending.objects[i].name);
+ if (sb.final)
+ die("More than one commit to dig from %s and %s?",
+ revs.pending.objects[i].name,
+ final_commit_name);
+ sb.final = (struct commit *) obj;
+ final_commit_name = revs.pending.objects[i].name;
+ }
+ /* If we have bottom, this will mark the ancestors of the
+ * bottom commits we would reach while traversing as
+ * uninteresting.
+ */
+ prepare_revision_walk(&revs);
+
o = find_origin(&sb, sb.final, path);
if (!o)
die("no such path %s in %s", path, final_commit_name);
@@ -939,7 +1059,7 @@ int cmd_pickaxe(int argc, const char **a
die("reading graft file %s failed: %s",
revs_file, strerror(errno));
- assign_blame(&sb);
+ assign_blame(&sb, &revs);
coalesce(&sb);
@@ -947,6 +1067,9 @@ int cmd_pickaxe(int argc, const char **a
find_alignment(&sb, &output_option);
output(&sb, output_option);
-
+ free((void *)sb.final_buf);
+ for (i = 0; i < sb.num_entries; i++)
+ free(sb.entries[i]);
+ free(sb.entries);
return 0;
}
--
1.4.3.rc2.g51ca
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 19:09 ` Junio C Hamano
2006-10-12 20:10 ` Luben Tuikov
@ 2006-10-13 20:31 ` Junio C Hamano
1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 20:31 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
>> (I'd actually also like to have a range-modifier, so that I could do
>>
>> git annotate --since=2.weeks.ago v2.6.18.. <path>
>>
>> that didn't go back beyond a certain point,...
>
> I am not sure about revision bottom (v.2.6.18..) offhand, but
> the age limit (--since=2.weeks) should be trivial.
>
> Inside pass_blame() while we iterate over the parents of the
> suspect we are looking at, you can skip the parent if it is
> older than the age limit, or an ancestor of revision bottom,
> like this:
>
> --- l/builtin-pickaxe.c
> +++ k/builtin-pickaxe.c
> @@ -450,6 +450,12 @@ static void pass_blame(struct scoreboard
> parent = parent->next, parent_ix++) {
> if (parse_commit(parent->item))
> continue;
> +
> + if (parent is older than age limit)
> + continue;
> + if (parent is an ancestor of revision bottom)
> + continue;
> +
> porigin = find_origin(sb, parent->item, origin->path);
> if (!porigin)
> porigin = find_rename(sb, parent->item, origin);
>
This is not quite right. The above code tries to avoid passing
blames to the boundary revisions, so "v2.6.18.." makes every old
line to be blamed on one revision after v2.6.18, which might be
technically correct but not what we want.
Instead, we should pass blame, and then prevent boundary
revisions to pass blame further down. The code in "pu" today
has seen slight restructure of this part and this change should
be easier to implement there, something along the lines of...
--- l/builtin-pickaxe.c
+++ k/builtin-pickaxe.c
@@ -450,8 +450,7 @@ static int pass_blame_to_parent(struct s
}
#define MAXPARENT 16
-static void pass_blame(struct scoreboard *sb, struct origin *origin,
- struct rev_info *revs)
+static void pass_blame(struct scoreboard *sb, struct origin *origin)
{
int i, parent_ix;
struct commit *commit = origin->commit;
@@ -469,10 +468,6 @@ static void pass_blame(struct scoreboard
if (parse_commit(p))
continue;
- if (p->object.flags & UNINTERESTING)
- continue;
- if (revs->max_age != -1 && p->date < revs->max_age)
- continue;
porigin = find_origin(sb, parent->item, origin->path);
if (!porigin)
@@ -516,6 +511,7 @@ static void assign_blame(struct scoreboa
while (1) {
int i;
struct origin *suspect = NULL;
+ struct commit *commit;
/* find one suspect to break down */
for (i = 0; !suspect && i < sb->num_entries; i++) {
@@ -525,7 +521,10 @@ static void assign_blame(struct scoreboa
if (!suspect)
return; /* all done */
- pass_blame(sb, suspect, revs);
+ commit = suspect->commit;
+ if (!(commit->object.flags & UNINTERESTING) &&
+ !(revs->max_age != -1 && commit->date < revs->max_age))
+ pass_blame(sb, suspect, revs);
/* Take responsibility for the remaining entries */
for (i = 0; i < sb->num_entries; i++)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-12 22:20 ` Junio C Hamano
@ 2006-10-13 20:54 ` Luben Tuikov
2006-10-13 21:38 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: Luben Tuikov @ 2006-10-13 20:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
--- Junio C Hamano <junkio@cox.net> wrote:
> Luben Tuikov <ltuikov@yahoo.com> writes:
>
> > Junio, is it possible to also print the "previous" commit?
> > I mean, is it tenable to print the commit such that
> > a "git-diff C B -- A:file" will give a diff of the block of lines
> > we're looking at?
>
> There is no single "previous" in general. Which side of the
> merge would you take?
The parent commit.
For example,
Annotation/blame of A:File
C 1 line 1
2 line 2
D 3 line 3
B 4 line 4
The parent commit of C, such that,
git-diff parent(C) C -- A:File
will give me the diff which introduced
the "first block", or more generally,
all lines annotated with C.
Then when I click on 1 or 2, I'd like to
see Annotation/blame of parent(C):File,
on the line number where the C "block" was
introduced.
Luben
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-13 20:54 ` Luben Tuikov
@ 2006-10-13 21:38 ` Junio C Hamano
2006-10-13 21:59 ` Luben Tuikov
0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 21:38 UTC (permalink / raw)
To: Luben Tuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
> --- Junio C Hamano <junkio@cox.net> wrote:
>> Luben Tuikov <ltuikov@yahoo.com> writes:
>>
>> > Junio, is it possible to also print the "previous" commit?
>> > I mean, is it tenable to print the commit such that
>> > a "git-diff C B -- A:file" will give a diff of the block of lines
>> > we're looking at?
>>
>> There is no single "previous" in general. Which side of the
>> merge would you take?
>
> The parent commit.
There is no single "the parent commit" in general. Which side
of the merge would you take?
Also remeber, when we blame a line to a revision (unless we do
not limit the blame with v2.6.18.. and --since=2.weeks which
only git-pickaxe can do), the line is known to have been
introduced by _that_ commit. If there were a corresponding line
in "the parent commit" for that line, we would not have assigned
the blame to the commit, but the blame would have been passed
down to "the parent commit" already.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-13 21:38 ` Junio C Hamano
@ 2006-10-13 21:59 ` Luben Tuikov
2006-10-13 22:50 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: Luben Tuikov @ 2006-10-13 21:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
--- Junio C Hamano <junkio@cox.net> wrote:
> Luben Tuikov <ltuikov@yahoo.com> writes:
>
> > --- Junio C Hamano <junkio@cox.net> wrote:
> >> Luben Tuikov <ltuikov@yahoo.com> writes:
> >>
> >> > Junio, is it possible to also print the "previous" commit?
> >> > I mean, is it tenable to print the commit such that
> >> > a "git-diff C B -- A:file" will give a diff of the block of lines
> >> > we're looking at?
> >>
> >> There is no single "previous" in general. Which side of the
> >> merge would you take?
> >
> > The parent commit.
>
> There is no single "the parent commit" in general. Which side
> of the merge would you take?
Yes, I realise that...
I guess I'm trying to get to a successful implementation of
the intention of commit 65910395c08e3dc4be685a9a9f60adfa61c89aa5
(later reverted for a good reason).
It is ok if this is not possible. After all, the absolutely
unambiguous way is blame->commit->blame->commit->..., etc,
due to multiple parenting.
> Also remeber, when we blame a line to a revision (unless we do
> not limit the blame with v2.6.18.. and --since=2.weeks which
> only git-pickaxe can do), the line is known to have been
> introduced by _that_ commit.
That is what we want.
(fully agree with your previous comment that we limit _after_
placing blame on a commit...)
> If there were a corresponding line
> in "the parent commit" for that line, we would not have assigned
> the blame to the commit, but the blame would have been passed
> down to "the parent commit" already.
Indeed.
Luben
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] git-pickaxe: blame rewritten.
2006-10-13 21:59 ` Luben Tuikov
@ 2006-10-13 22:50 ` Junio C Hamano
0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2006-10-13 22:50 UTC (permalink / raw)
To: ltuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
>> If there were a corresponding line
>> in "the parent commit" for that line, we would not have assigned
>> the blame to the commit, but the blame would have been passed
>> down to "the parent commit" already.
>
> Indeed.
So is the topic retracted?
Actually, I kind of know what you want. However, I just do not
think blame is the right place to do that.
Suppose you have three lines with this history (for simplicity
the ancestry is linear, time flows from left to right):
Revisions 0 1 2 3 4 5
line 1 a a a A A A
line 2 b b x y B B
line 3 c C C C C C
and we dig from commit 5. The three lines will get blamed to
commit 3, 4 and 1 respectively.
last change lineno data
3 1 A
4 2 B
1 3 C
You click lineno #2 and what you will see is this:
Revisions 0 1 2 3 4
line 1 a a a A A
line 2 b b x y B
line 3 c C C C C
last change lineno data
3 1 A
4 2 B
1 3 C
In such a case, we already know the current commit we are
looking at (commit 4) is what introduced 'B', so there is no
more digging down from that line, but we would like to peel that
'B' to reveal what was behind it (in this case, 'y'). Clicking
line 1 would take you to commit 3 and that happens to show the
line after the line that had 'A' had 'y', but clicking line 3
would take you to commit 1 and you would miss changes that had
'x' or 'y' on the line before the line that had 'C'. So "click
surrounding ones to see how things changed" is not quite the
right answer to the question "Ok, we know 'B' appeared in the
commit that got blamed. What, if anything, was there instead
before that?"
You would need to diff commit 4 and commit 3 (after all, we know
4 introduced 'B' on line 2 so we know rev 3 is already different
and does not have 'B' there) and guess what's on that line, and
if you have more than one parent for commit 4 you would need to
do that for each parent and show that in some human readable
way. This will be expensive but is a useful thing to do "on
demand".
I think "on demand" is really the key word here. We can spend
extra cycles to find that out when we know we are interested in
what, if any, was there where we have 'B' today. Spending that
extra cycle to all lines in the blamed file will end up wasted
if we are using this for "iteratively dig deeper" in gitweb.
That is why I do not think blame is the right place to do this.
What you can do to improve gitweb is to change the URL that each
lineno on the blame page has, so that when it is pointing at the
current commit (i.e. the commit we are looking at is the one
that was blamed for the line), have it run an equivalent of
"git-diff-tree -m -p commit", and guess what line that line that
did not exist in any of the parent (in our example, line 2 that
had 'B') is in the parent, and show 'git-blame' for that file in
that parent around that line.
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2006-10-13 22:51 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-12 8:52 [PATCH] git-pickaxe: blame rewritten Junio C Hamano
2006-10-12 15:58 ` Linus Torvalds
2006-10-12 19:09 ` Junio C Hamano
2006-10-12 20:10 ` Luben Tuikov
2006-10-13 20:31 ` Junio C Hamano
2006-10-13 5:35 ` Junio C Hamano
2006-10-12 19:31 ` Luben Tuikov
2006-10-12 22:20 ` Junio C Hamano
2006-10-13 20:54 ` Luben Tuikov
2006-10-13 21:38 ` Junio C Hamano
2006-10-13 21:59 ` Luben Tuikov
2006-10-13 22:50 ` Junio C Hamano
2006-10-13 0:06 ` Junio C Hamano
2006-10-13 1:04 ` Luben Tuikov
2006-10-13 1:45 ` Junio C Hamano
2006-10-12 21:55 ` Jakub Narebski
2006-10-12 22:47 ` Petr Baudis
2006-10-12 23:35 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).