git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 01/13] parse-options: stop when encounter a non-option
@ 2010-07-11  6:18 Bo Yang
  2010-07-11  6:18 ` [PATCH v3 02/13] parse-options: add two helper functions Bo Yang
                   ` (12 more replies)
  0 siblings, 13 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

We support the syntax like:
-L n1,m1 pathspec1 -L n2,m2 pathspec2.

Make the parse-options API not stop when encounter a
non-option argument, report the status and go on parsing
the remain options.

Thanks-to: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 parse-options.c |    3 ++-
 parse-options.h |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 0fa79bc..cbb49d3 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -374,7 +374,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
 			if (parse_nodash_opt(ctx, arg, options) == 0)
 				continue;
 			if (ctx->flags & PARSE_OPT_STOP_AT_NON_OPTION)
-				break;
+				return PARSE_OPT_NON_OPTION;
 			ctx->out[ctx->cpidx++] = ctx->argv[0];
 			continue;
 		}
@@ -456,6 +456,7 @@ int parse_options(int argc, const char **argv, const char *prefix,
 	switch (parse_options_step(&ctx, options, usagestr)) {
 	case PARSE_OPT_HELP:
 		exit(129);
+	case PARSE_OPT_NON_OPTION:
 	case PARSE_OPT_DONE:
 		break;
 	default: /* PARSE_OPT_UNKNOWN */
diff --git a/parse-options.h b/parse-options.h
index 7435cdb..407697a 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -161,7 +161,8 @@ extern NORETURN void usage_msg_opt(const char *msg,
 enum {
 	PARSE_OPT_HELP = -1,
 	PARSE_OPT_DONE,
-	PARSE_OPT_UNKNOWN
+	PARSE_OPT_NON_OPTION,
+	PARSE_OPT_UNKNOWN,
 };
 
 /*
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 02/13] parse-options: add two helper functions
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 16:45   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 03/13] add the basic data structure for line level history Bo Yang
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

1. parse_options_current: get the current option/argument the API
   now is dealing with;
2. parse_options_next: make the API to deal with the next
   option/argument.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 parse-options.c |   19 +++++++++++++++++++
 parse-options.h |    4 ++++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index cbb49d3..4266bde 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -439,6 +439,25 @@ unknown:
 	return PARSE_OPT_DONE;
 }
 
+const char *parse_options_current(struct parse_opt_ctx_t *ctx)
+{
+	return ctx->argv[0];
+}
+
+int parse_options_next(struct parse_opt_ctx_t *ctx, int retain)
+{
+	if (ctx->argc <= 0)
+		return -1;
+
+	if (retain == 1)
+		ctx->out[ctx->cpidx++] = ctx->argv[0];
+
+	ctx->argc--;
+	ctx->argv++;
+
+	return 0;
+}
+
 int parse_options_end(struct parse_opt_ctx_t *ctx)
 {
 	memmove(ctx->out + ctx->cpidx, ctx->argv, ctx->argc * sizeof(*ctx->out));
diff --git a/parse-options.h b/parse-options.h
index 407697a..4791baa 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -187,6 +187,10 @@ extern int parse_options_step(struct parse_opt_ctx_t *ctx,
 			      const struct option *options,
 			      const char * const usagestr[]);
 
+extern const char *parse_options_current(struct parse_opt_ctx_t *ctx);
+
+extern int parse_options_next(struct parse_opt_ctx_t *ctx, int retain);
+
 extern int parse_options_end(struct parse_opt_ctx_t *ctx);
 
 extern int parse_options_concat(struct option *dst, size_t, struct option *src);
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 03/13] add the basic data structure for line level history
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
  2010-07-11  6:18 ` [PATCH v3 02/13] parse-options: add two helper functions Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 14:16   ` Bo Yang
  2010-07-11  6:18 ` [PATCH v3 04/13] refactor parse_loc Bo Yang
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

'struct diff_line_range' is the main data structure to store
the user interesting line range. There is one 'diff_line_range'
for each file, and there are multiple 'struct range' in each
'diff_line_range'. In this way, we support multiple ranges.

Within 'struct range', there are multiple 'struct print_range'
which represent a diff chunk.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 Makefile   |    2 +
 diffcore.h |    1 +
 line.c     |  456 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 line.h     |  122 ++++++++++++++++
 revision.c |    6 +
 revision.h |    8 +-
 6 files changed, 593 insertions(+), 2 deletions(-)
 create mode 100644 line.c
 create mode 100644 line.h

diff --git a/Makefile b/Makefile
index 1fadd43..0183a21 100644
--- a/Makefile
+++ b/Makefile
@@ -496,6 +496,7 @@ LIB_H += grep.h
 LIB_H += hash.h
 LIB_H += help.h
 LIB_H += levenshtein.h
+LIB_H += line.h
 LIB_H += list-objects.h
 LIB_H += ll-merge.h
 LIB_H += log-tree.h
@@ -582,6 +583,7 @@ LIB_OBJS += help.o
 LIB_OBJS += hex.o
 LIB_OBJS += ident.o
 LIB_OBJS += levenshtein.o
+LIB_OBJS += line.o
 LIB_OBJS += list-objects.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
diff --git a/diffcore.h b/diffcore.h
index 491bea0..13d8e93 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -23,6 +23,7 @@
 #define MINIMUM_BREAK_SIZE     400 /* do not break a file smaller than this */
 
 struct userdiff_driver;
+struct diff_options;
 
 struct diff_filespec {
 	unsigned char sha1[20];
diff --git a/line.c b/line.c
new file mode 100644
index 0000000..8bda022
--- /dev/null
+++ b/line.c
@@ -0,0 +1,456 @@
+#include "line.h"
+#include "cache.h"
+#include "tag.h"
+#include "blob.h"
+#include "tree.h"
+#include "commit.h"
+#include "diff.h"
+#include "decorate.h"
+#include "revision.h"
+#include "xdiff-interface.h"
+#include "strbuf.h"
+#include "log-tree.h"
+
+static void cleanup(struct diff_line_range *r)
+{
+	while (r) {
+		struct diff_line_range *next = r->next;
+		DIFF_LINE_RANGE_CLEAR(r);
+		free(r);
+		r = next;
+	}
+}
+
+static struct object *verify_commit(struct rev_info *revs)
+{
+	struct object *commit = NULL;
+	const char *name = NULL;
+	int i;
+
+	for (i = 0; i < revs->pending.nr; i++) {
+		struct object *obj = revs->pending.objects[i].item;
+		if (obj->flags & UNINTERESTING)
+			continue;
+		while (obj->type == OBJ_TAG)
+			obj = deref_tag(obj, NULL, 0);
+		if (obj->type != OBJ_COMMIT)
+			die("Non commit %s?", revs->pending.objects[i].name);
+		if (commit)
+			die("More than one commit to dig from: %s and %s?",
+			    revs->pending.objects[i].name, name);
+		commit = obj;
+		name = revs->pending.objects[i].name;
+	}
+
+	if (commit == NULL)
+		die("No commit specified?");
+
+	return commit;
+}
+
+static void fill_blob_sha1(struct commit *commit, struct diff_line_range *r)
+{
+	unsigned mode;
+	unsigned char sha1[20];
+
+	while (r) {
+		if (get_tree_entry(commit->object.sha1, r->spec->path,
+			sha1, &mode))
+			goto error;
+		fill_filespec(r->spec, sha1, mode);
+		r = r->next;
+	}
+
+	return;
+error:
+	die("There is no path %s in the commit", r->spec->path);
+}
+
+static void fill_line_ends(struct diff_filespec *spec, long *lines,
+	unsigned long **line_ends)
+{
+	int num = 0, size = 50;
+	long cur = 0;
+	unsigned long *ends = NULL;
+	char *data = NULL;
+
+	if (diff_populate_filespec(spec, 0))
+		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
+
+	ends = xmalloc(size * sizeof(int));
+	ends[cur++] = -1;
+	data = spec->data;
+	while (num < spec->size) {
+		if (data[num] == '\n' || num == spec->size - 1) {
+			ALLOC_GROW(ends, (cur + 1), size);
+			ends[cur++] = num;
+		}
+		num++;
+	}
+
+	/* shrink the array to fit the elements */
+	ends = xrealloc(ends, cur * sizeof(int));
+	*lines = cur;
+	*line_ends = ends;
+}
+
+static const char *nth_line(struct diff_filespec *spec, int line,
+		long lines, unsigned long *line_ends)
+{
+	assert(line < lines);
+	assert(spec && spec->data);
+
+	return (char *)spec->data + line_ends[line] + 1;
+}
+
+/*
+ * copied from blame.c, indeed, we can even to use this to test
+ * whether line log works. :)
+ */
+static const char *parse_loc(const char *spec, struct diff_filespec *file,
+			     long lines, unsigned long *line_ends,
+			     long begin, long *ret)
+{
+	char *term;
+	const char *line;
+	long num;
+	int reg_error;
+	regex_t regexp;
+	regmatch_t match[1];
+
+	/* Allow "-L <something>,+20" to mean starting at <something>
+	 * for 20 lines, or "-L <something>,-5" for 5 lines ending at
+	 * <something>.
+	 */
+	if (1 < begin && (spec[0] == '+' || spec[0] == '-')) {
+		num = strtol(spec + 1, &term, 10);
+		if (term != spec + 1) {
+			if (spec[0] == '-')
+				num = 0 - num;
+			if (0 < num)
+				*ret = begin + num - 2;
+			else if (!num)
+				*ret = begin;
+			else
+				*ret = begin + num;
+			return term;
+		}
+		return spec;
+	}
+	num = strtol(spec, &term, 10);
+	if (term != spec) {
+		*ret = num;
+		return term;
+	}
+	if (spec[0] != '/')
+		return spec;
+
+	/* it could be a regexp of form /.../ */
+	for (term = (char *) spec + 1; *term && *term != '/'; term++) {
+		if (*term == '\\')
+			term++;
+	}
+	if (*term != '/')
+		return spec;
+
+	/* try [spec+1 .. term-1] as regexp */
+	*term = 0;
+	begin--; /* input is in human terms */
+	line = nth_line(file, begin, lines, line_ends);
+
+	if (!(reg_error = regcomp(&regexp, spec + 1, REG_NEWLINE)) &&
+	    !(reg_error = regexec(&regexp, line, 1, match, 0))) {
+		const char *cp = line + match[0].rm_so;
+		const char *nline;
+
+		while (begin++ < lines) {
+			nline = nth_line(file, begin, lines, line_ends);
+			if (line <= cp && cp < nline)
+				break;
+			line = nline;
+		}
+		*ret = begin;
+		regfree(&regexp);
+		*term++ = '/';
+		return term;
+	}
+	else {
+		char errbuf[1024];
+		regerror(reg_error, &regexp, errbuf, 1024);
+		die("-L parameter '%s': %s", spec + 1, errbuf);
+	}
+}
+
+static void parse_range(long lines, unsigned long *line_ends,
+		struct range *r, struct diff_filespec *spec)
+{
+	const char *term;
+
+	term = parse_loc(r->arg, spec, lines, line_ends, 1, &r->start);
+	if (*term == ',') {
+		term = parse_loc(term + 1, spec, lines, line_ends,
+			r->start + 1, &r->end);
+		if (*term) {
+			die("-L parameter's argument should be <start>,<end>");
+		}
+	}
+
+	if (*term) {
+		die("-L parameter's argument should be <start>,<end>");
+	}
+
+	if (r->start > r->end) {
+		long tmp = r->start;
+		r->start = r->end;
+		r->end = tmp;
+	}
+
+	if (r->start < 1)
+		r->start = 1;
+	if (r->end >= lines)
+		r->end = lines - 1;
+}
+
+static void parse_lines(struct commit *commit, struct diff_line_range *r)
+{
+	int i;
+	struct range *old_range = NULL;
+	long lines = 0;
+	unsigned long *ends = NULL;
+
+	while (r) {
+		struct diff_filespec *spec = r->spec;
+		int num = r->nr;
+		assert(spec);
+		fill_blob_sha1(commit, r);
+		old_range = r->ranges;
+		r->ranges = NULL;
+		r->nr = r->alloc = 0;
+		fill_line_ends(spec, &lines, &ends);
+		for (i = 0; i < num; i++) {
+			parse_range(lines, ends, old_range + i, spec);
+			diff_line_range_insert(r, old_range[i].arg,
+				old_range[i].start, old_range[i].end);
+		}
+
+		free(ends);
+		ends = NULL;
+
+		r = r->next;
+		free(old_range);
+	}
+}
+
+/*
+ * Insert a new line range into a diff_line_range struct, and keep the
+ * r->ranges sorted by their starting line number.
+ */
+struct range *diff_line_range_insert(struct diff_line_range *r, const char *arg,
+				int start, int end)
+{
+	int i = 0;
+	struct range *rs = r->ranges;
+	int left_extend = 0, right_extend = 0;
+
+	assert(r != NULL);
+	assert(start <= end);
+
+	if (r->nr == 0 || rs[r->nr - 1].end < start - 1) {
+		DIFF_LINE_RANGE_GROW(r);
+		rs = r->ranges;
+		int num = r->nr - 1;
+		rs[num].arg = arg;
+		rs[num].start = start;
+		rs[num].end = end;
+		return rs + num;
+	}
+
+	for (; i < r->nr; i++) {
+		if (rs[i].end < start - 1)
+			continue;
+		if (rs[i].end == start - 1) {
+			rs[i].end = end;
+			right_extend = 1;
+			goto out;
+		}
+
+		assert(rs[i].end > start - 1);
+		if (rs[i].start <= start) {
+			if (rs[i].end < end) {
+				rs[i].end = end;
+				right_extend = 1;
+			}
+			goto out;
+		} else if (rs[i].start <= end + 1) {
+			rs[i].start = start;
+			left_extend = 1;
+			if (rs[i].end < end) {
+				rs[i].end = end;
+				right_extend = 1;
+			}
+			goto out;
+		} else {
+			int num = r->nr - i;
+			DIFF_LINE_RANGE_GROW(r);
+			rs = r->ranges;
+			memmove(rs + i + 1, rs + i, num * sizeof(struct range));
+			rs[i].arg = arg;
+			rs[i].start = start;
+			rs[i].end = end;
+			goto out;
+		}
+	}
+
+out:
+	assert(r->nr != i);
+	if (left_extend) {
+		int j = i;
+		for (; j > -1; j--) {
+			if (rs[j].end >= rs[i].start - 1)
+				if (rs[j].start < rs[i].start)
+					rs[i].start = rs[j].start;
+		}
+		memmove(rs + j + 1, rs + i, (r->nr - i) * sizeof(struct range));
+		r->nr -= i - j - 1;
+	}
+	if (right_extend) {
+		int j = i;
+		for (; j < r->nr; j++) {
+			if (rs[j].start <= rs[i].end + 1)
+				if (rs[j].end > rs[i].end)
+					rs[i].end = rs[j].end;
+		}
+		if (j < r->nr) {
+			memmove(rs + i + 1, rs + j, (r->nr - j) * sizeof(struct range));
+		}
+		r->nr -= j - i - 1;
+	}
+	assert(r->nr);
+
+	return rs + i;
+}
+
+void diff_line_range_clear(struct diff_line_range *r)
+{
+	int i = 0, zero = 0;
+
+	for (; i < r->nr; i++) {
+		struct range *rg = r->ranges + i;
+		RANGE_CLEAR(rg);
+	}
+
+	if (r->prev) {
+		zero = 0;
+		if (r->prev->count == 1) {
+			zero = 1;
+		}
+		free_filespec(r->prev);
+		if (zero)
+			r->prev = NULL;
+	}
+	if (r->spec) {
+		zero = 0;
+		if (r->spec->count == 1) {
+			zero = 1;
+		}
+		free_filespec(r->spec);
+		if (zero)
+			r->spec = NULL;
+	}
+
+	r->status = '\0';
+	r->alloc = r->nr = 0;
+
+	if (r->ranges)
+		free(r->ranges);
+	r->ranges = NULL;
+	r->next = NULL;
+}
+
+void diff_line_range_append(struct diff_line_range *r, const char *arg)
+{
+	DIFF_LINE_RANGE_GROW(r);
+	r->ranges[r->nr - 1].arg = arg;
+}
+
+struct diff_line_range *diff_line_range_merge(struct diff_line_range *out,
+	struct diff_line_range *other)
+{
+	struct diff_line_range *one = out, *two = other;
+	struct diff_line_range *pone;
+
+	while (one) {
+		struct diff_line_range *ptwo;
+		two = other;
+		ptwo = other;
+		while (two) {
+			if (!strcmp(one->spec->path, two->spec->path)) {
+				int i = 0;
+				for (; i < two->nr; i++) {
+					diff_line_range_insert(one, NULL,
+						two->ranges[i].start,
+						two->ranges[i].end);
+				}
+				if (two == other) {
+					other = other->next;
+				} else {
+					ptwo->next = two->next;
+				}
+				DIFF_LINE_RANGE_CLEAR(two);
+				free(two);
+				two = NULL;
+
+				break;
+			}
+
+			ptwo = two;
+			two = two->next;
+		}
+
+		pone = one;
+		one = one->next;
+	}
+	pone->next = other;
+
+	return out;
+}
+
+void add_line_range(struct rev_info *revs, struct commit *commit, struct diff_line_range *r)
+{
+	struct diff_line_range *ret = NULL;
+
+	if (r != NULL) {
+		ret = lookup_decoration(&revs->line_range, &commit->object);
+		if (ret != NULL) {
+			diff_line_range_merge(ret, r);
+		} else {
+			add_decoration(&revs->line_range, &commit->object, r);
+		}
+		commit->object.flags |= RANGE_UPDATE;
+	}
+}
+
+struct diff_line_range *lookup_line_range(struct rev_info *revs, struct commit *commit)
+{
+	struct diff_line_range *ret = NULL;
+
+	ret = lookup_decoration(&revs->line_range, &commit->object);
+	return ret;
+}
+
+void setup_line(struct rev_info *rev, struct diff_line_range *r)
+{
+	struct commit *commit = NULL;
+	struct diff_options *opt = &rev->diffopt;
+
+	commit = (struct commit *)verify_commit(rev);
+	parse_lines(commit, r);
+
+	add_line_range(rev, commit, r);
+	/*
+	 * Note we support -M/-C to detect file rename
+	 */
+	opt->nr_paths = 0;
+	diff_tree_release_paths(opt);
+}
+
diff --git a/line.h b/line.h
new file mode 100644
index 0000000..a2f8545
--- /dev/null
+++ b/line.h
@@ -0,0 +1,122 @@
+#ifndef LINE_H
+#define LINE_H
+
+#include "diffcore.h"
+
+struct rev_info;
+struct commit;
+struct diff_line_range;
+struct diff_options;
+
+struct print_range {
+	int start, end;
+	int pstart, pend;
+	int line_added : 1;
+};
+
+struct print_pair {
+	int alloc, nr;
+	struct print_range *ranges;
+};
+
+#define PRINT_RANGE_INIT(r) \
+	do { \
+		(r)->line_added = 0; \
+	} while(0);
+
+#define PRINT_PAIR_INIT(p) \
+	do { \
+		(p)->alloc = (p)->nr = 0; \
+		(p)->ranges = NULL; \
+	} while(0);
+
+#define PRINT_PAIR_GROW(p) \
+	do { \
+		(p)->nr ++; \
+		ALLOC_GROW((p)->ranges, (p)->nr, (p)->alloc); \
+	} while (0);
+
+#define PRINT_PAIR_CLEAR(p) \
+	do { \
+		(p)->alloc = (p)->nr = 0; \
+		if ((p)->ranges) \
+			free((p)->ranges); \
+		(p)->ranges = NULL; \
+	} while(0);
+
+struct range {
+	const char *arg;	//The argument to specify this line range
+	long start, end;	//The start line number, inclusive
+	long pstart, pend;	//The end line number, inclusive
+	struct print_pair pair;
+			//The changed lines inside this range
+	unsigned int diff:1;
+};
+
+struct diff_line_range {
+	struct diff_filespec *prev;
+	struct diff_filespec *spec;
+	char status;
+	int alloc;
+	int nr;
+	struct range *ranges;
+	unsigned int	touch:1,
+			diff:1;
+	struct diff_line_range *next;
+};
+
+#define RANGE_INIT(r) \
+	do { \
+		(r)->arg = NULL; \
+		(r)->start = (r)->end = 0; \
+		(r)->pstart = (r)->pend = 0; \
+		PRINT_PAIR_INIT(&((r)->pair)); \
+		(r)->diff = 0; \
+	} while (0);
+
+#define RANGE_CLEAR(r) \
+	do { \
+		(r)->arg = NULL; \
+		(r)->start = (r)->end = 0; \
+		(r)->pstart = (r)->pend = 0; \
+		PRINT_PAIR_CLEAR(&r->pair); \
+		(r)->diff = 0; \
+	} while(0);
+
+#define DIFF_LINE_RANGE_INIT(r) \
+	do { \
+		(r)->prev = (r)->spec = NULL; \
+		(r)->status = '\0'; \
+		(r)->alloc = (r)->nr = 0; \
+		(r)->ranges = NULL; \
+		(r)->next = NULL; \
+		(r)->touch = 0; \
+		(r)->diff = 0; \
+	} while(0);
+
+#define DIFF_LINE_RANGE_GROW(r) \
+	do { \
+		(r)->nr ++; \
+		ALLOC_GROW((r)->ranges, (r)->nr, (r)->alloc); \
+		RANGE_INIT(((r)->ranges + (r)->nr - 1)); \
+	} while(0);
+
+#define DIFF_LINE_RANGE_CLEAR(r) \
+	diff_line_range_clear((r));
+
+extern struct range *diff_line_range_insert(struct diff_line_range *r, const char *arg,
+				int start, int end);
+extern void diff_line_range_append(struct diff_line_range *r, const char *arg);
+
+extern void diff_line_range_clear(struct diff_line_range *r);
+
+extern struct diff_line_range *diff_line_range_merge(struct diff_line_range *out,
+	struct diff_line_range *other);
+
+extern void setup_line(struct rev_info *rev, struct diff_line_range *r);
+
+extern void add_line_range(struct rev_info *revs, struct commit *commit, struct diff_line_range *r);
+
+extern struct diff_line_range *lookup_line_range(struct rev_info *revs, struct commit *commit);
+
+#endif
diff --git a/revision.c b/revision.c
index 7e82efd..63f37ea 100644
--- a/revision.c
+++ b/revision.c
@@ -1637,6 +1637,12 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
 	revs->diffopt.abbrev = revs->abbrev;
+
+	if (revs->line) {
+		revs->limited = 1;
+		revs->topo_order = 1;
+	}
+
 	if (diff_setup_done(&revs->diffopt) < 0)
 		die("diff_setup_done failed");
 
diff --git a/revision.h b/revision.h
index 36fdf22..55836e0 100644
--- a/revision.h
+++ b/revision.h
@@ -14,7 +14,8 @@
 #define CHILD_SHOWN	(1u<<6)
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
 #define SYMMETRIC_LEFT	(1u<<8)
-#define ALL_REV_FLAGS	((1u<<9)-1)
+#define RANGE_UPDATE	(1u<<9) /* for line level traverse */
+#define ALL_REV_FLAGS	((1u<<10)-1)
 
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -68,7 +69,8 @@ struct rev_info {
 			cherry_pick:1,
 			bisect:1,
 			ancestry_path:1,
-			first_parent_only:1;
+			first_parent_only:1,
+			line:1;
 
 	/* Diff flags */
 	unsigned int	diff:1,
@@ -137,6 +139,8 @@ struct rev_info {
 	/* commit counts */
 	int count_left;
 	int count_right;
+	/* line-level intersting range */
+	struct decoration line_range;
 };
 
 #define REV_TREE_SAME		0
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 04/13] refactor parse_loc
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
  2010-07-11  6:18 ` [PATCH v3 02/13] parse-options: add two helper functions Bo Yang
  2010-07-11  6:18 ` [PATCH v3 03/13] add the basic data structure for line level history Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 18:32   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 05/13] parse the -L options Bo Yang
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

Both 'git blame -L' and 'git log -L' parse the same style
of line number arguments, so put the 'parse_loc' function
to line.c and export it.
The caller of parse_loc should provide a callback function
which is used to calculate the nth line start position.
Other parts such as regexp search, line number parsing are
abstracted and re-used.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 builtin/blame.c |   89 +++++-------------------------------------------------
 line.c          |   35 ++++++++++++---------
 line.h          |    5 +++
 3 files changed, 34 insertions(+), 95 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 01e62fd..17b71cd 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -21,6 +21,7 @@
 #include "parse-options.h"
 #include "utf8.h"
 #include "userdiff.h"
+#include "line.h"
 
 static char blame_usage[] = "git blame [options] [rev-opts] [rev] [--] file";
 
@@ -541,11 +542,16 @@ static void dup_entry(struct blame_entry *dst, struct blame_entry *src)
 	dst->score = 0;
 }
 
-static const char *nth_line(struct scoreboard *sb, int lno)
+static const char *nth_line(struct scoreboard *sb, long lno)
 {
 	return sb->final_buf + sb->lineno[lno];
 }
 
+static const char *nth_line_cb(void *data, long lno)
+{
+	return nth_line((struct scoreboard *)data, lno);
+}
+
 /*
  * It is known that lines between tlno to same came from parent, and e
  * has an overlap with that range.  it also is known that parent's
@@ -1907,83 +1913,6 @@ static const char *add_prefix(const char *prefix, const char *path)
 }
 
 /*
- * Parsing of (comma separated) one item in the -L option
- */
-static const char *parse_loc(const char *spec,
-			     struct scoreboard *sb, long lno,
-			     long begin, long *ret)
-{
-	char *term;
-	const char *line;
-	long num;
-	int reg_error;
-	regex_t regexp;
-	regmatch_t match[1];
-
-	/* Allow "-L <something>,+20" to mean starting at <something>
-	 * for 20 lines, or "-L <something>,-5" for 5 lines ending at
-	 * <something>.
-	 */
-	if (1 < begin && (spec[0] == '+' || spec[0] == '-')) {
-		num = strtol(spec + 1, &term, 10);
-		if (term != spec + 1) {
-			if (spec[0] == '-')
-				num = 0 - num;
-			if (0 < num)
-				*ret = begin + num - 2;
-			else if (!num)
-				*ret = begin;
-			else
-				*ret = begin + num;
-			return term;
-		}
-		return spec;
-	}
-	num = strtol(spec, &term, 10);
-	if (term != spec) {
-		*ret = num;
-		return term;
-	}
-	if (spec[0] != '/')
-		return spec;
-
-	/* it could be a regexp of form /.../ */
-	for (term = (char *) spec + 1; *term && *term != '/'; term++) {
-		if (*term == '\\')
-			term++;
-	}
-	if (*term != '/')
-		return spec;
-
-	/* try [spec+1 .. term-1] as regexp */
-	*term = 0;
-	begin--; /* input is in human terms */
-	line = nth_line(sb, begin);
-
-	if (!(reg_error = regcomp(&regexp, spec + 1, REG_NEWLINE)) &&
-	    !(reg_error = regexec(&regexp, line, 1, match, 0))) {
-		const char *cp = line + match[0].rm_so;
-		const char *nline;
-
-		while (begin++ < lno) {
-			nline = nth_line(sb, begin);
-			if (line <= cp && cp < nline)
-				break;
-			line = nline;
-		}
-		*ret = begin;
-		regfree(&regexp);
-		*term++ = '/';
-		return term;
-	}
-	else {
-		char errbuf[1024];
-		regerror(reg_error, &regexp, errbuf, 1024);
-		die("-L parameter '%s': %s", spec + 1, errbuf);
-	}
-}
-
-/*
  * Parsing of -L option
  */
 static void prepare_blame_range(struct scoreboard *sb,
@@ -1993,9 +1922,9 @@ static void prepare_blame_range(struct scoreboard *sb,
 {
 	const char *term;
 
-	term = parse_loc(bottomtop, sb, lno, 1, bottom);
+	term = parse_loc(bottomtop, nth_line_cb, sb, lno, 1, bottom);
 	if (*term == ',') {
-		term = parse_loc(term + 1, sb, lno, *bottom + 1, top);
+		term = parse_loc(term + 1, nth_line_cb, sb, lno, *bottom + 1, top);
 		if (*term)
 			usage(blame_usage);
 	}
diff --git a/line.c b/line.c
index 8bda022..a6bcbdf 100644
--- a/line.c
+++ b/line.c
@@ -94,22 +94,26 @@ static void fill_line_ends(struct diff_filespec *spec, long *lines,
 	*line_ends = ends;
 }
 
-static const char *nth_line(struct diff_filespec *spec, int line,
-		long lines, unsigned long *line_ends)
+struct nth_line_cb {
+	struct diff_filespec *spec;
+	long lines;
+	unsigned long *line_ends;
+};
+
+static const char *nth_line(void *data, long line)
 {
-	assert(line < lines);
-	assert(spec && spec->data);
+	struct nth_line_cb *d = data;
+	assert(d && line < d->lines);
+	assert(d->spec && d->spec->data);
 
-	return (char *)spec->data + line_ends[line] + 1;
+	return (char *)d->spec->data + d->line_ends[line] + 1;
 }
 
 /*
- * copied from blame.c, indeed, we can even to use this to test
- * whether line log works. :)
+ * Parsing of (comma separated) one item in the -L option
  */
-static const char *parse_loc(const char *spec, struct diff_filespec *file,
-			     long lines, unsigned long *line_ends,
-			     long begin, long *ret)
+const char *parse_loc(const char *spec, nth_line_fn_t nth_line,
+		void *data, long lines, long begin, long *ret)
 {
 	char *term;
 	const char *line;
@@ -156,7 +160,7 @@ static const char *parse_loc(const char *spec, struct diff_filespec *file,
 	/* try [spec+1 .. term-1] as regexp */
 	*term = 0;
 	begin--; /* input is in human terms */
-	line = nth_line(file, begin, lines, line_ends);
+	line = nth_line(data, begin);
 
 	if (!(reg_error = regcomp(&regexp, spec + 1, REG_NEWLINE)) &&
 	    !(reg_error = regexec(&regexp, line, 1, match, 0))) {
@@ -164,7 +168,7 @@ static const char *parse_loc(const char *spec, struct diff_filespec *file,
 		const char *nline;
 
 		while (begin++ < lines) {
-			nline = nth_line(file, begin, lines, line_ends);
+			nline = nth_line(data, begin);
 			if (line <= cp && cp < nline)
 				break;
 			line = nline;
@@ -185,11 +189,12 @@ static void parse_range(long lines, unsigned long *line_ends,
 		struct range *r, struct diff_filespec *spec)
 {
 	const char *term;
+	struct nth_line_cb data = {spec, lines, line_ends};
 
-	term = parse_loc(r->arg, spec, lines, line_ends, 1, &r->start);
+	term = parse_loc(r->arg, nth_line, &data, lines, 1, &r->start);
 	if (*term == ',') {
-		term = parse_loc(term + 1, spec, lines, line_ends,
-			r->start + 1, &r->end);
+		term = parse_loc(term + 1, nth_line, &data, lines,
+				r->start + 1, &r->end);
 		if (*term) {
 			die("-L parameter's argument should be <start>,<end>");
 		}
diff --git a/line.h b/line.h
index a2f8545..0eb03bd 100644
--- a/line.h
+++ b/line.h
@@ -8,6 +8,8 @@ struct commit;
 struct diff_line_range;
 struct diff_options;
 
+typedef const char *(*nth_line_fn_t)(void *data, long lno);
+
 struct print_range {
 	int start, end;
 	int pstart, pend;
@@ -119,4 +121,7 @@ extern void add_line_range(struct rev_info *revs, struct commit *commit, struct
 
 extern struct diff_line_range *lookup_line_range(struct rev_info *revs, struct commit *commit);
 
+const char *parse_loc(const char *spec, nth_line_fn_t nth_line,
+		void *data, long lines, long begin, long *ret);
+
 #endif
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 05/13] parse the -L options
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (2 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 04/13] refactor parse_loc Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-13 23:03   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 06/13] export three functions from diff.c Bo Yang
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

With the two new APIs of parse options added in the previous
commit, we parse the multiple '-L n,m <file>' syntax.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 builtin/log.c |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 101 insertions(+), 1 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 08b8722..1e90b03 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -19,6 +19,7 @@
 #include "remote.h"
 #include "string-list.h"
 #include "parse-options.h"
+#include "line.h"
 
 /* Set a default date-time format for git log ("log.date" config variable) */
 static const char *default_date_mode = NULL;
@@ -27,11 +28,24 @@ static int default_show_root = 1;
 static int decoration_style;
 static const char *fmt_patch_subject_prefix = "PATCH";
 static const char *fmt_pretty;
+static const char *dashdash = "--";
 
-static const char * const builtin_log_usage =
+static char builtin_log_usage[] =
 	"git log [<options>] [<since>..<until>] [[--] <path>...]\n"
+	"git log [<options>] -L n,m <path>\n"
 	"   or: git show [options] <object>...";
 
+static const char *log_opt_usage[] = {
+	builtin_log_usage,
+	NULL
+};
+
+struct line_opt_callback_data {
+	struct diff_line_range **range;
+	struct parse_opt_ctx_t *ctx;
+	struct rev_info *rev;
+};
+
 static int parse_decoration_style(const char *var, const char *value)
 {
 	switch (git_config_maybe_bool(var, value)) {
@@ -49,12 +63,41 @@ static int parse_decoration_style(const char *var, const char *value)
 	return -1;
 }
 
+static int log_line_range_callback(const struct option *option, const char *arg, int unset)
+{
+	struct line_opt_callback_data *data = option->value;
+	struct diff_line_range *r = *data->range;
+	struct parse_opt_ctx_t *ctx = data->ctx;
+	if (!arg)
+		return -1;
+
+	if (r->nr == 0 && r->next == NULL) {
+		ctx->out[ctx->cpidx++] = dashdash;
+	}
+
+	diff_line_range_append(r, arg);
+	data->rev->line = 1;
+	return 0;
+}
+
 static void cmd_log_init(int argc, const char **argv, const char *prefix,
 			 struct rev_info *rev, struct setup_revision_opt *opt)
 {
 	int i;
 	int decoration_given = 0;
 	struct userformat_want w;
+	const char *path = NULL, *pathspec = NULL;
+	static struct diff_line_range *range = NULL;
+	static struct parse_opt_ctx_t ctx;
+	static struct line_opt_callback_data line_cb = {&range, &ctx, NULL};
+	static const struct option options[] = {
+		OPT_CALLBACK('L', NULL, &line_cb, "n,m", "Process only line range n,m, counting from 1", log_line_range_callback),
+		OPT_END()
+	};
+
+	line_cb.rev = rev;
+	range = xmalloc(sizeof(*range));
+	DIFF_LINE_RANGE_INIT(range);
 
 	rev->abbrev = DEFAULT_ABBREV;
 	rev->commit_format = CMIT_FMT_DEFAULT;
@@ -75,6 +118,58 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 	 */
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage(builtin_log_usage);
+
+	parse_options_start(&ctx, argc, argv, prefix, PARSE_OPT_KEEP_DASHDASH |
+			PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_STOP_AT_NON_OPTION);
+	for (;;) {
+		switch (parse_options_step(&ctx, options, log_opt_usage)) {
+		case PARSE_OPT_HELP:
+			exit(129);
+		case PARSE_OPT_DONE:
+			goto parse_done;
+		case PARSE_OPT_NON_OPTION:
+			path = parse_options_current(&ctx);
+			pathspec = prefix_path(prefix, prefix ? strlen(prefix) : 0, path);
+			range->spec = alloc_filespec(pathspec);
+			free((void *)pathspec);
+			if (range->nr == 0) {
+				if(range->next) {
+					die("Path %s need a -L <range> option\n"
+					"If you want follow the history of the whole file "
+					"whether to using 'git log' without -L or using "
+					"'git log -L 1,$ <path>'", range->spec->path);
+				} else {
+					parse_options_next(&ctx, 1);
+					continue;
+				}
+			}
+			struct diff_line_range *r = xmalloc(sizeof(*r));
+			DIFF_LINE_RANGE_INIT(r);
+			r->next = range;
+			range = r;
+			parse_options_next(&ctx, 1);
+			continue;
+		case PARSE_OPT_UNKNOWN:
+			parse_options_next(&ctx, 1);
+			continue;
+		}
+
+		parse_revision_opt(rev, &ctx, options, log_opt_usage);
+	}
+parse_done:
+	argc = parse_options_end(&ctx);
+
+	/* die if '-L <range>' with no pathspec follow */
+	if (range->nr > 0 && range->spec == NULL) {
+		die("Each -L should follow a pathspec");
+	}
+	/* clear up the last range */
+	if (range->nr == 0) {
+		struct diff_line_range *r = range->next;
+		DIFF_LINE_RANGE_CLEAR(range);
+		range = r;
+	}
+
 	argc = setup_revisions(argc, argv, rev, opt);
 
 	memset(&w, 0, sizeof(w));
@@ -125,6 +220,11 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 		rev->show_decorations = 1;
 		load_ref_decorations(decoration_style);
 	}
+
+	/* Test whether line level history is asked for */
+	if (range && range->nr > 0) {
+		setup_line(rev, range);
+	}
 }
 
 /*
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 06/13] export three functions from diff.c
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (3 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 05/13] parse the -L options Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-11  6:18 ` [PATCH v3 07/13] add range clone functions Bo Yang
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

Use fill_metainfo to fill the line level diff meta data,
emit_line to print out a line and quote_two to quote
paths.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 diff.c |    6 +++---
 diff.h |   17 +++++++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 3aa695d..d685f5a 100644
--- a/diff.c
+++ b/diff.c
@@ -144,7 +144,7 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 	return git_color_default_config(var, value, cb);
 }
 
-static char *quote_two(const char *one, const char *two)
+char *quote_two(const char *one, const char *two)
 {
 	int need_one = quote_c_style(one, NULL, NULL, 1);
 	int need_two = quote_c_style(two, NULL, NULL, 1);
@@ -325,7 +325,7 @@ static void emit_line_0(struct diff_options *o, const char *set, const char *res
 		fputc('\n', file);
 }
 
-static void emit_line(struct diff_options *o, const char *set, const char *reset,
+void emit_line(struct diff_options *o, const char *set, const char *reset,
 		      const char *line, int len)
 {
 	emit_line_0(o, set, reset, line[0], line+1, len-1);
@@ -2564,7 +2564,7 @@ static int similarity_index(struct diff_filepair *p)
 	return p->score * 100 / MAX_SCORE;
 }
 
-static void fill_metainfo(struct strbuf *msg,
+void fill_metainfo(struct strbuf *msg,
 			  const char *name,
 			  const char *other,
 			  struct diff_filespec *one,
diff --git a/diff.h b/diff.h
index 063d10a..9676ab9 100644
--- a/diff.h
+++ b/diff.h
@@ -12,6 +12,7 @@ struct diff_queue_struct;
 struct strbuf;
 struct diff_filespec;
 struct userdiff_driver;
+struct diff_filepair;
 
 typedef void (*change_fn_t)(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
@@ -301,4 +302,20 @@ extern size_t fill_textconv(struct userdiff_driver *driver,
 
 extern struct userdiff_driver *get_textconv(struct diff_filespec *one);
 
+/* some output functions line.c need */
+extern void fill_metainfo(struct strbuf *msg,
+			  const char *name,
+			  const char *other,
+			  struct diff_filespec *one,
+			  struct diff_filespec *two,
+			  struct diff_options *o,
+			  struct diff_filepair *p,
+			  int *must_show_header,
+			  int use_color);
+
+extern void emit_line(struct diff_options *o, const char *set, const char *reset,
+		      const char *line, int len);
+
+extern char *quote_two(const char *one, const char *two);
+
 #endif /* DIFF_H */
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 07/13] add range clone functions
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (4 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 06/13] export three functions from diff.c Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 21:52   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 08/13] map/take range to parent Bo Yang
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

Both single range clone and deeply clone are supported.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 line.c |   39 +++++++++++++++++++++++++++++++++++++++
 line.h |    4 ++++
 2 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/line.c b/line.c
index a6bcbdf..ba96d7d 100644
--- a/line.c
+++ b/line.c
@@ -378,6 +378,45 @@ void diff_line_range_append(struct diff_line_range *r, const char *arg)
 	r->ranges[r->nr - 1].arg = arg;
 }
 
+struct diff_line_range *diff_line_range_clone(struct diff_line_range *r)
+{
+	struct diff_line_range *ret = xmalloc(sizeof(*ret));
+	int i = 0;
+
+	DIFF_LINE_RANGE_INIT(ret);
+	ret->ranges = xcalloc(r->nr, sizeof(struct range));
+	memcpy(ret->ranges, r->ranges, sizeof(struct range) * r->nr);
+
+	ret->alloc = ret->nr = r->nr;
+
+	for (; i < ret->nr; i++)
+		PRINT_PAIR_INIT(&ret->ranges[i].pair);
+
+	ret->spec = r->spec;
+	assert(ret->spec);
+	ret->spec->count++;
+
+	return ret;
+}
+
+struct diff_line_range *diff_line_range_clone_deeply(struct diff_line_range *r)
+{
+	struct diff_line_range *ret = NULL;
+	struct diff_line_range *tmp = NULL, *prev = NULL;
+
+	assert(r);
+	ret = tmp = prev = diff_line_range_clone(r);
+	r = r->next;
+	while (r) {
+		tmp = diff_line_range_clone(r);
+		prev->next = tmp;
+		prev = tmp;
+		r = r->next;
+	}
+
+	return ret;
+}
+
 struct diff_line_range *diff_line_range_merge(struct diff_line_range *out,
 	struct diff_line_range *other)
 {
diff --git a/line.h b/line.h
index 0eb03bd..885e985 100644
--- a/line.h
+++ b/line.h
@@ -112,6 +112,10 @@ extern void diff_line_range_append(struct diff_line_range *r, const char *arg);
 
 extern void diff_line_range_clear(struct diff_line_range *r);
 
+extern struct diff_line_range *diff_line_range_clone(struct diff_line_range *r);
+
+extern struct diff_line_range *diff_line_range_clone_deeply(struct diff_line_range *r);
+
 extern struct diff_line_range *diff_line_range_merge(struct diff_line_range *out,
 	struct diff_line_range *other);
 
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 08/13] map/take range to parent
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (5 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 07/13] add range clone functions Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 21:52   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 09/13] print the line log Bo Yang
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

The interesting line range will be passed to all of its parents.
For non-merge commit, we just map_range the ranges. Generally, the
algorithm do:

1. Run a diffcore_std to find out the pairs;
2. Run a xdi_diff_hunks on each interesting file pair;
3. The map_range_cb callback will be invoked for each diff hunk,
   and in the function map_lines we will calculate the pre-image
   range from the post-image range.

For merge commit, another take_range pass will be done except the
above normal map_range work. It is used to subtract each same
part lines of the current range out. After this pass, if there is
any line range left, this means the merge must be a non-trivial
merge. This is how the non-trivial merge detect work.

The algorithm that map lines from post-image to pre-image is in
the function map_lines. Generally, we use simple line number
calculation method to do the map.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 line.c     |  453 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 revision.h |    5 +-
 2 files changed, 457 insertions(+), 1 deletions(-)

diff --git a/line.c b/line.c
index ba96d7d..e1af2f6 100644
--- a/line.c
+++ b/line.c
@@ -498,3 +498,456 @@ void setup_line(struct rev_info *rev, struct diff_line_range *r)
 	diff_tree_release_paths(opt);
 }
 
+struct take_range_cb_data {
+	struct diff_line_range *interesting;
+	struct diff_line_range *range;
+	long plno, tlno;
+	int diff;
+};
+
+#define SCALE_FACTOR 4
+void map_lines(long p_start, long p_end, long t_start, long t_end,
+		long start, long end, long *o_start, long *o_end)
+{
+	/*
+	 * xdiff always assign 'same' with the last line number of two
+	 * same ranges. When some lines are added from scratch,
+	 * p_start = same + 1;
+	 * p_end = same;
+	 * so, we can understand the following condition.
+	 */
+	if (p_start > p_end) {
+		*o_start = *o_end = 0;
+		return;
+	}
+	/* A deletion */
+	if (t_start > t_end) {
+		*o_start = p_start;
+		*o_end = p_end;
+		return;
+	}
+
+	if (start == t_start && end == t_end) {
+		*o_start = p_start;
+		*o_end = p_end;
+		return;
+	}
+
+	/*
+	 * A new strategy for lines mapping:
+	 *
+	 * When the pre-image is no more than 1/4 of the post-image, it
+	 * take no sense to say what the real pre-image is, wan can just
+	 * treat all the post-image lines as added from scratch.
+	 */
+	if (SCALE_FACTOR * (p_end - p_start + 1) < (t_end - t_start + 1)) {
+		*o_start = *o_end = 0;
+		return;
+	}
+
+	*o_start = p_start + start - t_start;
+	*o_end = p_end - (t_end - end);
+
+	if (*o_start > *o_end) {
+		int temp = *o_start;
+		*o_start = *o_end;
+		*o_end = temp;
+	}
+
+	if (*o_start < p_start)
+		*o_start = p_start;
+	if (*o_end > p_end)
+		*o_end = p_end;
+}
+
+static void map_range(struct take_range_cb_data *data, int same,
+		long p_start, long p_end, long t_start, long t_end)
+{
+	struct range *ranges = data->interesting->ranges;
+	long takens, takene, start, end;
+	int i = 0, out = 0, added = 0;
+	long op_start = p_start, op_end = p_end, ot_start = t_start, ot_end = t_end;
+
+	for (; i < data->interesting->nr; i++) {
+		added = 0;
+		if (t_start > ranges[i].end)
+			continue;
+		if (t_end < ranges[i].start)
+			break;
+
+		if (t_start > ranges[i].start) {
+			start = t_start;
+			takens = p_start;
+			if (t_end >= ranges[i].end) {
+				end = ranges[i].end;
+				takene = p_start + end - t_start;
+			} else {
+				end = t_end;
+				takene = p_end;
+				out = 1;
+			}
+		} else {
+			start = ranges[i].start;
+			takens = p_start + start - t_start;
+			if (t_end >= ranges[i].end) {
+				end = ranges[i].end;
+				takene = p_start + end - t_start;
+			} else {
+				end = t_end;
+				takene = p_end;
+				out = 1;
+			}
+		}
+
+		if (!same) {
+			struct print_pair *pair = &ranges[i].pair;
+			PRINT_PAIR_GROW(pair);
+			struct print_range *rr = pair->ranges + pair->nr - 1;
+			PRINT_RANGE_INIT(rr);
+			rr->start = start;
+			rr->end = end;
+			map_lines(op_start, op_end, ot_start, ot_end, start, end,
+					&takens, &takene);
+			if (takens == 0 && takene == 0) {
+				added = 1;
+				rr->line_added = 1;
+			}
+			rr->pstart = takens;
+			rr->pend = takene;
+			data->diff = 1;
+			data->interesting->diff = 1;
+			ranges[i].diff = 1;
+		}
+		if (added) {
+			/* Code movement/copy detect here */
+		} else {
+			struct range *added_range = diff_line_range_insert(data->range,
+					NULL, takens, takene);
+			assert(added_range);
+			ranges[i].pstart = added_range->start;
+			ranges[i].pend = added_range->end;
+		}
+
+		t_start = end + 1;
+		p_start = takene + 1;
+
+		if (out)
+			break;
+	}
+}
+
+static void take_range(struct take_range_cb_data *data,
+		long p_start, long p_end, long t_start, long t_end)
+{
+	struct range *ranges = data->interesting->ranges;
+	long takens, takene, start, end;
+	int i = 0, out = 0, added = 0;
+
+	for (; i < data->interesting->nr; i++) {
+		added = 0;
+		if (t_start > ranges[i].end)
+			continue;
+		if (t_end < ranges[i].start)
+			break;
+
+		if (t_start > ranges[i].start) {
+			long tmp = ranges[i].end;
+			ranges[i].end = t_start - 1;
+			start = t_start;
+			takens = p_start;
+			if (t_end >= tmp) {
+				end = tmp;
+				takene = p_start + end - t_start;
+				p_start = takene + 1;
+				t_start = end + 1;
+			} else {
+				end = t_end;
+				takene = p_end;
+				diff_line_range_insert(data->interesting, NULL,
+					t_end + 1, tmp);
+				out = 1;
+			}
+		} else {
+			start = ranges[i].start;
+			takens = p_start + start - t_start;
+			if (t_end >= ranges[i].end) {
+				int num = data->interesting->nr - 1;
+				end = ranges[i].end;
+				takene = p_start + end - t_start;
+				t_start = end + 1;
+				p_start = takene + 1;
+				memmove(ranges + i, ranges + i + 1, (num - i) * sizeof(*ranges));
+				data->interesting->nr = num;
+				i--;
+			} else {
+				end = t_end;
+				takene = p_end;
+				ranges[i].start = t_end + 1;
+				out = 1;
+			}
+		}
+
+		diff_line_range_insert(data->range, NULL, takens, takene);
+
+		if (out)
+			break;
+	}
+}
+
+static void take_range_cb(void *data, long same, long p_next, long t_next)
+{
+	struct take_range_cb_data *d = data;
+	long p_start = d->plno + 1, t_start = d->tlno + 1;
+	long p_end = p_start + same - t_start, t_end = same;
+
+	/* If one file is added from scratch, this may confuse take_range */
+	if (t_end >= t_start)
+		take_range(d, p_start, p_end, t_start, t_end);
+	d->plno = p_next;
+	d->tlno = t_next;
+}
+
+static void map_range_cb(void *data, long same, long p_next, long t_next)
+{
+	struct take_range_cb_data *d = data;
+
+	long p_start = d->plno + 1;
+	long t_start = d->tlno + 1;
+	long p_end = same - t_start + p_start;
+	long t_end = same;
+
+	/* Firstly, take the unchanged lines from child */
+	if (t_end >= t_start)
+		map_range(d, 1, p_start, p_end, t_start, t_end);
+
+	/* find out which lines to print */
+	t_start = same + 1;
+	p_start = d->plno + t_start - d->tlno;
+	map_range(d, 0, p_start, p_next, t_start, t_next);
+
+	d->plno = p_next;
+	d->tlno = t_next;
+}
+
+static void assign_range_to_parent(struct rev_info *rev, struct commit *c,
+		struct commit *p, struct diff_line_range *r,
+		struct diff_options *opt, int map)
+{
+	struct diff_line_range *rr = xmalloc(sizeof(*rr));
+	struct diff_line_range *cr = rr, *prev_r = rr;
+	struct tree_desc desc1, desc2;
+	void *tree1 = NULL, *tree2 = NULL;
+	unsigned long size1, size2;
+	struct diff_queue_struct *queue;
+	struct take_range_cb_data cb = {NULL, cr, 0, 0};
+	xpparam_t xpp;
+	xdemitconf_t xecfg;
+	int i, diff = 0;
+	xdiff_emit_hunk_consume_fn fn = map ? map_range_cb : take_range_cb;
+
+	DIFF_LINE_RANGE_INIT(cr);
+	memset(&xpp, 0, sizeof(xpp));
+	memset(&xecfg, 0, sizeof(xecfg));
+	xpp.flags = XDF_NEED_MINIMAL;
+	xecfg.ctxlen = xecfg.interhunkctxlen = 0;
+
+	/*
+	 * Compose up two trees, for root commit, we make up a empty tree.
+	 */
+	assert(c);
+	tree2 = read_object_with_reference(c->tree->object.sha1, "tree", &size2, NULL);
+	if (tree2 == NULL)
+		die("Unable to read tree (%s)", sha1_to_hex(c->tree->object.sha1));
+	init_tree_desc(&desc2, tree2, size2);
+	if (p) {
+		tree1 = read_object_with_reference(p->tree->object.sha1, "tree", &size1, NULL);
+		if (tree1 == NULL)
+			die("Unable to read tree (%s)", sha1_to_hex(p->tree->object.sha1));
+		init_tree_desc(&desc1, tree1, size1);
+	} else {
+		init_tree_desc(&desc1, "", 0);
+	}
+
+	DIFF_QUEUE_CLEAR(&diff_queued_diff);
+	diff_tree(&desc1, &desc2, "", opt);
+	diffcore_std(opt);
+
+	queue = &diff_queued_diff;
+	for (i = 0; i < queue->nr; i++) {
+		struct diff_filepair *pair = queue->queue[i];
+		struct diff_line_range *rg = r;
+		mmfile_t file_p, file_t;
+		assert(pair->two->path);
+		while (rg) {
+			assert(rg->spec->path);
+			if (!strcmp(rg->spec->path, pair->two->path))
+				break;
+			rg = rg->next;
+		}
+
+		if (rg == NULL)
+			continue;
+		rg->touch = 1;
+		if (rg->nr == 0)
+			continue;
+
+		rg->status = pair->status;
+		assert(pair->two->sha1_valid);
+		diff_populate_filespec(pair->two, 0);
+		file_t.ptr = pair->two->data;
+		file_t.size = pair->two->size;
+
+		if (rg->prev)
+			free_filespec(rg->prev);
+		rg->prev = pair->one;
+		rg->prev->count++;
+		if (pair->one->sha1_valid) {
+			diff_populate_filespec(pair->one, 0);
+			file_p.ptr = pair->one->data;
+			file_p.size = pair->one->size;
+		} else {
+			file_p.ptr = "";
+			file_p.size = 0;
+		}
+
+		if (cr->nr != 0) {
+			struct diff_line_range *tmp = xmalloc(sizeof(*tmp));
+			cr->next = tmp;
+			prev_r = cr;
+			cr = tmp;
+		} else if (cr->spec)
+			DIFF_LINE_RANGE_CLEAR(cr);
+
+		DIFF_LINE_RANGE_INIT(cr);
+		if (pair->one->sha1_valid) {
+			cr->spec = pair->one;
+			cr->spec->count++;
+		}
+
+		cb.interesting = rg;
+		cb.range = cr;
+		cb.diff = 0;
+		cb.plno = cb.tlno = 0;
+		xdi_diff_hunks(&file_p, &file_t, fn, &cb, &xpp, &xecfg);
+		if (cb.diff)
+			diff = 1;
+		/*
+		 * The remain part is the same part.
+		 * Instead of calculating the true line number of the two files,
+		 * use the biggest integer.
+		 */
+		if (map)
+			map_range(&cb, 1, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF);
+		else
+			take_range(&cb, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF);
+	}
+	opt->output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_flush(opt);
+
+	/* Collect the untouch ranges, this comes from the files not changed
+	 * between two commit.
+	 */
+	struct diff_line_range *rg = r;
+	while (rg) {
+		/* clear the touch one to make it usable in next round */
+		if (rg->touch) {
+			rg->touch = 0;
+		} else {
+			struct diff_line_range *untouch = diff_line_range_clone(rg);
+			if (prev_r == rr && rr->nr == 0) {
+				rr = prev_r = untouch;
+			} else {
+				prev_r->next = untouch;
+				prev_r = untouch;
+			}
+		}
+		rg = rg->next;
+	}
+
+	if (cr->nr == 0) {
+		DIFF_LINE_RANGE_CLEAR(cr);
+		free(cr);
+		if (prev_r == cr) {
+			rr = NULL;
+		} else {
+			prev_r->next = NULL;
+		}
+	}
+
+	if (rr) {
+		assert(p);
+		add_line_range(rev, p, rr);
+	}
+
+	/* and the ranges of current commit c is updated */
+	c->object.flags &= ~RANGE_UPDATE;
+	if (diff)
+		c->object.flags |= NEED_PRINT;
+
+	if (tree1)
+		free(tree1);
+	if (tree2)
+		free(tree2);
+}
+
+static void diff_update_parent_range(struct rev_info *rev, struct commit *commit)
+{
+	struct diff_line_range *r = lookup_line_range(rev, commit);
+	struct commit_list *parents = commit->parents;
+	struct commit *c = NULL;
+	if (parents) {
+		assert(parents->next == NULL);
+		c = parents->item;
+	}
+
+	assign_range_to_parent(rev, commit, c, r, &rev->diffopt, 1);
+}
+
+static void assign_parents_range(struct rev_info *rev, struct commit *commit)
+{
+	struct commit_list *parents = commit->parents;
+	struct diff_line_range *r = lookup_line_range(rev, commit);
+	struct diff_line_range *copy = NULL, *range = NULL;
+	int nontrivial = 0;
+
+	/*
+	 * If we are in linear history, update range and flush the patch if
+	 * necessary
+	 */
+	if (parents == NULL || parents->next == NULL) {
+		return diff_update_parent_range(rev, commit);
+	}
+
+	/*
+	 * Loop on the parents and assign the ranges to different
+	 * parents, if there is any range left, this commit must
+	 * be an evil merge.
+	 */
+	copy = diff_line_range_clone_deeply(r);
+	parents = commit->parents;
+	while (parents) {
+		struct commit *p = parents->item;
+		assign_range_to_parent(rev, commit, p, r, &rev->diffopt, 1);
+		assign_range_to_parent(rev, commit, p, copy, &rev->diffopt, 0);
+		parents = parents->next;
+	}
+
+	/*
+	 * yes, this must be an evil merge.
+	 */
+	range = copy;
+	while (range) {
+		if (range->nr) {
+			commit->object.flags |= NEED_PRINT | EVIL_MERGE;
+			nontrivial = 1;
+		}
+		range = range->next;
+	}
+
+	if (nontrivial) {
+		add_decoration(&rev->nontrivial_merge, &commit->object, copy);
+	} else {
+		cleanup(copy);
+	}
+}
+
diff --git a/revision.h b/revision.h
index 55836e0..bbc4e2c 100644
--- a/revision.h
+++ b/revision.h
@@ -15,7 +15,9 @@
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
 #define SYMMETRIC_LEFT	(1u<<8)
 #define RANGE_UPDATE	(1u<<9) /* for line level traverse */
-#define ALL_REV_FLAGS	((1u<<10)-1)
+#define NEED_PRINT	(1u<<10)
+#define EVIL_MERGE	(1u<<11)
+#define ALL_REV_FLAGS	((1u<<12)-1)
 
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -141,6 +143,7 @@ struct rev_info {
 	int count_right;
 	/* line-level intersting range */
 	struct decoration line_range;
+	struct decoration nontrivial_merge;
 };
 
 #define REV_TREE_SAME		0
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 09/13] print the line log
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (6 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 08/13] map/take range to parent Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 21:52   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 10/13] map/print ranges along traversing the history topologically Bo Yang
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

'struct line_chunk' is used to make sure each file is scaned
only once when printing the lines. We trace the line number and
the start position of intermediate line in this struct.
Two helper functions from diff.c are used to generate the
consistent format of diff meta info.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 line.c |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 245 insertions(+), 0 deletions(-)

diff --git a/line.c b/line.c
index e1af2f6..8813376 100644
--- a/line.c
+++ b/line.c
@@ -951,3 +951,248 @@ static void assign_parents_range(struct rev_info *rev, struct commit *commit)
 	}
 }
 
+struct line_chunk {
+	int lone, ltwo;
+	const char *one, *two;
+	const char *one_end, *two_end;
+	struct diff_line_range *range;
+};
+
+static void flush_lines(struct diff_options *opt, const char **ptr, const char *end,
+		int slno, int elno, int *lno, const char *color, const char heading)
+{
+	const char *p = *ptr;
+	struct strbuf buf = STRBUF_INIT;
+	const char *reset;
+
+	if (strcmp(color, ""))
+		reset = "";
+	else
+		reset = diff_get_color_opt(opt, DIFF_RESET);
+
+	strbuf_addf(&buf, "%s%c", color, heading);
+	while (*ptr < end && *lno < slno) {
+		if (**ptr == '\n') {
+			(*lno)++;
+			if (*lno == slno) {
+				(*ptr)++;
+				break;
+			}
+		}
+		(*ptr)++;
+	}
+	assert(*ptr <= end);
+	p = *ptr;
+
+	/*
+	 * todo: when --graph landed on master, this should be changed
+	 * a little.
+	 */
+	while (*ptr < end && *lno <= elno) {
+		if (**ptr == '\n') {
+			fprintf(opt->file, "%s", buf.buf);
+			if (*ptr - p) {
+				fwrite(p, *ptr - p, 1, opt->file);
+			}
+			fprintf(opt->file, "%s\n", reset);
+			p = *ptr + 1;
+			(*lno)++;
+		}
+		(*ptr)++;
+	}
+	if (*lno <= elno) {
+		fprintf(opt->file, "%s", buf.buf);
+		if (*ptr - p) {
+			fwrite(p, *ptr - p, 1, opt->file);
+		}
+		fprintf(opt->file, "%s\n", reset);
+	}
+	strbuf_release(&buf);
+}
+
+static void diff_flush_range(struct diff_options *opt, struct line_chunk *chunk,
+		struct range *range)
+{
+	struct print_pair *pair = &range->pair;
+	const char *old = diff_get_color_opt(opt, DIFF_FILE_OLD);
+	const char *new = diff_get_color_opt(opt, DIFF_FILE_NEW);
+	int i, cur = range->start;
+
+	for (i = 0; i < pair->nr; i++) {
+		struct print_range *pr = pair->ranges + i;
+		if (cur < pr->start)
+			flush_lines(opt, &chunk->two, chunk->two_end,
+				cur, pr->start - 1, &chunk->ltwo, "", ' ');
+
+		if (!pr->line_added)
+			flush_lines(opt, &chunk->one, chunk->one_end,
+				pr->pstart, pr->pend, &chunk->lone, old, '-');
+		flush_lines(opt, &chunk->two, chunk->two_end,
+			pr->start, pr->end, &chunk->ltwo, new, '+');
+
+		cur = pr->end + 1;
+	}
+
+	if (cur <= range->end) {
+		flush_lines(opt, &chunk->two, chunk->two_end,
+			cur, range->end, &chunk->ltwo, "", ' ');
+	}
+}
+
+static void diff_flush_chunks(struct diff_options *opt, struct line_chunk *chunk)
+{
+	struct diff_line_range *range = chunk->range;
+	const char *set = diff_get_color_opt(opt, DIFF_FRAGINFO);
+	const char *reset = diff_get_color_opt(opt, DIFF_RESET);
+	int i;
+
+	for (i = 0; i < range->nr; i++) {
+		struct range *r = range->ranges + i;
+		long lenp = r->pend - r->pstart + 1, pstart = r->pstart;
+		long len = r->end - r->start + 1;
+		if (pstart == 0)
+			lenp = 0;
+
+		fprintf(opt->file, "%s@@ -%ld,%ld +%ld,%ld @@%s\n",
+			set, pstart, lenp, r->start, len, reset);
+
+		diff_flush_range(opt, chunk, r);
+	}
+}
+
+static void diff_flush_filepair(struct rev_info *rev, struct diff_line_range *range)
+{
+	struct diff_options *opt = &rev->diffopt;
+	struct diff_filespec *one = range->prev, *two = range->spec;
+	struct diff_filepair p = {one, two, range->status, 0};
+	struct strbuf header = STRBUF_INIT, meta = STRBUF_INIT;
+	const char *a_prefix, *b_prefix;
+	const char *name_a, *name_b, *a_one, *b_two;
+	const char *lbl[2];
+	const char *set = diff_get_color_opt(opt, DIFF_METAINFO);
+	const char *reset = diff_get_color_opt(opt, DIFF_RESET);
+	struct line_chunk chunk;
+	int must_show_header;
+
+	/*
+	 * the ranges that touch no different file, in this case
+	 * the line number will not change, and of course we have
+	 * no sensible rang->pair since there is no diff run.
+	 */
+	if (one == NULL) {
+		return;
+	}
+
+	if (range->status == DIFF_STATUS_DELETED)
+		die("We are following an nonexistent file, interesting!");
+
+	name_a  = one->path;
+	name_b = two->path;
+	fill_metainfo(&meta, name_a, name_b, one, two, opt, &p, &must_show_header,
+			DIFF_OPT_TST(opt, COLOR_DIFF));
+
+	diff_set_mnemonic_prefix(opt, "a/", "b/");
+	if (DIFF_OPT_TST(opt, REVERSE_DIFF)) {
+		a_prefix = opt->b_prefix;
+		b_prefix = opt->a_prefix;
+	} else {
+		a_prefix = opt->a_prefix;
+		b_prefix = opt->b_prefix;
+	}
+
+	name_a = DIFF_FILE_VALID(one) ? name_a : name_b;
+	name_b = DIFF_FILE_VALID(two) ? name_b : name_a;
+
+	a_one = quote_two(a_prefix, name_a + (*name_a == '/'));
+	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
+	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
+	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
+	strbuf_addf(&header, "%sdiff --git %s %s%s\n", set, a_one, b_two, reset);
+	if (lbl[0][0] == '/') {
+		strbuf_addf(&header, "%snew file mode %06o%s\n", set, two->mode, reset);
+	} else if (lbl[1][0] == '/') {
+		strbuf_addf(&header, "%sdeleted file mode %06o%s\n", set, one->mode, reset);
+	} else if (one->mode != two->mode) {
+			strbuf_addf(&header, "%sold mode %06o%s\n", set, one->mode, reset);
+			strbuf_addf(&header, "%snew mode %06o%s\n", set, two->mode, reset);
+	}
+
+	fprintf(opt->file, "%s%s", header.buf, meta.buf);
+	strbuf_release(&meta);
+	strbuf_release(&header);
+	fprintf(opt->file, "%s--- %s%s\n", set, lbl[0], reset);
+	fprintf(opt->file, "%s+++ %s%s\n", set, lbl[1], reset);
+	free((void *)a_one);
+	free((void *)b_two);
+
+	chunk.one = one->data;
+	chunk.one_end = one->data + one->size;
+	chunk.lone = 1;
+	chunk.two = two->data;
+	chunk.two_end = two->data + two->size;
+	chunk.ltwo = 1;
+	chunk.range = range;
+	diff_flush_chunks(&rev->diffopt, &chunk);
+}
+
+#define EVIL_MERGE_STR "nontrivial merge found"
+static void flush_nontrivial_merge(struct rev_info *rev, struct diff_line_range *range)
+{
+	struct diff_options *opt = &rev->diffopt;
+	const char *reset = diff_get_color_opt(opt, DIFF_RESET);
+	const char *frag = diff_get_color_opt(opt, DIFF_FRAGINFO);
+	const char *meta = diff_get_color_opt(opt, DIFF_METAINFO);
+	const char *new = diff_get_color_opt(opt, DIFF_FILE_NEW);
+
+	fprintf(opt->file, "%s%s%s\n", meta, EVIL_MERGE_STR, reset);
+
+	while (range) {
+		if (range->nr) {
+			fprintf(opt->file, "%s%s%s\n\n", meta, range->spec->path, reset);
+			int lno = 1;
+			const char *ptr = range->spec->data;
+			const char *end = range->spec->data + range->spec->size;
+			int i = 0;
+			for (; i < range->nr; i++) {
+				struct range *r = range->ranges + i;
+				fprintf(opt->file, "%s@@ %ld,%ld @@%s\n", frag, r->start,
+					r->end - r->start + 1, reset);
+				flush_lines(opt, &ptr, end, r->start, r->end,
+					&lno, new, ' ');
+			}
+			fprintf(opt->file, "\n");
+		}
+		range = range->next;
+	}
+}
+
+static void line_log_flush(struct rev_info *rev, struct commit *c)
+{
+	struct diff_line_range *range = lookup_line_range(rev, c);
+	struct diff_line_range *nontrivial = lookup_decoration(&rev->nontrivial_merge, &c->object);
+	struct log_info log;
+
+	if (range == NULL)
+		return;
+
+	log.commit = c;
+	log.parent = NULL;
+	rev->loginfo = &log;
+	show_log(rev);
+	rev->loginfo = NULL;
+	/*
+	 * Add a new line after each commit message, of course we should
+	 * add --graph alignment later when the patches comes to master.
+	 */
+	fprintf(rev->diffopt.file, "\n");
+
+	if (c->object.flags & EVIL_MERGE)
+		return flush_nontrivial_merge(rev, nontrivial);
+
+	while (range) {
+		if (range->diff)
+			diff_flush_filepair(rev, range);
+		range = range->next;
+	}
+}
+
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 10/13] map/print ranges along traversing the history topologically
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (7 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 09/13] print the line log Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-12 21:52   ` Junio C Hamano
  2010-07-11  6:18 ` [PATCH v3 11/13] add --always-print option Bo Yang
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

Since ranges may change in different branches, we should
make sure we do not pass range to parent until all the
ranges get 'combined' at the commit which is a split commit.
So, topological traversing is necessary.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 builtin/log.c |    5 ++++-
 line.c        |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 line.h        |    2 ++
 3 files changed, 60 insertions(+), 1 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 1e90b03..47b386d 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -615,7 +615,10 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
 	cmd_log_init(argc, argv, prefix, &rev, &opt);
-	return cmd_log_walk(&rev);
+	if (rev.line)
+		return cmd_line_log_walk(&rev);
+	else
+		return cmd_log_walk(&rev);
 }
 
 /* format-patch */
diff --git a/line.c b/line.c
index 8813376..5c8f77a 100644
--- a/line.c
+++ b/line.c
@@ -1196,3 +1196,57 @@ static void line_log_flush(struct rev_info *rev, struct commit *c)
 	}
 }
 
+int cmd_line_log_walk(struct rev_info *rev)
+{
+	struct commit *commit;
+	struct commit_list *list = NULL;
+	struct diff_line_range *r = NULL;
+
+	if (prepare_revision_walk(rev))
+		die("revision walk prepare failed");
+
+	list = rev->commits;
+	if (list) {
+		list->item->object.flags |= RANGE_UPDATE;
+		list = list->next;
+	}
+	/* Clear the flags */
+	while (list) {
+		list->item->object.flags &= 0x0;
+		list = list->next;
+	}
+
+	list = rev->commits;
+	while (list) {
+		struct commit_list *need_free = list;
+		commit = list->item;
+
+		if (commit->object.flags & RANGE_UPDATE) {
+			assign_parents_range(rev, commit);
+		}
+
+		if (commit->object.flags & NEED_PRINT) {
+			line_log_flush(rev, commit);
+		}
+
+		r = lookup_line_range(rev, commit);
+		if (r) {
+			cleanup(r);
+			r = NULL;
+			add_line_range(rev, commit, r);
+		}
+
+		r = lookup_decoration(&rev->nontrivial_merge, &commit->object);
+		if (r) {
+			cleanup(r);
+			r = NULL;
+			add_decoration(&rev->nontrivial_merge, &commit->object, r);
+		}
+
+		list = list->next;
+		free(need_free);
+	}
+
+	return 0;
+}
+
diff --git a/line.h b/line.h
index 885e985..b293894 100644
--- a/line.h
+++ b/line.h
@@ -128,4 +128,6 @@ extern struct diff_line_range *lookup_line_range(struct rev_info *revs, struct c
 const char *parse_loc(const char *spec, nth_line_fn_t nth_line,
 		void *data, long lines, long begin, long *ret);
 
+extern int cmd_line_log_walk(struct rev_info *rev);
+
 #endif
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 11/13] add --always-print option
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (8 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 10/13] map/print ranges along traversing the history topologically Bo Yang
@ 2010-07-11  6:18 ` Bo Yang
  2010-07-13 20:35   ` Junio C Hamano
  2010-07-11  6:19 ` [PATCH v3 12/13] add two test cases Bo Yang
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:18 UTC (permalink / raw)
  To: git; +Cc: gitster

Always print the interesting ranges even if the current
commit does not change any line of it.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 builtin/log.c |    3 +++
 line.c        |   11 +++++++++--
 revision.h    |    3 ++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 47b386d..3fa31dc 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -85,6 +85,7 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 {
 	int i;
 	int decoration_given = 0;
+	static int always_print = 0;
 	struct userformat_want w;
 	const char *path = NULL, *pathspec = NULL;
 	static struct diff_line_range *range = NULL;
@@ -92,6 +93,7 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 	static struct line_opt_callback_data line_cb = {&range, &ctx, NULL};
 	static const struct option options[] = {
 		OPT_CALLBACK('L', NULL, &line_cb, "n,m", "Process only line range n,m, counting from 1", log_line_range_callback),
+		OPT_BOOLEAN(0, "always-print", &always_print, "Always print the interesting range even if the current commit does not change any line of it"),
 		OPT_END()
 	};
 
@@ -224,6 +226,7 @@ parse_done:
 	/* Test whether line level history is asked for */
 	if (range && range->nr > 0) {
 		setup_line(rev, range);
+		rev->always_print = always_print;
 	}
 }
 
diff --git a/line.c b/line.c
index 5c8f77a..c08b510 100644
--- a/line.c
+++ b/line.c
@@ -1080,6 +1080,13 @@ static void diff_flush_filepair(struct rev_info *rev, struct diff_line_range *ra
 	 * no sensible rang->pair since there is no diff run.
 	 */
 	if (one == NULL) {
+		if (rev->always_print) {
+			chunk.two = two->data;
+			chunk.two_end = two->data + two->size;
+			chunk.ltwo = 1;
+			chunk.range = range;
+			diff_flush_chunks(&rev->diffopt, &chunk);
+		}
 		return;
 	}
 
@@ -1190,7 +1197,7 @@ static void line_log_flush(struct rev_info *rev, struct commit *c)
 		return flush_nontrivial_merge(rev, nontrivial);
 
 	while (range) {
-		if (range->diff)
+		if (range->diff || (range->nr && rev->always_print))
 			diff_flush_filepair(rev, range);
 		range = range->next;
 	}
@@ -1225,7 +1232,7 @@ int cmd_line_log_walk(struct rev_info *rev)
 			assign_parents_range(rev, commit);
 		}
 
-		if (commit->object.flags & NEED_PRINT) {
+		if (commit->object.flags & NEED_PRINT || rev->always_print) {
 			line_log_flush(rev, commit);
 		}
 
diff --git a/revision.h b/revision.h
index bbc4e2c..433da9a 100644
--- a/revision.h
+++ b/revision.h
@@ -72,7 +72,8 @@ struct rev_info {
 			bisect:1,
 			ancestry_path:1,
 			first_parent_only:1,
-			line:1;
+			line:1,
+			always_print:1;
 
 	/* Diff flags */
 	unsigned int	diff:1,
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 12/13] add two test cases
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (9 preceding siblings ...)
  2010-07-11  6:18 ` [PATCH v3 11/13] add --always-print option Bo Yang
@ 2010-07-11  6:19 ` Bo Yang
  2010-07-11  6:19 ` [PATCH v3 13/13] some document update Bo Yang
  2010-07-12 16:45 ` [PATCH v3 01/13] parse-options: stop when encounter a non-option Junio C Hamano
  12 siblings, 0 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:19 UTC (permalink / raw)
  To: git; +Cc: gitster

t4301: for simple linear history only
t4302: for history contains merge

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 t/t4301-log-line-single-history.sh |  342 ++++++++++++++++++++++++++++++++++++
 t/t4302-log-line-merge-history.sh  |  114 ++++++++++++
 2 files changed, 456 insertions(+), 0 deletions(-)
 create mode 100755 t/t4301-log-line-single-history.sh
 create mode 100755 t/t4302-log-line-merge-history.sh

diff --git a/t/t4301-log-line-single-history.sh b/t/t4301-log-line-single-history.sh
new file mode 100755
index 0000000..9981496
--- /dev/null
+++ b/t/t4301-log-line-single-history.sh
@@ -0,0 +1,342 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Bo Yang
+#
+
+test_description='Test git log -L with single line of history
+
+'
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/diff-lib.sh
+
+echo >path0 'void func(){
+	int a = 0;
+	int b = 1;
+	int c;
+	c = a + b;
+}
+'
+
+echo >path1 'void output(){
+	printf("hello world");
+}
+'
+
+test_expect_success \
+    'add path0/path1 and commit.' \
+    'git add path0 path1 &&
+     git commit -m "Base commit"'
+
+echo >path0 'void func(){
+	int a = 10;
+	int b = 11;
+	int c;
+	c = a + b;
+}
+'
+
+echo >path1 'void output(){
+	const char *str = "hello world!";
+	printf("%s", str);
+}
+'
+
+test_expect_success \
+    'Change the 2,3 lines of path0 and path1.' \
+    'git add path0 path1 &&
+     git commit -m "Change 2,3 lines of path0 and path1"'
+
+echo >path0 'void func(){
+	int a = 10;
+	int b = 11;
+	int c;
+	c = 10 * (a + b);
+}
+'
+
+test_expect_success \
+	'Change the 5th line of path0.' \
+	'git add path0 &&
+	 git commit -m "Change the 5th line of path0"'
+
+echo >path0 'void func(){
+	int a = 10;
+	int b = 11;
+	printf("%d", a - b);
+}
+'
+
+test_expect_success \
+	'Final change of path0.' \
+	'git add path0 &&
+	 git commit -m "Final change of path0"'
+
+test_expect_success \
+    'Show the line level log of path0' \
+    'git log --pretty=format:%s%n%b -L /func/,/^}/ path0 > current-path0'
+
+test_expect_success \
+    'Show the line level log of path1' \
+    'git log --pretty=format:%s%n%b -L /output/,/^}/ path1 > current-path1'
+
+test_expect_success \
+	'Show the line level log of two files' \
+    'git log --pretty=format:%s%n%b -L /func/,/^}/ path0 -L /output/,/^}/ path1 > current-pathall'
+
+test_expect_success \
+	'Test the line number argument' \
+	'git log --pretty=format:%s%n%b -L 1,2 path0 > current-linenum'
+
+test_expect_success \
+	'Test the --always-print option' \
+	'git log --pretty=format:%s%n%b --always-print -L 1,2 path0 > current-always'
+
+cat >expected-path0 <<\EOF
+Final change of path0
+
+diff --git a/path0 b/path0
+index 44db133..1518c15 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,5 @@
+ void func(){
+ 	int a = 10;
+ 	int b = 11;
+-	int c;
+-	c = 10 * (a + b);
++	printf("%d", a - b);
+ }
+
+Change the 5th line of path0
+
+diff --git a/path0 b/path0
+index 9ef1692..44db133 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,6 @@
+ void func(){
+ 	int a = 10;
+ 	int b = 11;
+ 	int c;
+-	c = a + b;
++	c = 10 * (a + b);
+ }
+
+Change 2,3 lines of path0 and path1
+
+diff --git a/path0 b/path0
+index aabffdf..9ef1692 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,6 @@
+ void func(){
+-	int a = 0;
+-	int b = 1;
++	int a = 10;
++	int b = 11;
+ 	int c;
+ 	c = a + b;
+ }
+
+Base commit
+
+diff --git a/path0 b/path0
+new file mode 100644
+index 0000000..aabffdf
+--- /dev/null
++++ b/path0
+@@ -0,0 +1,6 @@
++void func(){
++	int a = 0;
++	int b = 1;
++	int c;
++	c = a + b;
++}
+EOF
+
+cat >expected-path1 <<\EOF
+Change 2,3 lines of path0 and path1
+
+diff --git a/path1 b/path1
+index 997d841..1d711b5 100644
+--- a/path1
++++ b/path1
+@@ -1,3 +1,4 @@
+ void output(){
+-	printf("hello world");
++	const char *str = "hello world!";
++	printf("%s", str);
+ }
+
+Base commit
+
+diff --git a/path1 b/path1
+new file mode 100644
+index 0000000..997d841
+--- /dev/null
++++ b/path1
+@@ -0,0 +1,3 @@
++void output(){
++	printf("hello world");
++}
+EOF
+
+cat >expected-pathall <<\EOF
+Final change of path0
+
+diff --git a/path0 b/path0
+index 44db133..1518c15 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,5 @@
+ void func(){
+ 	int a = 10;
+ 	int b = 11;
+-	int c;
+-	c = 10 * (a + b);
++	printf("%d", a - b);
+ }
+
+Change the 5th line of path0
+
+diff --git a/path0 b/path0
+index 9ef1692..44db133 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,6 @@
+ void func(){
+ 	int a = 10;
+ 	int b = 11;
+ 	int c;
+-	c = a + b;
++	c = 10 * (a + b);
+ }
+
+Change 2,3 lines of path0 and path1
+
+diff --git a/path0 b/path0
+index aabffdf..9ef1692 100644
+--- a/path0
++++ b/path0
+@@ -1,6 +1,6 @@
+ void func(){
+-	int a = 0;
+-	int b = 1;
++	int a = 10;
++	int b = 11;
+ 	int c;
+ 	c = a + b;
+ }
+diff --git a/path1 b/path1
+index 997d841..1d711b5 100644
+--- a/path1
++++ b/path1
+@@ -1,3 +1,4 @@
+ void output(){
+-	printf("hello world");
++	const char *str = "hello world!";
++	printf("%s", str);
+ }
+
+Base commit
+
+diff --git a/path0 b/path0
+new file mode 100644
+index 0000000..aabffdf
+--- /dev/null
++++ b/path0
+@@ -0,0 +1,6 @@
++void func(){
++	int a = 0;
++	int b = 1;
++	int c;
++	c = a + b;
++}
+diff --git a/path1 b/path1
+new file mode 100644
+index 0000000..997d841
+--- /dev/null
++++ b/path1
+@@ -0,0 +1,3 @@
++void output(){
++	printf("hello world");
++}
+EOF
+
+cat >expected-linenum <<\EOF
+Change 2,3 lines of path0 and path1
+
+diff --git a/path0 b/path0
+index aabffdf..9ef1692 100644
+--- a/path0
++++ b/path0
+@@ -1,2 +1,2 @@
+ void func(){
+-	int a = 0;
++	int a = 10;
+
+Base commit
+
+diff --git a/path0 b/path0
+new file mode 100644
+index 0000000..aabffdf
+--- /dev/null
++++ b/path0
+@@ -0,0 +1,2 @@
++void func(){
++	int a = 0;
+EOF
+
+cat >expected-always <<\EOF
+Final change of path0
+
+diff --git a/path0 b/path0
+index 44db133..1518c15 100644
+--- a/path0
++++ b/path0
+@@ -1,2 +1,2 @@
+ void func(){
+ 	int a = 10;
+
+Change the 5th line of path0
+
+diff --git a/path0 b/path0
+index 9ef1692..44db133 100644
+--- a/path0
++++ b/path0
+@@ -1,2 +1,2 @@
+ void func(){
+ 	int a = 10;
+
+Change 2,3 lines of path0 and path1
+
+diff --git a/path0 b/path0
+index aabffdf..9ef1692 100644
+--- a/path0
++++ b/path0
+@@ -1,2 +1,2 @@
+ void func(){
+-	int a = 0;
++	int a = 10;
+
+Base commit
+
+diff --git a/path0 b/path0
+new file mode 100644
+index 0000000..aabffdf
+--- /dev/null
++++ b/path0
+@@ -0,0 +1,2 @@
++void func(){
++	int a = 0;
+EOF
+
+test_expect_success \
+    'validate the output.' \
+    'test_cmp current-path0 expected-path0 &&
+	 test_cmp current-path1 expected-path1 &&
+	 test_cmp current-pathall expected-pathall &&
+	 test_cmp current-linenum expected-linenum &&
+	 test_cmp current-always expected-always'
+
+test_done
diff --git a/t/t4302-log-line-merge-history.sh b/t/t4302-log-line-merge-history.sh
new file mode 100755
index 0000000..02e7439
--- /dev/null
+++ b/t/t4302-log-line-merge-history.sh
@@ -0,0 +1,114 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Bo Yang
+#
+
+test_description='Test git log -L with merge commit
+
+'
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/diff-lib.sh
+
+echo >path0 'void func(){
+	printf("hello");
+}
+'
+
+test_expect_success \
+    'Add path0 and commit.' \
+    'git add path0 &&
+     git commit -m "Base commit"'
+
+echo >path0 'void func(){
+	printf("hello earth");
+}
+'
+
+test_expect_success \
+    'Change path0 in master.' \
+    'git add path0 &&
+     git commit -m "Change path0 in master"'
+
+test_expect_success \
+	'Make a new branch from the base commit' \
+	'git checkout -b feature master^'
+
+echo >path0 'void func(){
+	print("hello moon");
+}
+'
+
+test_expect_success \
+    'Change path0 in feature.' \
+    'git add path0 &&
+     git commit -m "Change path0 in feature"'
+
+test_expect_success \
+	'Merge the master to feature' \
+	'! git merge master'
+
+echo >path0 'void func(){
+	printf("hello earth and moon");
+}
+'
+
+test_expect_success \
+	'Resolve the conflict' \
+	'git add path0 &&
+	 git commit -m "Merge two branches"'
+
+test_expect_success \
+    'Show the line level log of path0' \
+    'git log --pretty=format:%s%n%b -L /func/,/^}/ path0 > current'
+
+cat >expected <<\EOF
+Merge two branches
+
+nontrivial merge found
+path0
+
+@@ 2,1 @@
+ 	printf("hello earth and moon");
+
+
+Change path0 in master
+
+diff --git a/path0 b/path0
+index f628dea..bef7fa3 100644
+--- a/path0
++++ b/path0
+@@ -1,3 +1,3 @@
+ void func(){
+-	printf("hello");
++	printf("hello earth");
+ }
+
+Change path0 in feature
+
+diff --git a/path0 b/path0
+index f628dea..a940ef6 100644
+--- a/path0
++++ b/path0
+@@ -1,3 +1,3 @@
+ void func(){
+-	printf("hello");
++	print("hello moon");
+ }
+
+Base commit
+
+diff --git a/path0 b/path0
+new file mode 100644
+index 0000000..f628dea
+--- /dev/null
++++ b/path0
+@@ -0,0 +1,3 @@
++void func(){
++	printf("hello");
++}
+EOF
+test_expect_success \
+    'validate the output.' \
+    'test_cmp current expected'
+
+test_done
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 13/13] some document update
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (10 preceding siblings ...)
  2010-07-11  6:19 ` [PATCH v3 12/13] add two test cases Bo Yang
@ 2010-07-11  6:19 ` Bo Yang
  2010-07-11  8:27   ` Jakub Narebski
  2010-07-12 16:45 ` [PATCH v3 01/13] parse-options: stop when encounter a non-option Junio C Hamano
  12 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-11  6:19 UTC (permalink / raw)
  To: git; +Cc: gitster

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 Documentation/git-log.txt |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 0e6ff31..14a9703 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -9,6 +9,7 @@ git-log - Show commit logs
 SYNOPSIS
 --------
 'git log' [<options>] [<since>..<until>] [[\--] <path>...]
+'git log' [<options>] -L n,m <path>
 
 DESCRIPTION
 -----------
@@ -19,6 +20,9 @@ command to control what is shown and how, and options applicable to
 the 'git diff-*' commands to control how the changes
 each commit introduces are shown.
 
+With '-L' option, the command will help to trace the history of user specified
+line ranges. It can trace multiple ranges coming from multiple files.
+
 
 OPTIONS
 -------
@@ -72,6 +76,32 @@ include::diff-options.txt[]
 	to be prefixed with "\-- " to separate them from options or
 	refnames.
 
+-L <start>,<end>::
+	The line range.  <start> and <end> can take one of these forms:
+
+- number
++
+If <start> or <end> is a number, it specifies an
+absolute line number (lines count from 1).
++
+
+- /regex/
++
+This form will use the first line matching the given
+POSIX regex.  If <end> is a regex, it will search
+starting at the line given by <start>.
++
+
+- +offset or -offset
++
+This is only valid for <end> and will specify a number
+of lines before or after the line given by <start>.
++
+
+--always-print::
+	Always print the interesting range even if the current commit
+	does not change any line of the range.
+
 
 include::rev-list-options.txt[]
 
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/13] some document update
  2010-07-11  6:19 ` [PATCH v3 13/13] some document update Bo Yang
@ 2010-07-11  8:27   ` Jakub Narebski
  2010-07-12 14:12     ` Bo Yang
  0 siblings, 1 reply; 39+ messages in thread
From: Jakub Narebski @ 2010-07-11  8:27 UTC (permalink / raw)
  To: Bo Yang; +Cc: git, Junio Hamano

Bo Yang <struggleyb.nku@gmail.com> writes:

>  Documentation/git-log.txt |   30 ++++++++++++++++++++++++++++++
>  1 files changed, 30 insertions(+), 0 deletions(-)

[...]
> +-L <start>,<end>::
> +	The line range.  <start> and <end> can take one of these forms:
> +
> +- number
> ++
> +If <start> or <end> is a number, it specifies an
> +absolute line number (lines count from 1).
> ++
> +
> +- /regex/
> ++
> +This form will use the first line matching the given
> +POSIX regex.  If <end> is a regex, it will search
> +starting at the line given by <start>.
> ++
> +
> +- +offset or -offset
> ++
> +This is only valid for <end> and will specify a number
> +of lines before or after the line given by <start>.
> ++

If the parsing code for -L <start>,<end> is the same for git-blame and
for git-log, and therefore documentation is the same or nearly the
same for this option, wouldn't it be better to separate this
documentation into separate file, e.g. line-range-option.txt, and
include it both in git-blame and git-log manpages?  If there are minor
differences, they can be covered by ifdefs.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/13] some document update
  2010-07-11  8:27   ` Jakub Narebski
@ 2010-07-12 14:12     ` Bo Yang
  2010-07-12 18:45       ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-12 14:12 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Junio Hamano

Hi Jakub,

On Sun, Jul 11, 2010 at 4:27 PM, Jakub Narebski <jnareb@gmail.com> wrote:
>
> If the parsing code for -L <start>,<end> is the same for git-blame and
> for git-log, and therefore documentation is the same or nearly the
> same for this option, wouldn't it be better to separate this
> documentation into separate file, e.g. line-range-option.txt, and
> include it both in git-blame and git-log manpages?  If there are minor
> differences, they can be covered by ifdefs.
>

Thanks a lot for your advice, I have revised the patch.

-----------------------------------------------------------
From 88ed88a53d83c2d46fa4917008efadc531ba1068 Mon Sep 17 00:00:00 2001
From: Bo Yang <struggleyb.nku@gmail.com>
Date: Sat, 26 Jun 2010 01:35:48 -0700
Subject: [PATCH v3 revised 13/13] some document update

Both 'git log' and 'git blame' support the same
format of '-L' arguments, so put the argument
format description into a new file.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 Documentation/blame-options.txt     |   19 +------------------
 Documentation/git-log.txt           |   13 +++++++++++++
 Documentation/line-range-format.txt |   18 ++++++++++++++++++
 3 files changed, 32 insertions(+), 18 deletions(-)
 create mode 100644 Documentation/line-range-format.txt

diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index 16e3c68..3526835 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -13,24 +13,7 @@
 	Annotate only the given line range.  <start> and <end> can take
 	one of these forms:

-	- number
-+
-If <start> or <end> is a number, it specifies an
-absolute line number (lines count from 1).
-+
-
-- /regex/
-+
-This form will use the first line matching the given
-POSIX regex.  If <end> is a regex, it will search
-starting at the line given by <start>.
-+
-
-- +offset or -offset
-+
-This is only valid for <end> and will specify a number
-of lines before or after the line given by <start>.
-+
+include::line-range-format.txt[]

 -l::
 	Show long rev (Default: off).
diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 0e6ff31..4781f8b 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -9,6 +9,7 @@ git-log - Show commit logs
 SYNOPSIS
 --------
 'git log' [<options>] [<since>..<until>] [[\--] <path>...]
+'git log' [<options>] -L n,m <path>

 DESCRIPTION
 -----------
@@ -19,6 +20,9 @@ command to control what is shown and how, and
options applicable to
 the 'git diff-*' commands to control how the changes
 each commit introduces are shown.

+With '-L' option, the command will help to trace the history of user specified
+line ranges. It can trace multiple ranges coming from multiple files.
+

 OPTIONS
 -------
@@ -72,6 +76,15 @@ include::diff-options.txt[]
 	to be prefixed with "\-- " to separate them from options or
 	refnames.

+-L <start>,<end>::
+	The line range.  <start> and <end> can take one of these forms:
+
+include::line-range-format.txt[]
+
+--always-print::
+	Always print the interesting range even if the current commit
+	does not change any line of the range.
+

 include::rev-list-options.txt[]

diff --git a/Documentation/line-range-format.txt
b/Documentation/line-range-format.txt
new file mode 100644
index 0000000..265bc23
--- /dev/null
+++ b/Documentation/line-range-format.txt
@@ -0,0 +1,18 @@
+- number
++
+If <start> or <end> is a number, it specifies an
+absolute line number (lines count from 1).
++
+
+- /regex/
++
+This form will use the first line matching the given
+POSIX regex.  If <end> is a regex, it will search
+starting at the line given by <start>.
++
+
+- +offset or -offset
++
+This is only valid for <end> and will specify a number
+of lines before or after the line given by <start>.
++
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/13] add the basic data structure for line level  history
  2010-07-11  6:18 ` [PATCH v3 03/13] add the basic data structure for line level history Bo Yang
@ 2010-07-12 14:16   ` Bo Yang
  2010-07-12 16:50     ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-12 14:16 UTC (permalink / raw)
  To: git; +Cc: gitster

Hi Junio,

    I have make a revised version of this patch to fix a bug of it.
That's that I declare a 'unsigned long *ends;" but allocate memory
using "xmalloc(size*sizeof(int))" for it. And this new patch fix it.

-------------------------------------------------------------------------
From 7608bb8d67126b6d9d5436f00e8743ac0600db6f Mon Sep 17 00:00:00 2001
From: Bo Yang <struggleyb.nku@gmail.com>
Date: Fri, 25 Jun 2010 19:37:32 -0700
Subject: [PATCH v3 revised 03/13] add the basic data structure for
line level history

'struct diff_line_range' is the main data structure to store
the user interesting line range. There is one 'diff_line_range'
for each file, and there are multiple 'struct range' in each
'diff_line_range'. In this way, we support multiple ranges.

Within 'struct range', there are multiple 'struct print_range'
which represent a diff chunk.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
 Makefile   |    2 +
 diffcore.h |    1 +
 line.c     |  456 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 line.h     |  122 ++++++++++++++++
 revision.c |    6 +
 revision.h |    8 +-
 6 files changed, 593 insertions(+), 2 deletions(-)
 create mode 100644 line.c
 create mode 100644 line.h

diff --git a/Makefile b/Makefile
index 1fadd43..0183a21 100644
--- a/Makefile
+++ b/Makefile
@@ -496,6 +496,7 @@ LIB_H += grep.h
 LIB_H += hash.h
 LIB_H += help.h
 LIB_H += levenshtein.h
+LIB_H += line.h
 LIB_H += list-objects.h
 LIB_H += ll-merge.h
 LIB_H += log-tree.h
@@ -582,6 +583,7 @@ LIB_OBJS += help.o
 LIB_OBJS += hex.o
 LIB_OBJS += ident.o
 LIB_OBJS += levenshtein.o
+LIB_OBJS += line.o
 LIB_OBJS += list-objects.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
diff --git a/diffcore.h b/diffcore.h
index 491bea0..13d8e93 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -23,6 +23,7 @@
 #define MINIMUM_BREAK_SIZE     400 /* do not break a file smaller than this */

 struct userdiff_driver;
+struct diff_options;

 struct diff_filespec {
 	unsigned char sha1[20];
diff --git a/line.c b/line.c
new file mode 100644
index 0000000..2f82fd8
--- /dev/null
+++ b/line.c
@@ -0,0 +1,456 @@
+#include "line.h"
+#include "cache.h"
+#include "tag.h"
+#include "blob.h"
+#include "tree.h"
+#include "commit.h"
+#include "diff.h"
+#include "decorate.h"
+#include "revision.h"
+#include "xdiff-interface.h"
+#include "strbuf.h"
+#include "log-tree.h"
+
+static void cleanup(struct diff_line_range *r)
+{
+	while (r) {
+		struct diff_line_range *next = r->next;
+		DIFF_LINE_RANGE_CLEAR(r);
+		free(r);
+		r = next;
+	}
+}
+
+static struct object *verify_commit(struct rev_info *revs)
+{
+	struct object *commit = NULL;
+	const char *name = NULL;
+	int i;
+
+	for (i = 0; i < revs->pending.nr; i++) {
+		struct object *obj = revs->pending.objects[i].item;
+		if (obj->flags & UNINTERESTING)
+			continue;
+		while (obj->type == OBJ_TAG)
+			obj = deref_tag(obj, NULL, 0);
+		if (obj->type != OBJ_COMMIT)
+			die("Non commit %s?", revs->pending.objects[i].name);
+		if (commit)
+			die("More than one commit to dig from: %s and %s?",
+			    revs->pending.objects[i].name, name);
+		commit = obj;
+		name = revs->pending.objects[i].name;
+	}
+
+	if (commit == NULL)
+		die("No commit specified?");
+
+	return commit;
+}
+
+static void fill_blob_sha1(struct commit *commit, struct diff_line_range *r)
+{
+	unsigned mode;
+	unsigned char sha1[20];
+
+	while (r) {
+		if (get_tree_entry(commit->object.sha1, r->spec->path,
+			sha1, &mode))
+			goto error;
+		fill_filespec(r->spec, sha1, mode);
+		r = r->next;
+	}
+
+	return;
+error:
+	die("There is no path %s in the commit", r->spec->path);
+}
+
+static void fill_line_ends(struct diff_filespec *spec, long *lines,
+	unsigned long **line_ends)
+{
+	int num = 0, size = 50;
+	long cur = 0;
+	unsigned long *ends = NULL;
+	char *data = NULL;
+
+	if (diff_populate_filespec(spec, 0))
+		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
+
+	ends = xmalloc(size * sizeof(*ends));
+	ends[cur++] = -1;
+	data = spec->data;
+	while (num < spec->size) {
+		if (data[num] == '\n' || num == spec->size - 1) {
+			ALLOC_GROW(ends, (cur + 1), size);
+			ends[cur++] = num;
+		}
+		num++;
+	}
+
+	/* shrink the array to fit the elements */
+	ends = xrealloc(ends, cur * sizeof(*ends));
+	*lines = cur;
+	*line_ends = ends;
+}
+
+static const char *nth_line(struct diff_filespec *spec, int line,
+		long lines, unsigned long *line_ends)
+{
+	assert(line < lines);
+	assert(spec && spec->data);
+
+	return (char *)spec->data + line_ends[line] + 1;
+}
+
+/*
+ * copied from blame.c, indeed, we can even to use this to test
+ * whether line log works. :)
+ */
+static const char *parse_loc(const char *spec, struct diff_filespec *file,
+			     long lines, unsigned long *line_ends,
+			     long begin, long *ret)
+{
+	char *term;
+	const char *line;
+	long num;
+	int reg_error;
+	regex_t regexp;
+	regmatch_t match[1];
+
+	/* Allow "-L <something>,+20" to mean starting at <something>
+	 * for 20 lines, or "-L <something>,-5" for 5 lines ending at
+	 * <something>.
+	 */
+	if (1 < begin && (spec[0] == '+' || spec[0] == '-')) {
+		num = strtol(spec + 1, &term, 10);
+		if (term != spec + 1) {
+			if (spec[0] == '-')
+				num = 0 - num;
+			if (0 < num)
+				*ret = begin + num - 2;
+			else if (!num)
+				*ret = begin;
+			else
+				*ret = begin + num;
+			return term;
+		}
+		return spec;
+	}
+	num = strtol(spec, &term, 10);
+	if (term != spec) {
+		*ret = num;
+		return term;
+	}
+	if (spec[0] != '/')
+		return spec;
+
+	/* it could be a regexp of form /.../ */
+	for (term = (char *) spec + 1; *term && *term != '/'; term++) {
+		if (*term == '\\')
+			term++;
+	}
+	if (*term != '/')
+		return spec;
+
+	/* try [spec+1 .. term-1] as regexp */
+	*term = 0;
+	begin--; /* input is in human terms */
+	line = nth_line(file, begin, lines, line_ends);
+
+	if (!(reg_error = regcomp(&regexp, spec + 1, REG_NEWLINE)) &&
+	    !(reg_error = regexec(&regexp, line, 1, match, 0))) {
+		const char *cp = line + match[0].rm_so;
+		const char *nline;
+
+		while (begin++ < lines) {
+			nline = nth_line(file, begin, lines, line_ends);
+			if (line <= cp && cp < nline)
+				break;
+			line = nline;
+		}
+		*ret = begin;
+		regfree(&regexp);
+		*term++ = '/';
+		return term;
+	}
+	else {
+		char errbuf[1024];
+		regerror(reg_error, &regexp, errbuf, 1024);
+		die("-L parameter '%s': %s", spec + 1, errbuf);
+	}
+}
+
+static void parse_range(long lines, unsigned long *line_ends,
+		struct range *r, struct diff_filespec *spec)
+{
+	const char *term;
+
+	term = parse_loc(r->arg, spec, lines, line_ends, 1, &r->start);
+	if (*term == ',') {
+		term = parse_loc(term + 1, spec, lines, line_ends,
+			r->start + 1, &r->end);
+		if (*term) {
+			die("-L parameter's argument should be <start>,<end>");
+		}
+	}
+
+	if (*term) {
+		die("-L parameter's argument should be <start>,<end>");
+	}
+
+	if (r->start > r->end) {
+		long tmp = r->start;
+		r->start = r->end;
+		r->end = tmp;
+	}
+
+	if (r->start < 1)
+		r->start = 1;
+	if (r->end >= lines)
+		r->end = lines - 1;
+}
+
+static void parse_lines(struct commit *commit, struct diff_line_range *r)
+{
+	int i;
+	struct range *old_range = NULL;
+	long lines = 0;
+	unsigned long *ends = NULL;
+
+	while (r) {
+		struct diff_filespec *spec = r->spec;
+		int num = r->nr;
+		assert(spec);
+		fill_blob_sha1(commit, r);
+		old_range = r->ranges;
+		r->ranges = NULL;
+		r->nr = r->alloc = 0;
+		fill_line_ends(spec, &lines, &ends);
+		for (i = 0; i < num; i++) {
+			parse_range(lines, ends, old_range + i, spec);
+			diff_line_range_insert(r, old_range[i].arg,
+				old_range[i].start, old_range[i].end);
+		}
+
+		free(ends);
+		ends = NULL;
+
+		r = r->next;
+		free(old_range);
+	}
+}
+
+/*
+ * Insert a new line range into a diff_line_range struct, and keep the
+ * r->ranges sorted by their starting line number.
+ */
+struct range *diff_line_range_insert(struct diff_line_range *r, const
char *arg,
+				int start, int end)
+{
+	int i = 0;
+	struct range *rs = r->ranges;
+	int left_extend = 0, right_extend = 0;
+
+	assert(r != NULL);
+	assert(start <= end);
+
+	if (r->nr == 0 || rs[r->nr - 1].end < start - 1) {
+		DIFF_LINE_RANGE_GROW(r);
+		rs = r->ranges;
+		int num = r->nr - 1;
+		rs[num].arg = arg;
+		rs[num].start = start;
+		rs[num].end = end;
+		return rs + num;
+	}
+
+	for (; i < r->nr; i++) {
+		if (rs[i].end < start - 1)
+			continue;
+		if (rs[i].end == start - 1) {
+			rs[i].end = end;
+			right_extend = 1;
+			goto out;
+		}
+
+		assert(rs[i].end > start - 1);
+		if (rs[i].start <= start) {
+			if (rs[i].end < end) {
+				rs[i].end = end;
+				right_extend = 1;
+			}
+			goto out;
+		} else if (rs[i].start <= end + 1) {
+			rs[i].start = start;
+			left_extend = 1;
+			if (rs[i].end < end) {
+				rs[i].end = end;
+				right_extend = 1;
+			}
+			goto out;
+		} else {
+			int num = r->nr - i;
+			DIFF_LINE_RANGE_GROW(r);
+			rs = r->ranges;
+			memmove(rs + i + 1, rs + i, num * sizeof(struct range));
+			rs[i].arg = arg;
+			rs[i].start = start;
+			rs[i].end = end;
+			goto out;
+		}
+	}
+
+out:
+	assert(r->nr != i);
+	if (left_extend) {
+		int j = i;
+		for (; j > -1; j--) {
+			if (rs[j].end >= rs[i].start - 1)
+				if (rs[j].start < rs[i].start)
+					rs[i].start = rs[j].start;
+		}
+		memmove(rs + j + 1, rs + i, (r->nr - i) * sizeof(struct range));
+		r->nr -= i - j - 1;
+	}
+	if (right_extend) {
+		int j = i;
+		for (; j < r->nr; j++) {
+			if (rs[j].start <= rs[i].end + 1)
+				if (rs[j].end > rs[i].end)
+					rs[i].end = rs[j].end;
+		}
+		if (j < r->nr) {
+			memmove(rs + i + 1, rs + j, (r->nr - j) * sizeof(struct range));
+		}
+		r->nr -= j - i - 1;
+	}
+	assert(r->nr);
+
+	return rs + i;
+}
+
+void diff_line_range_clear(struct diff_line_range *r)
+{
+	int i = 0, zero = 0;
+
+	for (; i < r->nr; i++) {
+		struct range *rg = r->ranges + i;
+		RANGE_CLEAR(rg);
+	}
+
+	if (r->prev) {
+		zero = 0;
+		if (r->prev->count == 1) {
+			zero = 1;
+		}
+		free_filespec(r->prev);
+		if (zero)
+			r->prev = NULL;
+	}
+	if (r->spec) {
+		zero = 0;
+		if (r->spec->count == 1) {
+			zero = 1;
+		}
+		free_filespec(r->spec);
+		if (zero)
+			r->spec = NULL;
+	}
+
+	r->status = '\0';
+	r->alloc = r->nr = 0;
+
+	if (r->ranges)
+		free(r->ranges);
+	r->ranges = NULL;
+	r->next = NULL;
+}
+
+void diff_line_range_append(struct diff_line_range *r, const char *arg)
+{
+	DIFF_LINE_RANGE_GROW(r);
+	r->ranges[r->nr - 1].arg = arg;
+}
+
+struct diff_line_range *diff_line_range_merge(struct diff_line_range *out,
+	struct diff_line_range *other)
+{
+	struct diff_line_range *one = out, *two = other;
+	struct diff_line_range *pone;
+
+	while (one) {
+		struct diff_line_range *ptwo;
+		two = other;
+		ptwo = other;
+		while (two) {
+			if (!strcmp(one->spec->path, two->spec->path)) {
+				int i = 0;
+				for (; i < two->nr; i++) {
+					diff_line_range_insert(one, NULL,
+						two->ranges[i].start,
+						two->ranges[i].end);
+				}
+				if (two == other) {
+					other = other->next;
+				} else {
+					ptwo->next = two->next;
+				}
+				DIFF_LINE_RANGE_CLEAR(two);
+				free(two);
+				two = NULL;
+
+				break;
+			}
+
+			ptwo = two;
+			two = two->next;
+		}
+
+		pone = one;
+		one = one->next;
+	}
+	pone->next = other;
+
+	return out;
+}
+
+void add_line_range(struct rev_info *revs, struct commit *commit,
struct diff_line_range *r)
+{
+	struct diff_line_range *ret = NULL;
+
+	if (r != NULL) {
+		ret = lookup_decoration(&revs->line_range, &commit->object);
+		if (ret != NULL) {
+			diff_line_range_merge(ret, r);
+		} else {
+			add_decoration(&revs->line_range, &commit->object, r);
+		}
+		commit->object.flags |= RANGE_UPDATE;
+	}
+}
+
+struct diff_line_range *lookup_line_range(struct rev_info *revs,
struct commit *commit)
+{
+	struct diff_line_range *ret = NULL;
+
+	ret = lookup_decoration(&revs->line_range, &commit->object);
+	return ret;
+}
+
+void setup_line(struct rev_info *rev, struct diff_line_range *r)
+{
+	struct commit *commit = NULL;
+	struct diff_options *opt = &rev->diffopt;
+
+	commit = (struct commit *)verify_commit(rev);
+	parse_lines(commit, r);
+
+	add_line_range(rev, commit, r);
+	/*
+	 * Note we support -M/-C to detect file rename
+	 */
+	opt->nr_paths = 0;
+	diff_tree_release_paths(opt);
+}
+
diff --git a/line.h b/line.h
new file mode 100644
index 0000000..a2f8545
--- /dev/null
+++ b/line.h
@@ -0,0 +1,122 @@
+#ifndef LINE_H
+#define LINE_H
+
+#include "diffcore.h"
+
+struct rev_info;
+struct commit;
+struct diff_line_range;
+struct diff_options;
+
+struct print_range {
+	int start, end;
+	int pstart, pend;
+	int line_added : 1;
+};
+
+struct print_pair {
+	int alloc, nr;
+	struct print_range *ranges;
+};
+
+#define PRINT_RANGE_INIT(r) \
+	do { \
+		(r)->line_added = 0; \
+	} while(0);
+
+#define PRINT_PAIR_INIT(p) \
+	do { \
+		(p)->alloc = (p)->nr = 0; \
+		(p)->ranges = NULL; \
+	} while(0);
+
+#define PRINT_PAIR_GROW(p) \
+	do { \
+		(p)->nr ++; \
+		ALLOC_GROW((p)->ranges, (p)->nr, (p)->alloc); \
+	} while (0);
+
+#define PRINT_PAIR_CLEAR(p) \
+	do { \
+		(p)->alloc = (p)->nr = 0; \
+		if ((p)->ranges) \
+			free((p)->ranges); \
+		(p)->ranges = NULL; \
+	} while(0);
+
+struct range {
+	const char *arg;	//The argument to specify this line range
+	long start, end;	//The start line number, inclusive
+	long pstart, pend;	//The end line number, inclusive
+	struct print_pair pair;
+			//The changed lines inside this range
+	unsigned int diff:1;
+};
+
+struct diff_line_range {
+	struct diff_filespec *prev;
+	struct diff_filespec *spec;
+	char status;
+	int alloc;
+	int nr;
+	struct range *ranges;
+	unsigned int	touch:1,
+			diff:1;
+	struct diff_line_range *next;
+};
+
+#define RANGE_INIT(r) \
+	do { \
+		(r)->arg = NULL; \
+		(r)->start = (r)->end = 0; \
+		(r)->pstart = (r)->pend = 0; \
+		PRINT_PAIR_INIT(&((r)->pair)); \
+		(r)->diff = 0; \
+	} while (0);
+
+#define RANGE_CLEAR(r) \
+	do { \
+		(r)->arg = NULL; \
+		(r)->start = (r)->end = 0; \
+		(r)->pstart = (r)->pend = 0; \
+		PRINT_PAIR_CLEAR(&r->pair); \
+		(r)->diff = 0; \
+	} while(0);
+
+#define DIFF_LINE_RANGE_INIT(r) \
+	do { \
+		(r)->prev = (r)->spec = NULL; \
+		(r)->status = '\0'; \
+		(r)->alloc = (r)->nr = 0; \
+		(r)->ranges = NULL; \
+		(r)->next = NULL; \
+		(r)->touch = 0; \
+		(r)->diff = 0; \
+	} while(0);
+
+#define DIFF_LINE_RANGE_GROW(r) \
+	do { \
+		(r)->nr ++; \
+		ALLOC_GROW((r)->ranges, (r)->nr, (r)->alloc); \
+		RANGE_INIT(((r)->ranges + (r)->nr - 1)); \
+	} while(0);
+
+#define DIFF_LINE_RANGE_CLEAR(r) \
+	diff_line_range_clear((r));
+
+extern struct range *diff_line_range_insert(struct diff_line_range
*r, const char *arg,
+				int start, int end);
+extern void diff_line_range_append(struct diff_line_range *r, const char *arg);
+
+extern void diff_line_range_clear(struct diff_line_range *r);
+
+extern struct diff_line_range *diff_line_range_merge(struct
diff_line_range *out,
+	struct diff_line_range *other);
+
+extern void setup_line(struct rev_info *rev, struct diff_line_range *r);
+
+extern void add_line_range(struct rev_info *revs, struct commit
*commit, struct diff_line_range *r);
+
+extern struct diff_line_range *lookup_line_range(struct rev_info
*revs, struct commit *commit);
+
+#endif
diff --git a/revision.c b/revision.c
index 7e82efd..63f37ea 100644
--- a/revision.c
+++ b/revision.c
@@ -1637,6 +1637,12 @@ int setup_revisions(int argc, const char
**argv, struct rev_info *revs, struct s
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
 	revs->diffopt.abbrev = revs->abbrev;
+
+	if (revs->line) {
+		revs->limited = 1;
+		revs->topo_order = 1;
+	}
+
 	if (diff_setup_done(&revs->diffopt) < 0)
 		die("diff_setup_done failed");

diff --git a/revision.h b/revision.h
index 36fdf22..55836e0 100644
--- a/revision.h
+++ b/revision.h
@@ -14,7 +14,8 @@
 #define CHILD_SHOWN	(1u<<6)
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
 #define SYMMETRIC_LEFT	(1u<<8)
-#define ALL_REV_FLAGS	((1u<<9)-1)
+#define RANGE_UPDATE	(1u<<9) /* for line level traverse */
+#define ALL_REV_FLAGS	((1u<<10)-1)

 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -68,7 +69,8 @@ struct rev_info {
 			cherry_pick:1,
 			bisect:1,
 			ancestry_path:1,
-			first_parent_only:1;
+			first_parent_only:1,
+			line:1;

 	/* Diff flags */
 	unsigned int	diff:1,
@@ -137,6 +139,8 @@ struct rev_info {
 	/* commit counts */
 	int count_left;
 	int count_right;
+	/* line-level intersting range */
+	struct decoration line_range;
 };

 #define REV_TREE_SAME		0
-- 
1.7.0.2.273.gc2413.dirty

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/13] parse-options: stop when encounter a non-option
  2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
                   ` (11 preceding siblings ...)
  2010-07-11  6:19 ` [PATCH v3 13/13] some document update Bo Yang
@ 2010-07-12 16:45 ` Junio C Hamano
  12 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 16:45 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> Subject: Re: [PATCH v3 01/13] parse-options: stop when encounter a non-option
>
> We support the syntax like:
> -L n1,m1 pathspec1 -L n2,m2 pathspec2.
>
> Make the parse-options API not stop when encounter a
> non-option argument, report the status and go on parsing
> the remain options.

-ECANTPARSE

Read the above again and wonder...

Does it stop, like the subject line says, or does it not stop???

Perhaps you meant to say something like this...

  Subject: [PATCH v3 01/13] parse-options: enhance STOP_AT_NON_OPTION

  Make parse_options_step() report PARSE_OPT_NON_OPTION, instead of
  PARSE_OPT_DONE, to the caller, when it sees a non-option argument.  

  This will help implementing a nonstandard option syntax that takes more
  than one parameters to an option, e.g.

      -L n1,m1 pathspec1 -L n2,m2 pathspec2

  by directly calling parse_options_step().  The parse_options() API only
  calls parse_options_step() once, and its callers are not affected by
  this change.

  Thanks-to: Jonathan Nieder <jrnieder@gmail.com>
  Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>

Currently blame and shortlog seems to use parse_options_step() but neither
of them uses STOP_AT_NON_OPTION, so this change shouldn't break them.

>
> Thanks-to: Jonathan Nieder <jrnieder@gmail.com>
> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
> ---
>  parse-options.c |    3 ++-
>  parse-options.h |    3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/parse-options.c b/parse-options.c
> index 0fa79bc..cbb49d3 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -374,7 +374,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
>  			if (parse_nodash_opt(ctx, arg, options) == 0)
>  				continue;
>  			if (ctx->flags & PARSE_OPT_STOP_AT_NON_OPTION)
> -				break;
> +				return PARSE_OPT_NON_OPTION;
>  			ctx->out[ctx->cpidx++] = ctx->argv[0];
>  			continue;
>  		}
> @@ -456,6 +456,7 @@ int parse_options(int argc, const char **argv, const char *prefix,
>  	switch (parse_options_step(&ctx, options, usagestr)) {
>  	case PARSE_OPT_HELP:
>  		exit(129);
> +	case PARSE_OPT_NON_OPTION:
>  	case PARSE_OPT_DONE:
>  		break;
>  	default: /* PARSE_OPT_UNKNOWN */
> diff --git a/parse-options.h b/parse-options.h
> index 7435cdb..407697a 100644
> --- a/parse-options.h
> +++ b/parse-options.h
> @@ -161,7 +161,8 @@ extern NORETURN void usage_msg_opt(const char *msg,
>  enum {
>  	PARSE_OPT_HELP = -1,
>  	PARSE_OPT_DONE,
> -	PARSE_OPT_UNKNOWN
> +	PARSE_OPT_NON_OPTION,
> +	PARSE_OPT_UNKNOWN,
>  };
>  
>  /*
> -- 
> 1.7.0.2.273.gc2413.dirty

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/13] parse-options: add two helper functions
  2010-07-11  6:18 ` [PATCH v3 02/13] parse-options: add two helper functions Bo Yang
@ 2010-07-12 16:45   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 16:45 UTC (permalink / raw)
  To: Bo Yang; +Cc: git, gitster

Bo Yang <struggleyb.nku@gmail.com> writes:

> +int parse_options_next(struct parse_opt_ctx_t *ctx, int retain)

I think the latter is usually called "keep".

> +{
> +	if (ctx->argc <= 0)
> +		return -1;
> +
> +	if (retain == 1)

Should this really check for "1", or would

	if (keep)

be more conventional?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/13] add the basic data structure for line level  history
  2010-07-12 14:16   ` Bo Yang
@ 2010-07-12 16:50     ` Junio C Hamano
  2010-07-18 14:35       ` Bo Yang
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 16:50 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> Subject: [PATCH v3 revised 03/13] add the basic data structure for line level history
>
> 'struct diff_line_range' is the main data structure to store
> the user interesting line range. There is one 'diff_line_range'

-ECANTPARSE. Perhaps "to keep track of the line ranges the user is
interested in"?

Is it the end user, or the code, that is interested in the range recorded
in this structure?  If you are adjusting the line range while traversing
the history, then it is more of the latter; i.e. "the user originally
started digging from this range, but after examining the diff that affect
that range by this commit, we found that the range corresponds to this
widened/narrowed range in the history leading to the commit, so we will
update the range and use it from now on until we hit another commit that
affects this range".  In that case "to keep track of the line ranges we
are currently interested in" would be more appropriate than "the ranges
the user is interested in".

> for each file, and there are multiple 'struct range' in each
> 'diff_line_range'. In this way, we support multiple ranges.
>
> Within 'struct range', there are multiple 'struct print_range'
> which represent a diff chunk.

We call them "hunks" not "chunks", I think.

> diff --git a/diffcore.h b/diffcore.h
> index 491bea0..13d8e93 100644
> --- a/diffcore.h
> +++ b/diffcore.h
> @@ -23,6 +23,7 @@
>  #define MINIMUM_BREAK_SIZE     400 /* do not break a file smaller than this */
>
>  struct userdiff_driver;
> +struct diff_options;

Why???

>  struct diff_filespec {
>  	unsigned char sha1[20];

> diff --git a/line.c b/line.c
> new file mode 100644
> index 0000000..2f82fd8
> --- /dev/null
> +++ b/line.c
> @@ -0,0 +1,456 @@
> ...
> +
> +static struct object *verify_commit(struct rev_info *revs)
> +{
> +	struct object *commit = NULL;
> +	const char *name = NULL;

Why not use "int found = -1" instead and point at the commit in the
pending.objects[] array you found with that variable?  Or use a variable
of the type of a pointer to an element in the pending.objects[] array,
initialized to NULL.

That way, you do not have to tell stupid compilers that name is set once
commit is set to avoid warnings, which is the only reason you initialize
name to NULL.

> +	int i;
> +
> +	for (i = 0; i < revs->pending.nr; i++) {
> +		struct object *obj = revs->pending.objects[i].item;
> +		if (obj->flags & UNINTERESTING)
> +			continue;
> +		while (obj->type == OBJ_TAG)
> +			obj = deref_tag(obj, NULL, 0);
> +		if (obj->type != OBJ_COMMIT)
> +			die("Non commit %s?", revs->pending.objects[i].name);
> +		if (commit)
> +			die("More than one commit to dig from: %s and %s?",
> +			    revs->pending.objects[i].name, name);
> +		commit = obj;
> +		name = revs->pending.objects[i].name;
> +	}
> +
> +	if (commit == NULL)
> +		die("No commit specified?");
> +
> +	return commit;
> +}
> +
> +static void fill_blob_sha1(struct commit *commit, struct diff_line_range *r)
> +{
> +	unsigned mode;
> +	unsigned char sha1[20];
> +
> +	while (r) {
> +		if (get_tree_entry(commit->object.sha1, r->spec->path,
> +			sha1, &mode))
> +			goto error;
> +		fill_filespec(r->spec, sha1, mode);
> +		r = r->next;
> +	}
> +
> +	return;
> +error:
> +	die("There is no path %s in the commit", r->spec->path);
> +}

These two look vaguely familiar...  Cut and paste without refactoring?

> +static void fill_line_ends(struct diff_filespec *spec, long *lines,
> +	unsigned long **line_ends)
> +{
> +	int num = 0, size = 50;
> +	long cur = 0;
> +	unsigned long *ends = NULL;
> +	char *data = NULL;
> +
> +	if (diff_populate_filespec(spec, 0))
> +		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
> +
> +	ends = xmalloc(size * sizeof(*ends));
> +	ends[cur++] = -1;

Wasn't this an array of unsigned longs?

> +static const char *nth_line(struct diff_filespec *spec, int line,
> +		long lines, unsigned long *line_ends)
> +{
> +	assert(line < lines);
> +	assert(spec && spec->data);

Why aren't "line" and "lines" of the same type?

> +static void parse_range(long lines, unsigned long *line_ends,
> +		struct range *r, struct diff_filespec *spec)
> +{
> +	const char *term;
> +
> +	term = parse_loc(r->arg, spec, lines, line_ends, 1, &r->start);
> +	if (*term == ',') {
> +		term = parse_loc(term + 1, spec, lines, line_ends,
> +			r->start + 1, &r->end);
> +		if (*term) {
> +			die("-L parameter's argument should be <start>,<end>");
> +		}

Excess {} around a single statement.

> +struct range *diff_line_range_insert(struct diff_line_range *r, const
> char *arg,
> +				int start, int end)
> +{
> +	int i = 0;
> +	struct range *rs = r->ranges;
> +	int left_extend = 0, right_extend = 0;
> +
> +	assert(r != NULL);
> +	assert(start <= end);
> +
> +	if (r->nr == 0 || rs[r->nr - 1].end < start - 1) {
> +		DIFF_LINE_RANGE_GROW(r);
> +		rs = r->ranges;
> +		int num = r->nr - 1;

decl-after-statement

> ...
> +out:
> +	assert(r->nr != i);
> +	if (left_extend) {

Sounds more like a "merge" than "extend" to me...

> diff --git a/line.h b/line.h
> new file mode 100644
> index 0000000..a2f8545
> --- /dev/null
> +++ b/line.h
> @@ -0,0 +1,122 @@
> ...
> +#define PRINT_RANGE_INIT(r) \
> +	do { \
> +		(r)->line_added = 0; \
> +	} while(0);

	} while (0)

- No semicolon after a macro (the user will give it to us).
- A SP after "while".

Does this even have to be "do { ... } while (0)"???

> +
> +#define PRINT_PAIR_INIT(p) \
> +	do { \
> +		(p)->alloc = (p)->nr = 0; \
> +		(p)->ranges = NULL; \
> +	} while(0);
> +
> +#define PRINT_PAIR_GROW(p) \
> +	do { \
> +		(p)->nr ++; \

	(p)->nr++;

> +		ALLOC_GROW((p)->ranges, (p)->nr, (p)->alloc); \
> +	} while (0);
> +
> +#define PRINT_PAIR_CLEAR(p) \
> +	do { \
> +		(p)->alloc = (p)->nr = 0; \
> +		if ((p)->ranges) \
> +			free((p)->ranges); \
> +		(p)->ranges = NULL; \
> +	} while(0);
> +
> +struct range {
> +	const char *arg;	//The argument to specify this line range

No C++/C99 comments.

Do you need to keep track of this string representation once you have
parsed the user input and your main computation have started?

Isn't "range" too generic a term?   Unless you make this as a static
declaration only visible to functions where "range" can only mean "line
ranges" in their context, that is.

> +#define RANGE_INIT(r) \

Same comment for the name possibly being too generic.

> +extern struct range *diff_line_range_insert(struct diff_line_range
> *r, const char *arg,

Linewrapped?

> diff --git a/revision.c b/revision.c
> index 7e82efd..63f37ea 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1637,6 +1637,12 @@ int setup_revisions(int argc, const char
> **argv, struct rev_info *revs, struct s
>  	if (revs->combine_merges)
>  		revs->ignore_merges = 0;
>  	revs->diffopt.abbrev = revs->abbrev;
> +
> +	if (revs->line) {

"line" what?

> diff --git a/revision.h b/revision.h
> index 36fdf22..55836e0 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -14,7 +14,8 @@
>  #define CHILD_SHOWN	(1u<<6)
>  #define ADDED		(1u<<7)	/* Parents already parsed and added? */
>  #define SYMMETRIC_LEFT	(1u<<8)
> -#define ALL_REV_FLAGS	((1u<<9)-1)
> +#define RANGE_UPDATE	(1u<<9) /* for line level traverse */
> +#define ALL_REV_FLAGS	((1u<<10)-1)
>
>  #define DECORATE_SHORT_REFS	1
>  #define DECORATE_FULL_REFS	2
> @@ -68,7 +69,8 @@ struct rev_info {
>  			cherry_pick:1,
>  			bisect:1,
>  			ancestry_path:1,
> -			first_parent_only:1;
> +			first_parent_only:1,
> +			line:1;

Is this really a traversal flag that affects how the history is walked?

>  	/* Diff flags */
>  	unsigned int	diff:1,
> @@ -137,6 +139,8 @@ struct rev_info {
>  	/* commit counts */
>  	int count_left;
>  	int count_right;
> +	/* line-level intersting range */
> +	struct decoration line_range;

s/intersting//; or perhaps "line level range that we are chasing"

>  };
>
>  #define REV_TREE_SAME		0
> -- 
> 1.7.0.2.273.gc2413.dirty

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 04/13] refactor parse_loc
  2010-07-11  6:18 ` [PATCH v3 04/13] refactor parse_loc Bo Yang
@ 2010-07-12 18:32   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 18:32 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> Both 'git blame -L' and 'git log -L' parse the same style
> of line number arguments, so put the 'parse_loc' function
> to line.c and export it.

> The caller of parse_loc should provide a callback function
> which is used to calculate the nth line start position.

"the start position of the nth line"?

> Other parts such as regexp search, line number parsing are
> abstracted and re-used.
>
> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
> ---
>  builtin/blame.c |   89 +++++-------------------------------------------------
>  line.c          |   35 ++++++++++++---------
>  line.h          |    5 +++
>  3 files changed, 34 insertions(+), 95 deletions(-)
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 01e62fd..17b71cd 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -21,6 +21,7 @@

Nice code reduction.  The abstraction feels right (but I didn't read it
very carefully).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/13] some document update
  2010-07-12 14:12     ` Bo Yang
@ 2010-07-12 18:45       ` Junio C Hamano
  2010-07-18 14:37         ` Bo Yang
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 18:45 UTC (permalink / raw)
  To: Bo Yang; +Cc: Jakub Narebski, git, Junio Hamano

Bo Yang <struggleyb.nku@gmail.com> writes:

> ...
> Thanks a lot for your advice, I have revised the patch.
>
> -----------------------------------------------------------
>>From 88ed88a53d83c2d46fa4917008efadc531ba1068 Mon Sep 17 00:00:00 2001
> From: Bo Yang <struggleyb.nku@gmail.com>
> Date: Sat, 26 Jun 2010 01:35:48 -0700
> Subject: [PATCH v3 revised 13/13] some document update
>
> Both 'git log' and 'git blame' support the same
> format of '-L' arguments, so put the argument
> format description into a new file.
>
> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
> ---
>  Documentation/blame-options.txt     |   19 +------------------
>  Documentation/git-log.txt           |   13 +++++++++++++
>  Documentation/line-range-format.txt |   18 ++++++++++++++++++
>  3 files changed, 32 insertions(+), 18 deletions(-)

Every time you do this, you seem to send a corrupt patch, like...

> @@ -19,6 +20,9 @@ command to control what is shown and how, and
> options applicable to
> ...
> diff --git a/Documentation/line-range-format.txt
> b/Documentation/line-range-format.txt

Please don't.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 07/13] add range clone functions
  2010-07-11  6:18 ` [PATCH v3 07/13] add range clone functions Bo Yang
@ 2010-07-12 21:52   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 21:52 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> Both single range clone and deeply clone are supported.

Please explain what you mean by "deeply clone" (no, not just a language
nitpick that it should at least be "deep clone"), and document which one
needs to be used in what situations to help people who would want to
decide which of these functions to call.

What I am trying to get at is that you might want to have only one kind of
clone that is _semantically_ deep (i.e. the list is copied, but the
elements are shared with refcounting), but makes a copy of the element
that is being updated on demand (i.e.  have an API function to allow users
to modify, which internally does copy-on-write if it is still shared).  I
don't know that kind of arrangement makes sense; it depends on how the
cloned ranges are used but I haven't looked at the callers.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 08/13] map/take range to parent
  2010-07-11  6:18 ` [PATCH v3 08/13] map/take range to parent Bo Yang
@ 2010-07-12 21:52   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 21:52 UTC (permalink / raw)
  To: Bo Yang; +Cc: git, gitster

Bo Yang <struggleyb.nku@gmail.com> writes:

> The interesting line range will be passed to all of its parents.

This needs too much context to understand what you are saying, doesn't it?
What is "it" (presumably "a commit" but that is merely a guess from the
use of word "parents" and in git the only thing that has parents are
commits)?  In what context is this "passing" happen (presumably while
digging the history with specific line ranges in mind, but the Subject
line does not even give any useful hint)?

> For non-merge commit, we just map_range the ranges. Generally, the
> algorithm do:

s/do//;

>
> 1. Run a diffcore_std to find out the pairs;

s/Run/Runs/; ( won't do any more language nitpicks from now on in this
message.

> 2. Run a xdi_diff_hunks on each interesting file pair;
> 3. The map_range_cb callback will be invoked for each diff hunk,
>    and in the function map_lines we will calculate the pre-image
>    range from the post-image range.
>
> For merge commit, another take_range pass will be done except the
> above normal map_range work. It is used to subtract each same
> part lines of the current range out. After this pass, if there is
> any line range left, this means the merge must be a non-trivial
> merge. This is how the non-trivial merge detect work.
>
> The algorithm that map lines from post-image to pre-image is in
> the function map_lines. Generally, we use simple line number
> calculation method to do the map.
>
> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
> ---
>  line.c     |  453 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  revision.h |    5 +-
>  2 files changed, 457 insertions(+), 1 deletions(-)
>
> diff --git a/line.c b/line.c
> index ba96d7d..e1af2f6 100644
> --- a/line.c
> +++ b/line.c
> @@ -498,3 +498,456 @@ void setup_line(struct rev_info *rev, struct diff_line_range *r)
>  	diff_tree_release_paths(opt);
>  }
>  
> +struct take_range_cb_data {
> +	struct diff_line_range *interesting;
> +	struct diff_line_range *range;
> +	long plno, tlno;

What does "p" and "t" stand for?

> +	int diff;
> +};
> +
> +#define SCALE_FACTOR 4
> +void map_lines(long p_start, long p_end, long t_start, long t_end,
> +		long start, long end, long *o_start, long *o_end)

What does "p", "t", "o" and "" stand for?

> +{
> +	/*
> +	 * xdiff always assign 'same' with the last line number of two
> +	 * same ranges. When some lines are added from scratch,
> +	 * p_start = same + 1;
> +	 * p_end = same;
> +	 * so, we can understand the following condition.

Not me, sorry.

> +	 */
> +	if (p_start > p_end) {
> +		*o_start = *o_end = 0;
> +		return;
> +	}
> +	/* A deletion */
> +	if (t_start > t_end) {
> +		*o_start = p_start;
> +		*o_end = p_end;
> +		return;
> +	}
> +
> +	if (start == t_start && end == t_end) {
> +		*o_start = p_start;
> +		*o_end = p_end;
> +		return;
> +	}
> +
> +	/*
> +	 * A new strategy for lines mapping:

New compared to which old one?  I think s/strategy/heuristic/ might be a
better wording here.

> +	 * When the pre-image is no more than 1/4 of the post-image, it
> +	 * take no sense to say what the real pre-image is, wan can just
> +	 * treat all the post-image lines as added from scratch.

What kind of weird result would you get if you do not use this heuristic?
I am guessing the rationale is something like "If we consider too small
contributions from the preimage in a hunk, we end up splitting the hunk
into very many smaller pieces and will keep digging into older commits.
The result will show commits that add a line here and another line there
without a real impact to the resulting logic.  It is easier to read the
logic if we stop traversal at a commit that adds large material relative
to the original to a hunk." but it is unclear.

> +	 */
> +	if (SCALE_FACTOR * (p_end - p_start + 1) < (t_end - t_start + 1)) {
> +		*o_start = *o_end = 0;
> +		return;
> +	}
> ...
> +static void map_range(struct take_range_cb_data *data, int same,
> +		long p_start, long p_end, long t_start, long t_end)
> +{
> ...
> +		if (added) {
> +			/* Code movement/copy detect here */
This probably will not even compile.
> +		} else {
> ...

> +static void take_range_cb(void *data, long same, long p_next, long t_next)
> +{
> +	struct take_range_cb_data *d = data;
> +	long p_start = d->plno + 1, t_start = d->tlno + 1;
> +	long p_end = p_start + same - t_start, t_end = same;
> +
> +	/* If one file is added from scratch, this may confuse take_range */
> +	if (t_end >= t_start)
> +		take_range(d, p_start, p_end, t_start, t_end);

Is this a "FIXME" comment, or is it justifying the if statement?

> +static void assign_range_to_parent(struct rev_info *rev, struct commit *c,
> +		struct commit *p, struct diff_line_range *r,
> +		struct diff_options *opt, int map)
> +{
> ...
> +	xpp.flags = XDF_NEED_MINIMAL;

Do you need minimal?

> +		/*
> +		 * The remain part is the same part.
> +		 * Instead of calculating the true line number of the two files,
> +		 * use the biggest integer.
> +		 */
> +		if (map)
> +			map_range(&cb, 1, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF);
> +		else
> +			take_range(&cb, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF);
> +	}

Do you mean INT_MAX or something (I didn't bother to check the type you
are using)?

As there is no good explanation of parameters to map/take_range(), it is
hard to see if an arbitrary large number like this is correct, and if it
were correct, then perhaps it means the *_end paramters are not useful to
these functions?  I dunno.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 09/13] print the line log
  2010-07-11  6:18 ` [PATCH v3 09/13] print the line log Bo Yang
@ 2010-07-12 21:52   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 21:52 UTC (permalink / raw)
  To: Bo Yang; +Cc: git, gitster

Bo Yang <struggleyb.nku@gmail.com> writes:

> +static void flush_lines(struct diff_options *opt, const char **ptr, const char *end,
> +		int slno, int elno, int *lno, const char *color, const char heading)
> +{
> +	const char *p = *ptr;
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *reset;
> +
> +	if (strcmp(color, ""))

Did you mean "if (*color)"?

> +	/*
> +	 * todo: when --graph landed on master, this should be changed
> +	 * a little.
> +	 */

Hmm, don't we already have it?

> +static void diff_flush_filepair(struct rev_info *rev, struct diff_line_range *range)
> +{
> +...
> +}

Looks suspiciously familiar...  Cut-and-paste without refactoring?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 10/13] map/print ranges along traversing the history topologically
  2010-07-11  6:18 ` [PATCH v3 10/13] map/print ranges along traversing the history topologically Bo Yang
@ 2010-07-12 21:52   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-12 21:52 UTC (permalink / raw)
  To: Bo Yang; +Cc: git, gitster

Bo Yang <struggleyb.nku@gmail.com> writes:

> Since ranges may change in different branches, we should
> make sure we do not pass range to parent until all the
> ranges get 'combined' at the commit which is a split commit.
> So, topological traversing is necessary.

Without explaining what a "split commit" is, "So, ... is necessary"
doesn't really justify this.

>
> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
> ---
>  builtin/log.c |    5 ++++-
>  line.c        |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  line.h        |    2 ++
>  3 files changed, 60 insertions(+), 1 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 1e90b03..47b386d 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -615,7 +615,10 @@ int cmd_log(int argc, const char **argv, const char *prefix)
>  	memset(&opt, 0, sizeof(opt));
>  	opt.def = "HEAD";
>  	cmd_log_init(argc, argv, prefix, &rev, &opt);
> -	return cmd_log_walk(&rev);
> +	if (rev.line)
> +		return cmd_line_log_walk(&rev);
> +	else
> +		return cmd_log_walk(&rev);
>  }
>  
>  /* format-patch */
> diff --git a/line.c b/line.c
> index 8813376..5c8f77a 100644
> --- a/line.c
> +++ b/line.c
> @@ -1196,3 +1196,57 @@ static void line_log_flush(struct rev_info *rev, struct commit *c)
>  	}
>  }
>  
> +int cmd_line_log_walk(struct rev_info *rev)
> +{
> +	struct commit *commit;
> +	struct commit_list *list = NULL;
> +	struct diff_line_range *r = NULL;
> +
> +	if (prepare_revision_walk(rev))
> +		die("revision walk prepare failed");
> +
> +	list = rev->commits;
> +	if (list) {
> +		list->item->object.flags |= RANGE_UPDATE;
> +		list = list->next;
> +	}
> +	/* Clear the flags */
> +	while (list) {
> +		list->item->object.flags &= 0x0;

Just assign "= 0" instead, please.

But more importantly, why is it safe to clear all the flags?  Even if your
traversal is limited (i.e. prepare_revision_walk() already walked
everything), at least wouldn't you need to keep flags like SYMMETRIC_LEFT,
UNINTERESTING, BOUNDARY etc. depending on what the end user asked from the
command line?

If you ask prepare_revison_walk()->limit_list() callchain to run a
limited and topo-sorted traversal, wouldn't the resulting revs->list
have commits in the order you need already?  Why do you need to have a
custom walker here?

I am not suggesting to roll this logic into prepare_revision_walk() and
limit_list(); there are a lot more than "we need topological order" going
on in this function, and the log message needs to explain what they are.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 11/13] add --always-print option
  2010-07-11  6:18 ` [PATCH v3 11/13] add --always-print option Bo Yang
@ 2010-07-13 20:35   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-13 20:35 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> Subject: Re: [PATCH v3 11/13] add --always-print option

Please be always careful about your Subject: line.  Imagine the above
appears in the shortlog output for v1.7.8..v1.7.9 among 500 other commits;
do you expect people to be able to tell what this commit is about?

> Always print the interesting ranges even if the current
> commit does not change any line of it.

Please explain why it is a good thing to do so.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-11  6:18 ` [PATCH v3 05/13] parse the -L options Bo Yang
@ 2010-07-13 23:03   ` Junio C Hamano
  2010-07-18 14:49     ` Bo Yang
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-13 23:03 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> +static int log_line_range_callback(const struct option *option, const char *arg, int unset)
> +{
> +	struct line_opt_callback_data *data = option->value;
> +	struct diff_line_range *r = *data->range;
> +	struct parse_opt_ctx_t *ctx = data->ctx;

Need a blank line here for readability.

> +	if (!arg)
> +		return -1;
> ...
> @@ -75,6 +118,58 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
>  	 */
>  	if (argc == 2 && !strcmp(argv[1], "-h"))
>  		usage(builtin_log_usage);
> +
> +	parse_options_start(&ctx, argc, argv, prefix, PARSE_OPT_KEEP_DASHDASH |
> +			PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_STOP_AT_NON_OPTION);
> +	for (;;) {
> +		switch (parse_options_step(&ctx, options, log_opt_usage)) {
> +		case PARSE_OPT_HELP:
> +			exit(129);
> +		case PARSE_OPT_DONE:
> +			goto parse_done;
> +		case PARSE_OPT_NON_OPTION:
> +			path = parse_options_current(&ctx);
> +			pathspec = prefix_path(prefix, prefix ? strlen(prefix) : 0, path);
> +			range->spec = alloc_filespec(pathspec);
> +			free((void *)pathspec);
> +			if (range->nr == 0) {
> +				if(range->next) {
> +					die("Path %s need a -L <range> option\n"
> +					"If you want follow the history of the whole file "
> +					"whether to using 'git log' without -L or using "
> +					"'git log -L 1,$ <path>'", range->spec->path);
> +				} else {
> +					parse_options_next(&ctx, 1);
> +					continue;

This loop smells bad.

When "-L n,m" appears on the command line, log_line_range_callback() would
be called and would eat n,m (which is correct), but at that point you
would not just want to be prepared to accept a non-option ("path" in "-L
n,m path"), but actually would want to force the user to give a path, no?
IOW, isn't "git log -L n,m -U20" an error, unless "-U20" is a filename
that the user wants to track?

I somehow suspect that futzing with STOP_AT_NON_OPTION (done in the first
two patches in the series) to parse "-L n,m path" is a misguided design
attempt.  Shouldn't you be instead giving a support for option callback to
take more than one argument to do this?

> +				}
> +			}
> +			struct diff_line_range *r = xmalloc(sizeof(*r));

decl-after-statement, but at this point it may be moot as I am doubting
the higher-level design of this option parser now.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/13] add the basic data structure for line level  history
  2010-07-12 16:50     ` Junio C Hamano
@ 2010-07-18 14:35       ` Bo Yang
  0 siblings, 0 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-18 14:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,
On Tue, Jul 13, 2010 at 12:50 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> diff --git a/diffcore.h b/diffcore.h
>> index 491bea0..13d8e93 100644
>> --- a/diffcore.h
>> +++ b/diffcore.h
>> @@ -23,6 +23,7 @@
>>  #define MINIMUM_BREAK_SIZE     400 /* do not break a file smaller than this */
>>
>>  struct userdiff_driver;
>> +struct diff_options;
>
> Why???

The line:
extern void diffcore_rename(struct diff_options *);
later in this file use it. So, forward declare it.

>>
...
>>  #define DECORATE_SHORT_REFS  1
>>  #define DECORATE_FULL_REFS   2
>> @@ -68,7 +69,8 @@ struct rev_info {
>>                       cherry_pick:1,
>>                       bisect:1,
>>                       ancestry_path:1,
>> -                     first_parent_only:1;
>> +                     first_parent_only:1,
>> +                     line:1;
>
> Is this really a traversal flag that affects how the history is walked?

Hmm, a 'line' means topologically traverse at least. So, I added it
here. And I can't find a better place to put it. :)

I have changed my code according your comments, thanks a lot!

-- 
Regards!
Bo
----------------------------
My blog: http://blog.morebits.org
Why Git: http://www.whygitisbetterthanx.com/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 13/13] some document update
  2010-07-12 18:45       ` Junio C Hamano
@ 2010-07-18 14:37         ` Bo Yang
  0 siblings, 0 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-18 14:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git

Hi Junio,

On Tue, Jul 13, 2010 at 2:45 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Bo Yang <struggleyb.nku@gmail.com> writes:
>
>> ...
>> Thanks a lot for your advice, I have revised the patch.
>>
>> -----------------------------------------------------------
>>>From 88ed88a53d83c2d46fa4917008efadc531ba1068 Mon Sep 17 00:00:00 2001
>> From: Bo Yang <struggleyb.nku@gmail.com>
>> Date: Sat, 26 Jun 2010 01:35:48 -0700
>> Subject: [PATCH v3 revised 13/13] some document update
>>
>> Both 'git log' and 'git blame' support the same
>> format of '-L' arguments, so put the argument
>> format description into a new file.
>>
>> Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
>> ---
>>  Documentation/blame-options.txt     |   19 +------------------
>>  Documentation/git-log.txt           |   13 +++++++++++++
>>  Documentation/line-range-format.txt |   18 ++++++++++++++++++
>>  3 files changed, 32 insertions(+), 18 deletions(-)
>
> Every time you do this, you seem to send a corrupt patch, like...
>
>> @@ -19,6 +20,9 @@ command to control what is shown and how, and
>> options applicable to
>> ...
>> diff --git a/Documentation/line-range-format.txt
>> b/Documentation/line-range-format.txt
>
> Please don't.
>

I am sorry, I promise I won't do the same thing from now on. :)

-- 
Regards!
Bo
----------------------------
My blog: http://blog.morebits.org
Why Git: http://www.whygitisbetterthanx.com/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-13 23:03   ` Junio C Hamano
@ 2010-07-18 14:49     ` Bo Yang
  2010-07-19 18:44       ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-18 14:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Wed, Jul 14, 2010 at 7:03 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>       if (argc == 2 && !strcmp(argv[1], "-h"))
>>               usage(builtin_log_usage);
>> +
>> +     parse_options_start(&ctx, argc, argv, prefix, PARSE_OPT_KEEP_DASHDASH |
>> +                     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_STOP_AT_NON_OPTION);
>> +     for (;;) {
>> +             switch (parse_options_step(&ctx, options, log_opt_usage)) {
>> +             case PARSE_OPT_HELP:
>> +                     exit(129);
>> +             case PARSE_OPT_DONE:
>> +                     goto parse_done;
>> +             case PARSE_OPT_NON_OPTION:
>> +                     path = parse_options_current(&ctx);
>> +                     pathspec = prefix_path(prefix, prefix ? strlen(prefix) : 0, path);
>> +                     range->spec = alloc_filespec(pathspec);
>> +                     free((void *)pathspec);
>> +                     if (range->nr == 0) {
>> +                             if(range->next) {
>> +                                     die("Path %s need a -L <range> option\n"
>> +                                     "If you want follow the history of the whole file "
>> +                                     "whether to using 'git log' without -L or using "
>> +                                     "'git log -L 1,$ <path>'", range->spec->path);
>> +                             } else {
>> +                                     parse_options_next(&ctx, 1);
>> +                                     continue;
>
> This loop smells bad.
>
> When "-L n,m" appears on the command line, log_line_range_callback() would
> be called and would eat n,m (which is correct), but at that point you
> would not just want to be prepared to accept a non-option ("path" in "-L
> n,m path"), but actually would want to force the user to give a path, no?
> IOW, isn't "git log -L n,m -U20" an error, unless "-U20" is a filename
> that the user wants to track?
>
> I somehow suspect that futzing with STOP_AT_NON_OPTION (done in the first
> two patches in the series) to parse "-L n,m path" is a misguided design
> attempt.  Shouldn't you be instead giving a support for option callback to
> take more than one argument to do this?
>
>> +                             }
>> +                     }
>> +                     struct diff_line_range *r = xmalloc(sizeof(*r));
>
> decl-after-statement, but at this point it may be moot as I am doubting
> the higher-level design of this option parser now.
>

The point is that, the syntax we support is:

-L n1,m1 -L n2,m2 pathspec1  -L n3,m3 pathspec2

There can be multiple -L options for one pathspec, so I think the
STOP_AT_NON_OPTION way is better.

-- 
Regards!
Bo
----------------------------
My blog: http://blog.morebits.org
Why Git: http://www.whygitisbetterthanx.com/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-18 14:49     ` Bo Yang
@ 2010-07-19 18:44       ` Junio C Hamano
  2010-07-19 22:48         ` Thomas Rast
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-19 18:44 UTC (permalink / raw)
  To: Bo Yang; +Cc: git

Bo Yang <struggleyb.nku@gmail.com> writes:

> The point is that, the syntax we support is:
>
> -L n1,m1 -L n2,m2 pathspec1  -L n3,m3 pathspec2

That itself smells like a bad design, unless done very carefully and
documented clearly.

For example, what does this mean?

    $ git log -L n1,m1 master

As -L wants to see at least one path before running out of the command
line argument, we take "master" as the filename.  Hence, the command line
does not have any revision specified and defaults to HEAD.  I.e. "traverse
from the current HEAD and show only commits that touch the line region
n1,m1 that appears in the version of path 'master' in HEAD".

What about this?

    $ git log -L n1,m1 master..next one two

Clearly the user wants to traverse revision range between master and next.
The -L option wants to see one path so slurps "one".  The traversal
however is further limited by a pathspec "two".  Should the use of "one"
as an argument to -L automatically add it also to the pathspec?  I.e.
"traverse from 'next' down to 'master', checking commits that touch either
path 'one' or 'two', and show only commits that touch the line region
n1,m1 that appears in the version of path 'one' in 'next'"?

Or perhaps master..next is the name of the file the user is interested in?
I.e. "Starting from branches 'one' and 'two', show only commits that touch
the line region n1,m1 that appears in file 'master..next'"?  But that is a
broken interpretation, as "-L range path" cannto possibly make sense if
you have more than one starting point, and this interpretation gives you
two (i.e. 'one' and 'two').

How would you disambiguate -Lpaths, revisions and pathspecs?  How does -Lpath
interact with pathspecs?

What if the name of the file the user wants to annotate begins with a "-"?
For pathspec limiter, the users have already learned that "--" can be used
to say "everything that comes after this token is pathspec", but that
knowledge cannot be reused with this syntax.

It almost feels as if you want to have something more like

    -L <begin>,<end>[,<path>]

where <path> is mandatory for the first use of -L (i.e. missing ,<path>
means the same path from the previous -L that has one) to make it clear
that this is completely different from the normal pathspec.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-19 18:44       ` Junio C Hamano
@ 2010-07-19 22:48         ` Thomas Rast
  2010-07-19 23:53           ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Rast @ 2010-07-19 22:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bo Yang, git

Junio C Hamano wrote:
> It almost feels as if you want to have something more like
> 
>     -L <begin>,<end>[,<path>]
> 
> where <path> is mandatory for the first use of -L (i.e. missing ,<path>
> means the same path from the previous -L that has one) to make it clear
> that this is completely different from the normal pathspec.

I think that would just needlessly break the analogy to git-blame.[0]
With the current code,

  git blame -L 2,3 <path>
  git log -L 2,3 <path>

work the same.  Multiple -L options could be retrofitted to git-blame,
making

  git {blame,log} -L 2,3 -L 4,5 <path>

work as expected.

As long as you only give a single path, even blame disambiguates in
favour of the filename:

  git blame -L 2,3 master # wants a file 'master'

It only starts breaking down as soon as you put the -L further away
from the filename:

  git blame -L 2,3 master master # looks for the file master:master
  git blame master -L 2,3 master # ditto
  git log -L 2,3 master master   # errors out since the second file has no -L [1]
  git log master -L 2,3 master   # looks for the file master:master

And as you have noted, the -- can also cause weird effects.  Currently

  git log -L 2,3 --branches      # error
  git log -L 2,3 ./--branches    # ok
  git log -L 2,3 -- --branches   # error

The last one is unfortunate, but fixing it while allowing a natural
positioning of the -L would require parsing the -L after a --.  That's
just as inconsistent, only in a different way.


[0] It also requires special support from shell completion.

[1] I agree with erroring out but I think the message wrongly
    recommends path filtering (since that doesn't work well with
    renames):

      fatal: Path master need a -L <range> option
      If you want follow the history of the whole file whether to using 'git log' without -L or using 'git log -L 1,$ <path>'

    Bo, can you please change it to e.g.:

      fatal: Path 'master' needs a -L<range> option
      If you want to follow the history of the whole file, use 'git log -L 1,$ <path>'

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-19 22:48         ` Thomas Rast
@ 2010-07-19 23:53           ` Junio C Hamano
  2010-07-20  7:51             ` Thomas Rast
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-19 23:53 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Bo Yang, git

Thomas Rast <trast@student.ethz.ch> writes:

> Junio C Hamano wrote:
>> It almost feels as if you want to have something more like
>> 
>>     -L <begin>,<end>[,<path>]
>> 
>> where <path> is mandatory for the first use of -L (i.e. missing ,<path>
>> means the same path from the previous -L that has one) to make it clear
>> that this is completely different from the normal pathspec.
>
> I think that would just needlessly break the analogy to git-blame.[0]
> With the current code,
>
>   git blame -L 2,3 <path>
>   git log -L 2,3 <path>
>
> work the same.  Multiple -L options could be retrofitted to git-blame,
> making
>
>   git {blame,log} -L 2,3 -L 4,5 <path>
>
> work as expected.

I like the general direction, but I am not sure how far we want to take
that analogue with blame, though.

For example, Bo's "log -L" thing also works on more than one path, no?  I
suspect it might be be reasonable to teach "blame" to annotate more than
one path (with ranges) the same way.  There is no technical limitation in
the underlying scoreboard mechanism to prevent it from happening.

Very similar to "blame" but quite differently from any normal "log"
operation, "log -L" allows only one positive commit (starting point).

Perhaps these argue that this new feature shouldn't be a new option of
"log" at the UI level after all; rather, I wonder if this should be better
presented as a new mode of "blame" (e.g. "blame --log", unlike showing
"cvs annotate" like output, shows history like "git log" does).

I _think_ I've already said the above at least a few times, though.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-19 23:53           ` Junio C Hamano
@ 2010-07-20  7:51             ` Thomas Rast
  2010-07-20 15:45               ` Junio C Hamano
  2010-07-20 15:46               ` Bo Yang
  0 siblings, 2 replies; 39+ messages in thread
From: Thomas Rast @ 2010-07-20  7:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bo Yang, git

Junio C Hamano wrote:
> Thomas Rast <trast@student.ethz.ch> writes:
> > I think that would just needlessly break the analogy to git-blame.[0]
> > With the current code,
> >
> >   git blame -L 2,3 <path>
> >   git log -L 2,3 <path>
> >
> > work the same.
> 
> I like the general direction, but I am not sure how far we want to take
> that analogue with blame, though.
> 
> For example, Bo's "log -L" thing also works on more than one path, no?  I
> suspect it might be be reasonable to teach "blame" to annotate more than
> one path (with ranges) the same way.  There is no technical limitation in
> the underlying scoreboard mechanism to prevent it from happening.
> 
> Very similar to "blame" but quite differently from any normal "log"
> operation, "log -L" allows only one positive commit (starting point).

AFAICT this is not a conceptual requirement, only one of the current
implementation/option parsing.  [Bo, how hard would it be to remove
this requirement?]

'git log --follow', if it were ever unbroken, would have much the same
problem that while technically allowing more than one starting point,
using that is only possible if the starting filename happens to agree
on all of them.

> Perhaps these argue that this new feature shouldn't be a new option of
> "log" at the UI level after all; rather, I wonder if this should be better
> presented as a new mode of "blame" (e.g. "blame --log", unlike showing
> "cvs annotate" like output, shows history like "git log" does).

Then you suddenly have a mode of git-blame that takes the normal
options, and another that takes all the git-log arguments that pertain
to commit formatting.

(Ok, you can't reasonably give it any of the diff-formatting options,
but e.g. --graph and --pretty are supposed to work.)

What's more, we (well, I do anyway) absolutely want gitk to display
the same output eventually.  So that would also suddenly give you an
example where gitk isn't so equivalent to git-log-in-a-fancy-window
any more.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-20  7:51             ` Thomas Rast
@ 2010-07-20 15:45               ` Junio C Hamano
  2010-07-22  9:06                 ` Thomas Rast
  2010-07-20 15:46               ` Bo Yang
  1 sibling, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2010-07-20 15:45 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Bo Yang, git

Thomas Rast <trast@student.ethz.ch> writes:

> Junio C Hamano wrote:
> ...
>> I like the general direction, but I am not sure how far we want to take
>> that analogue with blame, though.
>> 
>> For example, Bo's "log -L" thing also works on more than one path, no?  I
>> suspect it might be be reasonable to teach "blame" to annotate more than
>> one path (with ranges) the same way.  There is no technical limitation in
>> the underlying scoreboard mechanism to prevent it from happening.
>> 
>> Very similar to "blame" but quite differently from any normal "log"
>> operation, "log -L" allows only one positive commit (starting point).
>
> AFAICT this is not a conceptual requirement, only one of the current
> implementation/option parsing.

I think "log -L" conceptually is very different from the normal "log with
pathspecs".

The "log with pathspecs" is about "I have these histories that lead to
these commits; show commits that touch paths that match them".  On the
other hand, "log -L" asks "I have these lines in these files; explain how
they came about in the history leading to...".  Now, the question is:
"leading to" WHAT?

Because you start from "I have _these lines_", you are fundamentally
discussing contents that appear as a set of line ranges _in a particular
version_, adjusting the range to match their older incarnations as
necessary as you dig deeper, no?

I would say that this is not just _only_ the implementation and option
parsing.  What is happening looks to me fundamentally very different.

Having said that, it is not entirely implausible to imagine that "log -L"
can further be enhanced to also dig into the future.  The request becomes
"I have these lines; explain how they came about, and how they went".

E.g. if you have a history of this shape where commit C is at the tip:

    ---o---A---x---B---o---C

you _could_ ask "log -p -L1,10,B:Makefile A..C" (I am not proposing this
syntax at all, by the way) to browse the history between A and C, looking
for commits that touch the region of Makefile that appeared as lines 1..10
in revision B.  Between B and C, some new lines might have been introduced
inside the range and you would dig in reverse enlarging the range to show
what have been added to it.

If you have that kind of "forward digging", then it would also not be
implausible to deal with a forked history like this:

                 y---z---D
                /
    ---o---A---x---B---o---C

and ask a similar "log -p -L1,10,C:Makefile A..C D" question.  You would
need to dig from C to x, finding the corresponding range in x, and then
map that corresponding range further while traversing through y, z and to
D in reverse, just like you mapped and traversed from B through o to C in
reverse that explained a part of Makefile in revision B in the earlier
example.

Also the implementation will have to be very different if you really
wanted to do this properly.  I would imagine you would first build a
sparse merge-simplified graph that consists of commits that touch Makefile
(using "Makefile" as a pathspec) in the first pass, map the line ranges
backward and forward in the second pass, and output the result in the
final pass.  Presumably you could collapse the backward traversing half of
the second pass and the third one (which is what Bo's patchset already
does), but I don't think you can stream x-y-z-D going forward, as "log -L"
output needs to follow the usual new-to-old order as the other parts of
the history.

IOW, that "explain how they came and how they went" is also very different
from the normal "log", even if you choose to lift that restriction.

Also I am not entirely sure if it is easy to explain to the users.

If you instead choose to retain the "start from one commit", then the way
you ask the question becomes "I have these lines in these files at this
commit; explain how they came about in the history", which is exactly what
you ask "blame".  You are just using a different presentation from what
the default "blame" output gives.

>> Perhaps these argue that this new feature shouldn't be a new option of
>> "log" at the UI level after all; rather, I wonder if this should be better
>> presented as a new mode of "blame" (e.g. "blame --log", unlike showing
>> "cvs annotate" like output, shows history like "git log" does).
>
> Then you suddenly have a mode of git-blame that takes the normal
> options, and another that takes all the git-log arguments that pertain
> to commit formatting.

I would consider it similar to "blame --incremental"; you can choose a
very different output format driven by a single switch to change the mode
of operation.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-20  7:51             ` Thomas Rast
  2010-07-20 15:45               ` Junio C Hamano
@ 2010-07-20 15:46               ` Bo Yang
  2010-07-20 15:47                 ` Bo Yang
  1 sibling, 1 reply; 39+ messages in thread
From: Bo Yang @ 2010-07-20 15:46 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, git

On Tue, Jul 20, 2010 at 3:51 PM, Thomas Rast <trast@student.ethz.ch> wrote:
> Junio C Hamano wrote:
>> Thomas Rast <trast@student.ethz.ch> writes:
>> > I think that would just needlessly break the analogy to git-blame.[0]
>> > With the current code,
>> >
>> >   git blame -L 2,3 <path>
>> >   git log -L 2,3 <path>
>> >
>> > work the same.
>>
>> I like the general direction, but I am not sure how far we want to take
>> that analogue with blame, though.
>>
>> For example, Bo's "log -L" thing also works on more than one path, no?  I
>> suspect it might be be reasonable to teach "blame" to annotate more than
>> one path (with ranges) the same way.  There is no technical limitation in
>> the underlying scoreboard mechanism to prevent it from happening.
>>
>> Very similar to "blame" but quite differently from any normal "log"
>> operation, "log -L" allows only one positive commit (starting point).
>
> AFAICT this is not a conceptual requirement, only one of the current
> implementation/option parsing.  [Bo, how hard would it be to remove
> this requirement?]
>
> 'git log --follow', if it were ever unbroken, would have much the same
> problem that while technically allowing more than one starting point,
> using that is only possible if the starting filename happens to agree
> on all of them.

I think Junio here did not ask for 'git log -L' to support multiple
starting commits. [1]
Considering the arguments you give below, I am in support of put it in
'git log -L' as what we do, now.

-- 
Regards!
Bo
----------------------------
My blog: http://blog.morebits.org
Why Git: http://www.whygitisbetterthanx.com/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-20 15:46               ` Bo Yang
@ 2010-07-20 15:47                 ` Bo Yang
  0 siblings, 0 replies; 39+ messages in thread
From: Bo Yang @ 2010-07-20 15:47 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, git

On Tue, Jul 20, 2010 at 11:46 PM, Bo Yang <struggleyb.nku@gmail.com> wrote:
> On Tue, Jul 20, 2010 at 3:51 PM, Thomas Rast <trast@student.ethz.ch> wrote:
>> Junio C Hamano wrote:
>>> Thomas Rast <trast@student.ethz.ch> writes:
>>> > I think that would just needlessly break the analogy to git-blame.[0]
>>> > With the current code,
>>> >
>>> >   git blame -L 2,3 <path>
>>> >   git log -L 2,3 <path>
>>> >
>>> > work the same.
>>>
>>> I like the general direction, but I am not sure how far we want to take
>>> that analogue with blame, though.
>>>
>>> For example, Bo's "log -L" thing also works on more than one path, no?  I
>>> suspect it might be be reasonable to teach "blame" to annotate more than
>>> one path (with ranges) the same way.  There is no technical limitation in
>>> the underlying scoreboard mechanism to prevent it from happening.
>>>
>>> Very similar to "blame" but quite differently from any normal "log"
>>> operation, "log -L" allows only one positive commit (starting point).
>>
>> AFAICT this is not a conceptual requirement, only one of the current
>> implementation/option parsing.  [Bo, how hard would it be to remove
>> this requirement?]
>>
>> 'git log --follow', if it were ever unbroken, would have much the same
>> problem that while technically allowing more than one starting point,
>> using that is only possible if the starting filename happens to agree
>> on all of them.
>
> I think Junio here did not ask for 'git log -L' to support multiple
> starting commits. [1]

Forget [1]: gmane.comp.version-control.git:142900

-- 
Regards!
Bo
----------------------------
My blog: http://blog.morebits.org
Why Git: http://www.whygitisbetterthanx.com/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-20 15:45               ` Junio C Hamano
@ 2010-07-22  9:06                 ` Thomas Rast
  2010-07-23  2:08                   ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Rast @ 2010-07-22  9:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Bo Yang, git

Junio C Hamano wrote:
[moved up this chunk from the end]
> If you instead choose to retain the "start from one commit", then the way
> you ask the question becomes "I have these lines in these files at this
> commit; explain how they came about in the history", which is exactly what
> you ask "blame".  You are just using a different presentation from what
> the default "blame" output gives.
[...]
> > Then you suddenly have a mode of git-blame that takes the normal
> > options, and another that takes all the git-log arguments that pertain
> > to commit formatting.
> 
> I would consider it similar to "blame --incremental"; you can choose a
> very different output format driven by a single switch to change the mode
> of operation.

For me (and Bo seemed to agree off-list) the main interest is in the
*history*.  Yes, it is an important feature that the diffs are also
limited to the ranges you are scanning, but that is the secondary
feature.

[If you think that is unfair, consider: if you remove the history
filtering or view, the line-log becomes log or blame, respectively.
If you just remove the diff display, it's still a new mode of history
simplification.]

In the rest of the mail I'll make some more points why it is more
log-like than blame-like, but this is the main point.


The comparison to "blame --incremental" is not valid because blame
always shows exactly one commit per line.  That's not history in my
book.  It does not even compute the history that line-log computes (it
gives up on tracking a line once it has assigned blame for it) so it
is not a new output format of existing data either.

> I think "log -L" conceptually is very different from the normal "log with
> pathspecs".
> 
> The "log with pathspecs" is about "I have these histories that lead to
> these commits; show commits that touch paths that match them".  On the
> other hand, "log -L" asks "I have these lines in these files; explain how
> they came about in the history leading to...".  Now, the question is:
> "leading to" WHAT?

If pathspec limiting wasn't broken in the sense that it does not
support --follow (properly, anyway), they would be exactly equivalent.
With --follow, you would mean "I have these files, please explain how
they came about", and you would expect git to track renames and
copies.

That the current "I want to see some history, but only when it touched
this file" works is a fortunate side effect of the fact that people
rarely rename files (and that if they do, there are few collisions so
simply adding the old name usually works).

In particular, proper log --follow would have a problem exactly
analogous to:

> Because you start from "I have _these lines_", you are fundamentally
> discussing contents that appear as a set of line ranges _in a particular
> version_, adjusting the range to match their older incarnations as
> necessary as you dig deeper, no?
[...]
> E.g. if you have a history of this shape where commit C is at the tip:
> 
>     ---o---A---x---B---o---C
> 
> you _could_ ask "log -p -L1,10,B:Makefile A..C" (I am not proposing this
> syntax at all, by the way) to browse the history between A and C, looking
> for commits that touch the region of Makefile that appeared as lines 1..10
> in revision B.  Between B and C, some new lines might have been introduced
> inside the range and you would dig in reverse enlarging the range to show
> what have been added to it.

[I think this would be entirely possible to do for the line-log case,
though it would be too much effort for GSoC since it is also a
hard-to-explain and probably rarely-used case.]

Similarly, the plans for simplification that I have discussed with Bo
are roughly:

* commits that take all ranges from one of their parents are simplified
  away (even if a merge)

* commits that take each range fully from a single parent should stay
  but not show a diff

* ranges that are "split" between parents should show some format of
  diff

Compare this to pathspec filtering:

* commits that are TREESAME to one of their parents are simplified
  away

* commits that take each hunk literally from a single parent do not
  show a diff with --cc

* commits that have hunks "mixed" from the parents show a diff

[The latter was first and served a guideline, of course.]

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 05/13] parse the -L options
  2010-07-22  9:06                 ` Thomas Rast
@ 2010-07-23  2:08                   ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2010-07-23  2:08 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Bo Yang, git

Thomas Rast <trast@student.ethz.ch> writes:

>> I would consider it similar to "blame --incremental"; you can choose a
>> very different output format driven by a single switch to change the mode
>> of operation.
>
> For me (and Bo seemed to agree off-list) the main interest is in the
> *history*.  Yes, it is an important feature that the diffs are also
> limited to the ranges you are scanning, but that is the secondary
> feature.
>
> [If you think that is unfair, consider:...

We are not saying very different things, I think.

I guess I shouldn't have mentined "blame --incremental"; I was referring
to a quick-hack experiment we did to show an early output when we pass
blame to parent.  You can imagine that inside pass_blame_to_parent() we
can detect if the lines around the range we are passing the blame to
parent were touched, we could show pretty-printed commit log and relevant
diff-tree output at that point, which would give you pretty much the same
history we are talking about.

> If pathspec limiting wasn't broken in the sense that it does not
> support --follow (properly, anyway), they would be exactly equivalent...

Well, --follow is from "content-id" camp, while pathspec is a revision
limiter that follows quite different philosophy, so they do not mix very
well.  "log --follow -- $path" is an unfortunate cross between these two
philosophies and spelled that way only because that was the easiest way to
cheat in the command line parser.

As you alluded to, "log --follow HEAD -- $path" would mean exactly the
same as "log -L<range>,HEAD:$path HEAD" in the mock syntax I gave you
earlier in the thread, but that does not change the fact that it is
conceptually very different from the normal "git log".

>> E.g. if you have a history of this shape where commit C is at the tip:
>> 
>>     ---o---A---x---B---o---C
>> 
>> you _could_ ask "log -p -L1,10,B:Makefile A..C" (I am not proposing this
>> syntax at all, by the way) to browse the history between A and C, looking
>> for commits that touch the region of Makefile that appeared as lines 1..10
>> in revision B.  Between B and C, some new lines might have been introduced
>> inside the range and you would dig in reverse enlarging the range to show
>> what have been added to it.
>
> [I think this would be entirely possible to do for the line-log case,

I am sure it is "entirely possible", even though it might be quite
expensive (both development wise and also the end result).

A funny thing is, I do not think it is so "hard to explain and rarely
used" case.  Imagine that you have this history:

                 y---z---D maint
                /
    ---o---A---x---B---o---C master

Wouldn't it be nice if you can ask this question?

    I have these lines in frotz.c at 'master' version, and I recently made
    fixes that I need to backport to 'maint'.  How does the corresponding
    region look like in 'maint', and what changes that may be necessary
    for and interfere with my backport are there since these two branches
    forked?

In the ugly mock syntax, you would write it like this:

    log -p -L<range>,master:frotz.c master...maint

In this example, you have two positive revisions (notice three-dots for
symmetric difference), but you can easily imagine that this will work well
when you have more than two branches you are interested in.

To produce a useful output, you would need to dig from C down to x while
listing the commit that touch the specified range and adjusting it, then
dig the other way from x through y, z and D, again mapping the corresponding
range.

I am not saying that what -L does is a bad thing, though.  All I am saying
is that it is conceptually very different, and trying to coax it into
existing command line parsing rules of "git log" might not work well.

You could use a few rules to allow something like this which might be more
friendly to the eyes that are used to looking at the input to
setup_revisions() parser:

    log -p -L<range1> <path1> -L<range2> <revs>...

The short-hand rules may look something like this:

    - -L<range> with missing <path> is the same as -L<range>,<path> for
       the last <path> we have seen used with -L (it would be cleaner to
       use a single string given to -L i.e. "-L<range>,<path>" instead of
       a rule "-L takes one or two parameters" that is totally odd-ball
       from the point of view of normal POSIX command line rules, though);

    - If there is no <path> anywhere in the first use of -L, we take the
      first pathspec element as if it were <path>.

    - -L<range>,<path> is the same as -L<range>,<rev>:<path> where <rev>
       is the first positive revision given on the command line.

No matter what syntax sugarcoating is done, conceptually when -L says
"these lines in this path", it has to talk about the range in one
revision.  -L<range>,<rev>:<path> instructs us: "you are to follow this
block of lines in that path in this particular revision---figure out how
that migrated while you traverse the history and show its evolution".
There is no room for the normal pathspec rules to take effect---as we are
operating in the "content-id" domain, we would ideally want to
automatically even do --find-copies-harder, just like --follow and blame
automatically runs rename detection (that is another reason I think the
traversal semantics is very different from the regular "log", in addition
to the need to traverse forward in addition to backward which I already
mentioned).  We may want to error out when we have any <pathspec> (after
taking the second short-hand rule above into account, that is).

In short, I don't really have a very deep preference between "log -L" and
"blame -L".

But as a very special case of allowing only one single positive rev and no
forward-walking of the history, what "log -L" gives us is much closer to
"blame -L".  That is where my suggestion came from.  If we were to
eventually have that "log -p -L<range>,master:frotz.c master...maint", not
the syntax but the capability, I am all for it to be part of the "log"
command.

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2010-07-23  2:08 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-11  6:18 [PATCH v3 01/13] parse-options: stop when encounter a non-option Bo Yang
2010-07-11  6:18 ` [PATCH v3 02/13] parse-options: add two helper functions Bo Yang
2010-07-12 16:45   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 03/13] add the basic data structure for line level history Bo Yang
2010-07-12 14:16   ` Bo Yang
2010-07-12 16:50     ` Junio C Hamano
2010-07-18 14:35       ` Bo Yang
2010-07-11  6:18 ` [PATCH v3 04/13] refactor parse_loc Bo Yang
2010-07-12 18:32   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 05/13] parse the -L options Bo Yang
2010-07-13 23:03   ` Junio C Hamano
2010-07-18 14:49     ` Bo Yang
2010-07-19 18:44       ` Junio C Hamano
2010-07-19 22:48         ` Thomas Rast
2010-07-19 23:53           ` Junio C Hamano
2010-07-20  7:51             ` Thomas Rast
2010-07-20 15:45               ` Junio C Hamano
2010-07-22  9:06                 ` Thomas Rast
2010-07-23  2:08                   ` Junio C Hamano
2010-07-20 15:46               ` Bo Yang
2010-07-20 15:47                 ` Bo Yang
2010-07-11  6:18 ` [PATCH v3 06/13] export three functions from diff.c Bo Yang
2010-07-11  6:18 ` [PATCH v3 07/13] add range clone functions Bo Yang
2010-07-12 21:52   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 08/13] map/take range to parent Bo Yang
2010-07-12 21:52   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 09/13] print the line log Bo Yang
2010-07-12 21:52   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 10/13] map/print ranges along traversing the history topologically Bo Yang
2010-07-12 21:52   ` Junio C Hamano
2010-07-11  6:18 ` [PATCH v3 11/13] add --always-print option Bo Yang
2010-07-13 20:35   ` Junio C Hamano
2010-07-11  6:19 ` [PATCH v3 12/13] add two test cases Bo Yang
2010-07-11  6:19 ` [PATCH v3 13/13] some document update Bo Yang
2010-07-11  8:27   ` Jakub Narebski
2010-07-12 14:12     ` Bo Yang
2010-07-12 18:45       ` Junio C Hamano
2010-07-18 14:37         ` Bo Yang
2010-07-12 16:45 ` [PATCH v3 01/13] parse-options: stop when encounter a non-option Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).