Git development
* Re: [PATCH 2/2] Use is_pseudo_dir_name everywhere
From: Alexander Potashev @ 2009-01-09 10:24 UTC
  To: Junio C Hamano; +Cc: Johannes Sixt, Git Mailing List
In-Reply-To: <7vy6xk280e.fsf@gitster.siamese.dyndns.org>

On 00:33 Fri 09 Jan     , Junio C Hamano wrote:
> Johannes Sixt <j.sixt@viscovery.net> writes:
> 
> > Johannes Sixt schrieb:
> >> Alexander Potashev schrieb:
> >>> -		if ((ent->d_name[0] == '.') &&
> >>> -		    (ent->d_name[1] == 0 ||
> >>> -		     ((ent->d_name[1] == '.') && (ent->d_name[2] == 0))))
> >>> +		if (is_pseudo_dir_name(ent->d_name))
> >> 
> >> Nit-pick: When I read the resulting code, then I will have to look up that
> >>   is_pseudo_dir_name() indeed only checks for "." and "..". But if it were
> >> named is_dot_or_dotdot(), then I would have to do that.
> >
> > ... then I would *not* have to do that, of course.
> 
> I think the unstated motivation of this choice of the name is to keep the
> door open to include lost+found and friends to the repertoire, and perhaps
> to have an isolated place for customization for non-POSIX platforms and
> for local conventions.  It is more like is_uninteresting_dirent_name().

I hadn't thought about supporting 'lost+found'. But a name like
is_uninteresting_dirent_name is indeed more flexible. I would prefer a
slightly shorter name, 'is_dummy_dirent_name'.

But if you're going to support 'lost+found's, remember that a Git
repository might have its own 'lost+found' directory. It's a bit crazy,
but it's possible:
	---
	 lost+found/file |    1 +
	 1 files changed, 1 insertions(+), 0 deletions(-)
	 create mode 100644 lost+found/file

	diff --git a/lost+found/file b/lost+found/file
	new file mode 100644
	index 0000000..190a180
	--- /dev/null
	+++ b/lost+found/file
	@@ -0,0 +1 @@
	+123
	-- 

At the very least, Git shouldn't allow cloning a repository that contains
a lost+found directory into a directory that already has a lost+found
(whether it is an ordinary directory created with 'mkdir' or one that
ext2 created).

We should probably forbid cloning into a directory with lost+found,
because a 'lost+found' may appear after pulling from somebody and the
user would have no way to resolve the conflict.

> 
> As long as this function is used only to detect and skip "uninteresting"
> dirent, I think that is not a bad direction.
> 
> On the other hand, I am a bit worried about is_empty_dir() abused outside
> its intended purpose to say "this directory does not have anything
> interesting".  E.g. "Oh, it's empty so we can nuke it":

I propose to rename it (if it's really necessary) to is_clean_dir, which
means "There's no old crap here, we can safely clone".

> 
> 	if (is_empty_dir(dir))
>         	rmdir(dir);
> 
> even though the current callers do not do something crazy like this (the
> usual order we do things is rmdir() and then check for errors).

I think it's rather early to send [PATCHES v2] (with updated function
names), so I will wait for your comments.
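
For readers following the naming discussion, here is a minimal sketch of the check being factored out. The name is_dot_or_dotdot() is the one Johannes Sixt suggests above; dir_has_only_dot_entries() is a hypothetical name used only for illustration, and neither function knows anything about 'lost+found'.

```c
#include <dirent.h>
#include <stddef.h>

/* Return 1 if name is exactly "." or "..", 0 otherwise. */
int is_dot_or_dotdot(const char *name)
{
	return name[0] == '.' &&
	       (name[1] == '\0' ||
		(name[1] == '.' && name[2] == '\0'));
}

/*
 * Hypothetical helper: return 1 if the directory holds nothing but
 * "." and "..", 0 if it holds anything else, -1 if it cannot be opened.
 */
int dir_has_only_dot_entries(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *ent;
	int only_dots = 1;

	if (!dir)
		return -1;
	while ((ent = readdir(dir)) != NULL) {
		if (!is_dot_or_dotdot(ent->d_name)) {
			only_dots = 0;
			break;
		}
	}
	closedir(dir);
	return only_dots;
}
```

With the check isolated like this, widening the set of "uninteresting" names later would only touch the first function.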

* Re: Funny: git -p submodule summary
From: Johannes Sixt @ 2009-01-09 10:36 UTC
  To: Jeff King; +Cc: Johannes Schindelin, git
In-Reply-To: <20090109101335.GA4346@coredump.intra.peff.net>

Jeff King schrieb:
> On Fri, Jan 09, 2009 at 11:09:08AM +0100, Johannes Sixt wrote:
> 
>>> Below is a patch that uses the three-process mechanism, and it fixes the
>>> problem. _But_ it is not satisfactory for inclusion, because it won't
>>> work on MINGW32. Since it is actually splitting git into two processes
>>> (one to monitor the pager and one to actually run git), it uses fork.
>> We have start_async()/finish_async() to replace a fork() of the sort that
>> we have here.
> 
> It looks like start_async is implemented using threads on Windows. Will
> that survive an execvp call? Because we don't know at this point whether
> we are going to actually run builtin code, or if we will exec an
> external.

Ah, no, it would not survive.

But there's a more serious problem why we cannot use start_async() in its
current form: It expects that there is a *function* that produces the
output; but here we don't have a function - output is produced by
*returning* (from setup_pager).

I'll test your other patch (that replaces the execvp in git.c by run_command).

-- Hannes

* Re: Funny: git -p submodule summary
From: Jeff King @ 2009-01-09 10:47 UTC
  To: Johannes Sixt; +Cc: Johannes Schindelin, git
In-Reply-To: <496728B9.7090200@viscovery.net>

On Fri, Jan 09, 2009 at 11:36:41AM +0100, Johannes Sixt wrote:

> Ah, no, it would not survive.
> 
> But there's a more serious problem why we cannot use start_async() in its
> current form: It expects that there is a *function* that produces the
> output; but here we don't have a function - output is produced by
> *returning* (from setup_pager).

Good point.

> I'll test your other patch (that replaces the execvp in git.c by run_command).

Note that it only covers the case of external commands. Hitting ctrl-c
to interrupt git will still cause funniness. For that we need to
intercept signals to call wait_for_pager(). But there is a slight snag:
we also intercept signals elsewhere (e.g., for tempfile cleanup). So we
need to start remembering the old signal handlers everywhere and
chaining to them.

-Peff
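
The handler chaining described above could look roughly like the following sketch. It is only an illustration under stated assumptions: every name here is hypothetical, the flag assignment stands in for a call such as wait_for_pager(), and tempfile_cleanup_handler() stands in for whatever handler another part of the program installed earlier.

```c
#include <signal.h>

/* Handler that was installed before ours; SIG_DFL if there was none. */
static void (*earlier_handler)(int) = SIG_DFL;

volatile sig_atomic_t pager_cleanup_ran;
volatile sig_atomic_t tempfile_cleanup_ran;

/* Stand-in for a handler that some other subsystem installed first. */
void tempfile_cleanup_handler(int signo)
{
	(void)signo;
	tempfile_cleanup_ran = 1;
}

/* Do our own cleanup, then chain to whoever was there before us. */
void pager_cleanup_handler(int signo)
{
	pager_cleanup_ran = 1;	/* stand-in for wait_for_pager() */
	if (earlier_handler != SIG_DFL && earlier_handler != SIG_IGN)
		earlier_handler(signo);
}

/* Install our handler and remember the old one; -1 on failure. */
int install_chained_handler(int signo)
{
	void (*old)(int) = signal(signo, pager_cleanup_handler);

	if (old == SIG_ERR)
		return -1;
	earlier_handler = old;
	return 0;
}
```

A real implementation would probably also restore the default action and re-raise the signal at the end of the chain; the sketch leaves that out.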

* Re: [PATCH, resend] git-commit: colored status when color.ui is set
From: Johannes Schindelin @ 2009-01-09 10:56 UTC
  To: Junio C Hamano; +Cc: markus.heidelberg, git
In-Reply-To: <7viqoo26rq.fsf@gitster.siamese.dyndns.org>

Hi,

On Fri, 9 Jan 2009, Junio C Hamano wrote:

> Markus Heidelberg <markus.heidelberg@web.de> writes:
> 
> > When using "git commit" and there was nothing to commit (the editor
> > wasn't launched), the status output wasn't colored, even though color.ui
> > was set. Only when setting color.status it worked.
> >
> > Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de>
> > ---
> >  builtin-commit.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/builtin-commit.c b/builtin-commit.c
> > index e88b78f..2d90f74 100644
> > --- a/builtin-commit.c
> > +++ b/builtin-commit.c
> > @@ -945,6 +945,9 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
> >  
> >  	git_config(git_commit_config, NULL);
> >  
> > +	if (wt_status_use_color == -1)
> > +		wt_status_use_color = git_use_color_default;
> > +
> >  	argc = parse_and_validate_options(argc, argv, builtin_commit_usage, prefix);
> >  
> >  	index_file = prepare_index(argc, argv, prefix);
> 
> My first reaction was:
> 
> 	When the editor does get launched, what would the new code do with
> 	your patch?  Would we see bunch of escape codes in the editor now?
> 
> But we do disable color explicitly when we generate contents to feed the
> editor in that case since bc5d248 (builtin-commit: do not color status
> output shown in the message template, 2007-11-18), so that fear is
> unfounded.

I had the same reaction, so I would like to see this reasoning in the 
commit message.

Ciao,
Dscho
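
The quoted hunk relies on -1 meaning "not configured". A tiny sketch of that fallback rule, with a hypothetical function name and plain ints standing in for wt_status_use_color and git_use_color_default:

```c
/*
 * Return the effective color setting: the command-specific value if it
 * was configured (0 or 1), otherwise the general default.
 * -1 means "not set", as in the quoted patch.
 */
int resolve_use_color(int command_setting, int general_default)
{
	if (command_setting == -1)
		return general_default;
	return command_setting;
}
```

An explicit color.status value therefore still wins over color.ui; only the unset case changes.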

* Re: [RFC PATCH] make diff --color-words customizable
From: Johannes Schindelin @ 2009-01-09 11:15 UTC
  To: Thomas Rast; +Cc: git
In-Reply-To: <200901090151.10880.trast@student.ethz.ch>

Hi,

On Fri, 9 Jan 2009, Thomas Rast wrote:

> Johannes Schindelin wrote:
> > On Fri, 9 Jan 2009, Thomas Rast wrote:
> > 
> > > Allows for user-configurable word splits when using --color-words. This 
> > > can make the diff more readable if the regex is configured according to 
> > > the language of the file.
> > > 
> > > For now the (POSIX extended) regex must be set via the environment
> > > GIT_DIFF_WORDS_REGEX.  Each (non-overlapping) match of the regex is
> > > considered a word.  Any characters not matched are considered
> > > whitespace.  For example, for C try
> > > 
> > >   GIT_DIFF_WORDS_REGEX='[0-9]+|[a-zA-Z_][a-zA-Z0-9_]*|(\+|-|&|\|){1,2}|\S'
> [...]
> > Interesting idea.  However, I think it would be better to do the opposite, 
> > have _word_ patterns.  And even better to have _one_ pattern.
> 
> I'm not sure I understand.  It _is_ a single pattern.  The examples
> just have several cases to distinguish various semantic groups that
> can occur, as a sort of "half tokenizer".  (The C example isn't very
> complete however.)

Oh, I was fooled by your use of an array of enums whose purpose I did not 
understand at all.

> > BTW I think you could do what you intended to do with a _way_ smaller 
> > and more intuitive patch.
> 
> How?

Intuitively, all you would have to do is to replace this part in 
diff_words_show()

        for (i = 0; i < minus.size; i++)
                if (isspace(minus.ptr[i]))
                        minus.ptr[i] = '\n';

by a loop finding the next word boundary.  I would suggest making that a 
function, say,

	int find_word_boundary(struct diff_words_data *data, char *minus);

This function would also be responsible to initialize the regexp.

However, as I said, I think it would be much more intuitive to 
characterize the _words_ instead of the _word boundaries_.

And I would like to keep the default as-is (together _with_ the 
performance.  IOW if the user did not specify a regexp, it should fall 
back to what it does now, which is slow enough).

Ciao,
Dscho
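
As background for the loop quoted above: the existing --color-words trick is to turn every whitespace byte into a newline, so that a line-based diff ends up comparing words. A self-contained sketch of that idea follows; words_to_lines() is a hypothetical name, and unlike the code in diff.c it returns a NUL-terminated copy.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'ed, NUL-terminated copy of text[0..len) in which every
 * whitespace byte has been replaced by '\n'.  Returns NULL on allocation
 * failure.  The caller frees the result.
 */
char *words_to_lines(const char *text, size_t len)
{
	char *copy = malloc(len + 1);
	size_t i;

	if (!copy)
		return NULL;
	memcpy(copy, text, len);
	copy[len] = '\0';
	for (i = 0; i < len; i++)
		if (isspace((unsigned char)copy[i]))
			copy[i] = '\n';
	return copy;
}
```

Because only whitespace is overwritten, no word content is lost, which is what makes the in-place replacement safe in the default case.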

* Re: [RFC PATCH] make diff --color-words customizable
From: Johannes Schindelin @ 2009-01-09 11:18 UTC
  To: Jeff King; +Cc: Thomas Rast, git
In-Reply-To: <20090109095300.GA4099@coredump.intra.peff.net>

Hi,

On Fri, 9 Jan 2009, Jeff King wrote:

> On Fri, Jan 09, 2009 at 01:05:05AM +0100, Thomas Rast wrote:
> 
> > Apart from possible bugs, the main issue is: where should I put the 
> > configuration for this?
> 
> It's a per-file thing, so probably in the diff driver that is triggered 
> via attributes. See userdiff.[ch]; you'll need to add an entry to the 
> userdiff_driver struct. You can look at the funcname pattern stuff as a 
> template, as this is very similar.

I am not sure I would want that in the config or the attributes.  For me, 
it always has been a question of "oh, that LaTeX diff looks ugly, let's 
see what words actually changed".

Only rarely did I wish for a different word boundary detection algorithm.

So I'd rather have an alias than a config/attribute setting.

Ciao,
Dscho

* Re: [RFC PATCH] make diff --color-words customizable
From: Jeff King @ 2009-01-09 11:22 UTC
  To: Johannes Schindelin; +Cc: Thomas Rast, git
In-Reply-To: <alpine.DEB.1.00.0901091215590.30769@pacific.mpi-cbg.de>

On Fri, Jan 09, 2009 at 12:18:37PM +0100, Johannes Schindelin wrote:

> > It's a per-file thing, so probably in the diff driver that is triggered 
> > via attributes. See userdiff.[ch]; you'll need to add an entry to the 
> > userdiff_driver struct. You can look at the funcname pattern stuff as a 
> > template, as this is very similar.
> 
> I am not sure I would want that in the config or the attributes.  For me, 
> it always has been a question of "oh, that LaTeX diff looks ugly, let's 
> see what words actually changed".
> 
> Only rarely did I wish for a different word boundary detection algorithm.
> 
> So I'd rather have an alias than a config/attribute setting.

I am not sure what you are saying.

If it is "I do not want color-words on by default for LaTeX", then I
agree. I meant merely that _if_ color-words is enabled, the word
boundaries would be taken from the diff driver config (just like we do
for matching the funcname header).

If it is "I want to specify the color-words boundary on a per-run basis
rather than a per-file basis", then I want the opposite. However, there
is no reason that both cannot be supported (with command line or
environment taking precedence over what's in the config).

-Peff
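
The precedence suggested above (per-run settings override the per-file diff driver) can be sketched as a simple lookup. The function name is hypothetical, and ranking the command line above the environment is an assumption; the message only says that both take precedence over the config.

```c
#include <stddef.h>

/*
 * Pick the word regex to use.  Each argument is NULL when that source
 * did not provide one.  Returns NULL if no source did, meaning "fall
 * back to plain whitespace splitting".
 */
const char *pick_word_regex(const char *from_command_line,
			    const char *from_environment,
			    const char *from_diff_driver)
{
	if (from_command_line)
		return from_command_line;
	if (from_environment)
		return from_environment;
	return from_diff_driver;
}
```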

* [ILLUSTRATION PATCH] color-words: take an optional regular expression describing words
From: Johannes Schindelin @ 2009-01-09 11:59 UTC
  To: Thomas Rast; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0901091202250.30769@pacific.mpi-cbg.de>


In some applications, words are not delimited by white space.  To
allow for that, you can specify a regular expression describing
what makes a word with

	git diff --color-words='^[A-Za-z0-9]*'

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Fri, 9 Jan 2009, Johannes Schindelin wrote:

	> Intuitively, all you would have to do is to replace this part in 
	> diff_words_show()
	> 
	>         for (i = 0; i < minus.size; i++)
	>                 if (isspace(minus.ptr[i]))
	>                         minus.ptr[i] = '\n';
	> 
	> by a loop finding the next word boundary.  I would suggest making that a 
	> function, say,
	> 
	> 	int find_word_boundary(struct diff_words_data *data, char *minus);
	> 
	> This function would also be responsible to initialize the regexp.
	> 
	> However, as I said, I think it would be much more intuitive to 
	> characterize the _words_ instead of the _word boundaries_.
	> 
	> And I would like to keep the default as-is (together _with_ the 
	> performance.  IOW if the user did not specify a regexp, it should fall 
	> back to what it does now, which is slow enough).

	And this patch does all that, and it _is_ substantially more 
	compact, as promised.

	It lacks testing, a test script and documentation, as well as 
	configurability via config and/or attributes, but that's your
	job, as I am not really _that_ interested in the feature myself.

 diff.c |   45 +++++++++++++++++++++++++++++++++++++++------
 diff.h |    1 +
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 4643ffc..c7ddb60 100644
--- a/diff.c
+++ b/diff.c
@@ -339,6 +339,7 @@ static void diff_words_append(char *line, unsigned long len,
 struct diff_words_data {
 	struct diff_words_buffer minus, plus;
 	FILE *file;
+	regex_t *word_regex;
 };
 
 static void print_word(FILE *file, struct diff_words_buffer *buffer, int len, int color,
@@ -398,6 +399,25 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
 	}
 }
 
+static int find_word_boundary(struct diff_words_data *diff_words,
+		mmfile_t *buffer, int i)
+{
+	if (i >= buffer->size)
+		return i;
+
+	if (diff_words->word_regex) {
+		regmatch_t match[1];
+		if (!regexec(diff_words->word_regex, buffer->ptr + i,
+				1, match, 0))
+			i += match[0].rm_eo;
+	}
+	else
+		while (i < buffer->size && !isspace(buffer->ptr[i]))
+			i++;
+
+	return i;
+}
+
 /* this executes the word diff on the accumulated buffers */
 static void diff_words_show(struct diff_words_data *diff_words)
 {
@@ -412,17 +432,17 @@ static void diff_words_show(struct diff_words_data *diff_words)
 	minus.size = diff_words->minus.text.size;
 	minus.ptr = xmalloc(minus.size);
 	memcpy(minus.ptr, diff_words->minus.text.ptr, minus.size);
-	for (i = 0; i < minus.size; i++)
-		if (isspace(minus.ptr[i]))
-			minus.ptr[i] = '\n';
+	for (i = 0; (i = find_word_boundary(diff_words, &minus, i))
+			< minus.size; i++)
+		minus.ptr[i] = '\n';
 	diff_words->minus.current = 0;
 
 	plus.size = diff_words->plus.text.size;
 	plus.ptr = xmalloc(plus.size);
 	memcpy(plus.ptr, diff_words->plus.text.ptr, plus.size);
-	for (i = 0; i < plus.size; i++)
-		if (isspace(plus.ptr[i]))
-			plus.ptr[i] = '\n';
+	for (i = 0; (i = find_word_boundary(diff_words, &plus, i))
+			< plus.size; i++)
+		plus.ptr[i] = '\n';
 	diff_words->plus.current = 0;
 
 	xpp.flags = XDF_NEED_MINIMAL;
@@ -461,6 +481,7 @@ static void free_diff_words_data(struct emit_callback *ecbdata)
 
 		free (ecbdata->diff_words->minus.text.ptr);
 		free (ecbdata->diff_words->plus.text.ptr);
+		free(ecbdata->diff_words->word_regex);
 		free(ecbdata->diff_words);
 		ecbdata->diff_words = NULL;
 	}
@@ -1483,6 +1504,14 @@ static void builtin_diff(const char *name_a,
 			ecbdata.diff_words =
 				xcalloc(1, sizeof(struct diff_words_data));
 			ecbdata.diff_words->file = o->file;
+			if (o->word_regex) {
+				ecbdata.diff_words->word_regex = (regex_t *)
+					xmalloc(sizeof(regex_t));
+				if (regcomp(ecbdata.diff_words->word_regex,
+						o->word_regex, REG_EXTENDED))
+					die ("Invalid regular expression: %s",
+							o->word_regex);
+			}
 		}
 		xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata,
 			      &xpp, &xecfg, &ecb);
@@ -2496,6 +2525,10 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 		DIFF_OPT_CLR(options, COLOR_DIFF);
 	else if (!strcmp(arg, "--color-words"))
 		options->flags |= DIFF_OPT_COLOR_DIFF | DIFF_OPT_COLOR_DIFF_WORDS;
+	else if (!prefixcmp(arg, "--color-words=")) {
+		options->flags |= DIFF_OPT_COLOR_DIFF | DIFF_OPT_COLOR_DIFF_WORDS;
+		options->word_regex = arg + 14;
+	}
 	else if (!strcmp(arg, "--exit-code"))
 		DIFF_OPT_SET(options, EXIT_WITH_STATUS);
 	else if (!strcmp(arg, "--quiet"))
diff --git a/diff.h b/diff.h
index 4d5a327..23cd90c 100644
--- a/diff.h
+++ b/diff.h
@@ -98,6 +98,7 @@ struct diff_options {
 
 	int stat_width;
 	int stat_name_width;
+	const char *word_regex;
 
 	/* this is set by diffcore for DIFF_FORMAT_PATCH */
 	int found_changes;
-- 
1.6.1.203.gc8be3
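
For readers unfamiliar with the POSIX regex calls the patch uses, here is a small standalone sketch of matching one word at the start of a string. match_word_at() is a hypothetical helper, not part of the patch, and it assumes a NUL-terminated input, which is what regexec() operates on.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return the length of the word that word_regex matches at the very
 * start of text, or 0 if there is no match beginning at offset 0.
 * word_regex must have been compiled without REG_NOSUB.
 */
size_t match_word_at(const regex_t *word_regex, const char *text)
{
	regmatch_t match[1];

	if (regexec(word_regex, text, 1, match, 0) != 0)
		return 0;
	if (match[0].rm_so != 0)
		return 0;
	return (size_t)match[0].rm_eo;
}
```

The rm_so check matters when the pattern is not anchored: rm_eo is an offset from the start of the string handed to regexec(), not the length of the match.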

* Re: [ILLUSTRATION PATCH] color-words: take an optional regular expression describing words
From: Thomas Rast @ 2009-01-09 12:24 UTC
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0901091255230.30769@pacific.mpi-cbg.de>

Johannes Schindelin wrote:
> 
> In some applications, words are not delimited by white space.  To
> allow for that, you can specify a regular expression describing
> what makes a word with
> 
> 	git diff --color-words='^[A-Za-z0-9]*'
[...]
> 	> Intuitively, all you would have to do is to replace this part in 
> 	> diff_words_show()
> 	> 
> 	>         for (i = 0; i < minus.size; i++)
> 	>                 if (isspace(minus.ptr[i]))
> 	>                         minus.ptr[i] = '\n';
> 	> 
> 	> by a loop finding the next word boundary.
[...]
> 	> However, as I said, I think it would be much more intuitive to 
> 	> characterize the _words_ instead of the _word boundaries_.

That doesn't work.  You cannot overwrite actual content in the strings
to be diffed with newlines.  The current --color-words exploits the
fact that we don't care about spaces anyway, so we might as well
replace them with newlines, but we _do_ care about the words and in
the regexed version, you have no guarantees about where they might start.

To wit:

  thomas@thomas:~/tmp/foo(master)$ cat >foo
  foo_bar_baz
  quux
  thomas@thomas:~/tmp/foo(master)$ git add foo
  thomas@thomas:~/tmp/foo(master)$ git ci -m initial
  [master (root-commit)]: created f110c6c: "initial"
   1 files changed, 2 insertions(+), 0 deletions(-)
   create mode 100644 foo
  thomas@thomas:~/tmp/foo(master)$ cat >foo
  foo_
  ar_
  az
  quux
  thomas@thomas:~/tmp/foo(master)$ git diff
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  -foo_bar_baz
  +foo_
  +ar_
  +az
   quux
  thomas@thomas:~/tmp/foo(master)$ git diff --color-words
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  foo_bar_bafoo_
  ar_
  az
  quux
  thomas@thomas:~/tmp/foo(master)$ git diff --color-words='[a-zA-Z]+_?'
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  quux

Even without the colours, you can see that it has a blind spot for
changes around a newline.  Perhaps there is an easier way to remember
them, but we definitely cannot *forget* about the word boundaries.

That being said, even though my patch correctly sees the changes, the
above test case also exposes some sort of string overrun :-(

> 	> And I would like to keep the default as-is (together _with_ the 
> 	> performance.  IOW if the user did not specify a regexp, it should fall 
> 	> back to what it does now, which is slow enough).

That's definitely a valid request.

I'll come up with a fixed patch, and probably make it both
funcname-like (Jeff's idea) and command line configurable.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
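
One way to avoid overwriting content, sketched under assumptions: copy each regex match into a separate buffer, one word per line, and diff those buffers instead. This is not the code from either patch; words_one_per_line() is a hypothetical name and the input is assumed to be NUL-terminated.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'ed string holding every non-overlapping match of
 * word_regex in text, each followed by '\n'.  Bytes that are not part
 * of any match are dropped.  Returns NULL on allocation failure.
 */
char *words_one_per_line(const regex_t *word_regex, const char *text)
{
	/* Worst case: every byte is a one-byte word plus its newline. */
	char *out = malloc(2 * strlen(text) + 1);
	size_t used = 0;
	int eflags = 0;
	regmatch_t m[1];

	if (!out)
		return NULL;
	while (*text && regexec(word_regex, text, 1, m, eflags) == 0) {
		size_t word_len = (size_t)(m[0].rm_eo - m[0].rm_so);

		if (word_len == 0) {
			text++;		/* skip empty matches */
		} else {
			memcpy(out + used, text + m[0].rm_so, word_len);
			used += word_len;
			out[used++] = '\n';
			text += m[0].rm_eo;
		}
		/* Later calls no longer start at the beginning of a line. */
		eflags = REG_NOTBOL;
	}
	out[used] = '\0';
	return out;
}
```

With the pattern from the transcript, "foo_bar_baz" becomes the lines foo_, bar_, baz while the new content becomes foo_, ar_, az, so the missing "b" characters stay visible to the diff.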


* Re: [ILLUSTRATION PATCH] color-words: take an optional regular expression describing words
From: Teemu Likonen @ 2009-01-09 13:05 UTC
  To: Thomas Rast; +Cc: Johannes Schindelin, git
In-Reply-To: <200901091324.40583.trast@student.ethz.ch>

Thomas Rast (2009-01-09 13:24 +0100) wrote:

> Johannes Schindelin wrote:
>> > And I would like to keep the default as-is (together _with_ the
>> > performance. IOW if the user did not specify a regexp, it should
>> > fall back to what it does now, which is slow enough).
>
> That's definitely a valid request.

I agree with that too. A good thing about the current --color-words is
that it automatically works with UTF-8 encoded text. This is _very_
important as --color-words is usually the best diff tool for
human-language texts.
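
The UTF-8 point can be illustrated with a short sketch: every byte of a multi-byte UTF-8 sequence has its high bit set, so splitting on ASCII whitespace never cuts a character in half. The function names are hypothetical, and the whitespace test is deliberately ASCII-only rather than locale-dependent.

```c
#include <stddef.h>

/* ASCII whitespace only: space, \t, \n, \v, \f, \r. */
static int is_ascii_space(unsigned char c)
{
	return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Count the whitespace-separated words in a NUL-terminated string. */
size_t count_whitespace_separated_words(const char *text)
{
	size_t words = 0;
	int in_word = 0;

	for (; *text; text++) {
		if (is_ascii_space((unsigned char)*text)) {
			in_word = 0;
		} else if (!in_word) {
			in_word = 1;
			words++;
		}
	}
	return words;
}
```

A byte-oriented regex offers no such guarantee unless the pattern is written with multi-byte sequences in mind.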

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Johannes Schindelin @ 2009-01-09 13:07 UTC
  To: Junio C Hamano
  Cc: Adeodato Simó, Linus Torvalds, Clemens Buchacher,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <7v7i552clz.fsf@gitster.siamese.dyndns.org>

Hi,

On Thu, 8 Jan 2009, Junio C Hamano wrote:

> If we find the "common" context lines that have only blank and 
> punctuation letters in Dscho output, turn each of them into "-" and "+", 
> and rearrange them so that all "-" are together followed by "+", it will 
> match Bzr output.

So we'd need something like this (I still think we should treat curly 
brackets the same as punctuation, and for good measure I just handled 
everything that is not alphanumerical the same):

-- snipsnap --
[TOY PATCH] Add diff option '--collapse-non-alnums'

With the option --collapse-non-alnums, there will be no interhunks
consisting solely of non-alphanumerical letters.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diff.c         |    2 ++
 xdiff/xdiff.h  |    1 +
 xdiff/xdiffi.c |   48 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/diff.c b/diff.c
index c7ddb60..4b387fb 100644
--- a/diff.c
+++ b/diff.c
@@ -2503,6 +2503,8 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 		options->xdl_opts |= XDF_IGNORE_WHITESPACE_AT_EOL;
 	else if (!strcmp(arg, "--patience"))
 		options->xdl_opts |= XDF_PATIENCE_DIFF;
+	else if (!strcmp(arg, "--collapse-non-alnums"))
+		options->xdl_opts |= XDF_COLLAPSE_NON_ALNUMS;
 
 	/* flags options */
 	else if (!strcmp(arg, "--binary")) {
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 4da052a..a444f9a 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -33,6 +33,7 @@ extern "C" {
 #define XDF_IGNORE_WHITESPACE_CHANGE (1 << 3)
 #define XDF_IGNORE_WHITESPACE_AT_EOL (1 << 4)
 #define XDF_PATIENCE_DIFF (1 << 5)
+#define XDF_COLLAPSE_NON_ALNUMS (1 << 6)
 #define XDF_WHITESPACE_FLAGS (XDF_IGNORE_WHITESPACE | XDF_IGNORE_WHITESPACE_CHANGE | XDF_IGNORE_WHITESPACE_AT_EOL)
 
 #define XDL_PATCH_NORMAL '-'
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 3e97462..b8e7ee8 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -396,6 +396,50 @@ static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1,
 	return xch;
 }
 
+static int xdl_record_contains_alnum(xrecord_t *record)
+{
+	long i;
+	for (i = 0; i < record->size; i++)
+		if (isalnum(record->ptr[i]))
+			return 1;
+	return 0;
+}
+
+static int xdl_collapse_non_alnum(xdfile_t *xdf, xdfile_t *xdfo)
+{
+	long ix, ixo, len = 0;
+
+	/*
+	 * Collapse all interhunk parts consisting solely of non-alnum
+	 * characters into the hunks.
+	 */
+	for (ix = 0, ixo = 0; ix < xdf->nrec && ixo < xdfo->nrec; ix++, ixo++) {
+		if (xdf->rchg[ix] == 1 || xdfo->rchg[ixo] == 1) {
+			/* collapse non-alnum interhunks */
+			while (len > 0) {
+				xdf->rchg[ix - len] = 1;
+				xdfo->rchg[ixo - len] = 1;
+				len--;
+			}
+
+			/* look for end of hunk */
+			while (ix < xdf->nrec && xdf->rchg[ix] == 1)
+				ix++;
+			while (ixo < xdfo->nrec && xdfo->rchg[ixo] == 1)
+				ixo++;
+			if (ix >= xdf->nrec)
+				return 0;
+			len = !xdl_record_contains_alnum(xdf->recs[ix]);
+		}
+		else if (len > 0) {
+			if (xdl_record_contains_alnum(xdf->recs[ix]))
+				len = 0;
+			else
+				len++;
+		}
+	}
+	return 0;
+}
 
 int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 	long ix, ixo, ixs, ixref, grpsiz, nrec = xdf->nrec;
@@ -548,7 +592,9 @@ int xdl_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
 
 		return -1;
 	}
-	if (xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
+	if (((xpp->flags & XDF_COLLAPSE_NON_ALNUMS) &&
+	     xdl_collapse_non_alnum(&xe.xdf1, &xe.xdf2)) ||
+	    xdl_change_compact(&xe.xdf1, &xe.xdf2, xpp->flags) < 0 ||
 	    xdl_change_compact(&xe.xdf2, &xe.xdf1, xpp->flags) < 0 ||
 	    xdl_build_script(&xe, &xscr) < 0) {
 
-- 
1.6.1.203.gc8be3
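
The idea behind the toy patch can be shown on a simplified model: a single sequence of lines with a "changed" flag per line, instead of xdiff's two files with separate rchg arrays. This sketch is an assumption-laden illustration, not a rewrite of the patch; both function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if the line contains at least one alphanumeric character. */
int line_has_alnum(const char *line)
{
	for (; *line; line++)
		if (isalnum((unsigned char)*line))
			return 1;
	return 0;
}

/*
 * Mark as changed every run of unchanged lines that sits between two
 * changed lines and contains no alphanumeric characters at all.
 */
void collapse_non_alnum_runs(const char **lines, char *changed, size_t n)
{
	size_t i = 0;

	while (i < n) {
		size_t start, j;
		int no_alnum = 1;

		if (changed[i]) {
			i++;
			continue;
		}
		start = i;
		while (i < n && !changed[i]) {
			if (line_has_alnum(lines[i]))
				no_alnum = 0;
			i++;
		}
		/* Only collapse runs with a changed line on both sides. */
		if (no_alnum && start > 0 && i < n)
			for (j = start; j < i; j++)
				changed[j] = 1;
	}
}
```

Runs at the very start or end of the input are left alone, since they do not separate two hunks.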

* Re: Comments on Presentation Notes Request.
From: Jakub Narebski @ 2009-01-09 13:50 UTC
  To: Tim Visher; +Cc: git
In-Reply-To: <c115fd3c0901061433i78bf3b26v77e5981aada6728e@mail.gmail.com>

"Tim Visher" <tim.visher@gmail.com> writes:

> Hello Everyone,
> 
> I'm putting together a little 15 minute presentation for my company
> regarding SCMSes in an attempt to convince them to at the very least
> use a Distributed SCMS and at best to use git.  I put together all my
> notes, although I didn't put together the actual presentation yet.  I
> figured I'd post them here and maybe get some feedback about it.  Let
> me know what you think.
> 
> Thanks in advance!

Take a look at the following links:
 * "Understanding Version-Control Systems (DRAFT)" by Eric Raymond
   http://www.catb.org/esr/writings/version-control/version-control.html
 * "Version Control Habits of Effective Developers" at The Daily Build
   http://blog.bstpierre.org/version-control-habits

Note that the first one is DRAFT; on the other hand it explains
lock-edit, merge-then-commit, and commit-then-merge workflows quite
well, and has a host of links.
   
-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: 1.5.6.5 fails to clone git.kernel.org/[...]/rostedt/linux-2.6-rt
From: Miklos Vajna @ 2009-01-09 14:08 UTC
  To: Tim Shepard; +Cc: git
In-Reply-To: <E1LLAn5-0001JM-00@alva.home>

On Fri, Jan 09, 2009 at 01:24:19AM -0500, Tim Shepard <shep@alum.mit.edu> wrote:
> 
> 
> I have git 1.5.6.5 installed from the Debian/lenny package.
> 
> Poking around in http://git.kernel.org/ looking for a git repository
> that might have the latest -rt development happening, I found
> 
>   http://git.kernel.org/?p=linux/kernel/git/rostedt/linux-2.6-rt.git
> 
> which looked promising.
> 
> But when I tried cloning it using:
> 
>     git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-rt.git linux-2.6-rt

I would use the git:// link from gitweb.

* Re: [PATCH] allow 8bit data in email body sent by send-email
From: Andre Przywara @ 2009-01-09 14:16 UTC
  To: Jeff King; +Cc: git, Andre Przywara
In-Reply-To: <20090109072814.GA21180@coredump.intra.peff.net>

Jeff King wrote:
>> when sending patch files via git send-email, the perl script assumes
>> 7bit characters only. If there are other bytes in the body (foreign language
>> characters in names or translations), some servers (like vger.kernel.org)
>> reject the mail because of that. This patch always adds an 8bit header line
>> to each mail.
> 
> This should be done already by git-format-patch when you generate the
> patch to feed to send-email.
Well, this could be discussed, after all the problem lies in the actual 
transportation, which should be the responsibility of git-send-email. 
But I am OK with putting this into format-patch.
> What exactly is the workflow you use to generate this problem?
I use git format-patch to generate a patch file for a single-mail patch 
(not a patch series). Then I edit this file manually to add questions 
and comments and include my signature. During this step the umlauts came 
in. If you have a suggestion to improve this workflow, I am all ears, I 
am fairly new to git.
> Does it matter where the non-ascii characters are
> (commit versus patch, etc)?
Oh, right you are. If there are 8bit characters in the commit message, 
git-format-patch adds the appropriate headers.
> What version of git are you using?
Version 1.5.5 on one machine and 1.5.2.2 on another. I know, I know ;-) 
but I haven't had time to compile a newer one yet.

> 
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index 77ca8fe..68a462c 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -793,6 +793,7 @@ To: $to${ccline}
>>  Subject: $subject
>>  Date: $date
>>  Message-Id: $message_id
>> +Content-Transfer-Encoding: 8bit
>>  X-Mailer: git-send-email $gitversion
>>  ";
> 
> This fix isn't right anyway. For one thing, if you're going to include
> C-T-E, you should also include a MIME-Version header. But more
> importantly, we are already handling encoding elsewhere. So
> unconditionally adding this means that you may conflict with existing
> MIME headers in the @xh variable.

Ok, so what about adding a flag to git-format-patch that forces the 8bit 
headers on? I think a workaround would be to add a --subject-prefix with 
a special character and later remove this, but this is not really a 
long-term solution ;-)

Thanks and regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Jochen Polster; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

* Re: [PATCH] allow 8bit data in email body sent by send-email
From: Jeff King @ 2009-01-09 14:44 UTC
  To: Andre Przywara; +Cc: git
In-Reply-To: <49675C38.8060208@amd.com>

On Fri, Jan 09, 2009 at 03:16:24PM +0100, Andre Przywara wrote:

>> This should be done already by git-format-patch when you generate the
>> patch to feed to send-email.
> Well, this could be discussed, after all the problem lies in the actual  
> transportation, which should be the responsibility of git-send-email. But 
> I am OK with putting this into format-patch.

I didn't mean "this functionality should go into format-patch" but
rather "this functionality is _already_ in format-patch, and it should
have been triggered".

The reason it has to be in format-patch is that only format-patch knows
what the correct encoding is. It's not that useful to just say "oh, this
is some 8-bit data." You also want to give a content-type header that
specifies the correct encoding. And anything that contains non-ascii
characters should come out of format-patch with such a header.

> > What exactly is the workflow you use to generate this problem?
> I use git format-patch to generate a patch file for a single-mail patch  
> (not a patch series). Then I edit this file manually to add questions and 
> comments and include my signature. During this step the umlauts came in. 
> If you have a suggestion to improve this workflow, I am all ears, I am 
> fairly new to git.

Ah, I see. I'm not sure what the best solution is there. send-email has
intentionally been kept pretty dumb, because implementing full MUA
behavior would make it pretty unwieldy. You could add an option to
send-email to add the 8-bit transfer-encoding header if necessary, but
it will have to guess at (or be configured to know) the correct encoding
of the characters.

Personally, when I want to add information like that to a patch, I pull
the output of format-patch into my MUA (mutt, in my case). I don't know
if that is a workable solution for you.

> Ok, so what about adding a flag to git-format-patch that forces the 8bit  
> headers on? I think a workaround would be to add a --subject-prefix with  
> a special character and later remove this, but this is not really a  
> long-term solution ;-)

Now that you've explained your workflow, I do think send-email is a more
appropriate place to add a header, since format-patch never even sees
the data that is causing the problem. Probably the sanest thing would be
to check each input file for non-ascii characters. If they are found,
and the message does not already have some MIME headers, then add an
8bit content-transfer-encoding and a text/plain content-type. In the
latter, you would need to specify some encoding. Most of git defaults
to utf-8, but it should probably be configurable.

We have to do a similar thing for the --compose option, so looking at
what that does is probably a good starting point.
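
The check described above can be sketched roughly as follows. This is a
minimal illustration in Python, not the send-email implementation (which is
Perl); the function names and the choice to scan only the body are
assumptions made for the example, and the default charset of utf-8 simply
follows the suggestion above.

```python
# Hypothetical helper names; a sketch of the proposed check, not git code.
MIME_HEADERS = (b"mime-version:", b"content-type:", b"content-transfer-encoding:")


def has_non_ascii(data: bytes) -> bool:
    """Return True if any byte is outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in data)


def add_8bit_headers(raw: bytes, charset: str = "utf-8") -> bytes:
    """Add 8bit MIME headers when the body is non-ASCII and none exist yet."""
    # Headers and body are separated by the first blank line.
    head, sep, body = raw.partition(b"\n\n")
    header_lines = head.split(b"\n")

    # Leave the message alone if it already carries MIME headers.
    already_mime = any(
        line.lower().startswith(MIME_HEADERS) for line in header_lines
    )
    if already_mime or not has_non_ascii(body):
        return raw

    extra = [
        b"MIME-Version: 1.0",
        f"Content-Type: text/plain; charset={charset}".encode("ascii"),
        b"Content-Transfer-Encoding: 8bit",
    ]
    return b"\n".join(header_lines + extra) + sep + body
```

The charset still has to be guessed or configured, as noted above; the
sketch only decides whether headers are needed at all.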

-Peff

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Adeodato Simó @ 2009-01-09 15:59 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Linus Torvalds, Clemens Buchacher,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.DEB.1.00.0901091405460.30769@pacific.mpi-cbg.de>

* Johannes Schindelin [Fri, 09 Jan 2009 14:07:28 +0100]:

> > If we find the "common" context lines that have only blank and 
> > punctuation letters in Dscho output, turn each of them into "-" and "+", 
> > and rearrange them so that all "-" are together followed by "+", it will 
> > match Bzr output.

> So we'd need something like this (I still think we should treat curly 
> brackets the same as punctuation, and for good measure I just handled 
> everything that is not alphanumerical the same):

Nice. With this patch of yours, --patience --collapse-non-alnums
produces the same output as bzr for this last test case (the util_sock.c
one). However, also for this last case, without --patience, diff
--collapse-non-alnums finds *no* common lines at all. Mentioning in case
you'd be interested in knowing.
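
For readers following along, the line classification being discussed
("everything that is not alphanumerical") can be sketched as below. This is
only an illustration in Python with a hypothetical function name; it is not
the code from the toy patch.

```python
def is_non_alnum_line(line: str) -> bool:
    """Return True if the line has no letters or digits at all.

    Blank lines, lone braces and punctuation-only lines all qualify.
    """
    return not any(ch.isalnum() for ch in line)
```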

Cheers,

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
- You look beaten.
- I just caught Tara laughing with another man.
- Are you sure they weren't just... kissing or something?
- No, they were laughing.
                -- Denny Crane and Alan Shore

^ permalink raw reply

* Re: [PATCH 1/2] bash completion: Add '--intent-to-add' long option for 'git add'
From: Lee Marlow @ 2009-01-09 16:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn O. Pearce
In-Reply-To: <20081210194131.GB11928@spearce.org>

On Wed, Dec 10, 2008 at 12:41 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Lee Marlow <lee.marlow@gmail.com> wrote:
>> Signed-off-by: Lee Marlow <lee.marlow@gmail.com>
>> ---
>>  contrib/completion/git-completion.bash |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> Trivially-Acked-by: Shawn O. Pearce <spearce@spearce.org>
>

Bump :)

^ permalink raw reply

* Re: [PATCH 2/2] bash completion: Use 'git add' completions for 'git stage'
From: Lee Marlow @ 2009-01-09 16:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn O. Pearce
In-Reply-To: <20081210195957.GE11928@spearce.org>

On Wed, Dec 10, 2008 at 12:59 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Lee Marlow <lee.marlow@gmail.com> wrote:
>> Signed-off-by: Lee Marlow <lee.marlow@gmail.com>
>> ---
>>  contrib/completion/git-completion.bash |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> Also,
>
> Trivially-Acked-by: Shawn O. Pearce <spearce@spearce.org>

Nudge

^ permalink raw reply

* Re: [PATCH, resend] git-commit: colored status when color.ui is set
From: Markus Heidelberg @ 2009-01-09 16:24 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git
In-Reply-To: <alpine.DEB.1.00.0901091155210.30769@pacific.mpi-cbg.de>

Johannes Schindelin, 09.01.2009:
> Hi,
> 
> On Fri, 9 Jan 2009, Junio C Hamano wrote:
> 
> > Markus Heidelberg <markus.heidelberg@web.de> writes:
> > > When using "git commit" and there was nothing to commit (the editor
> > > wasn't launched), the status output wasn't colored, even though color.ui
> > > was set. Only when setting color.status it worked.
> > 
> > My first reaction was:
> > 
> > 	When the editor does get launched, what would the new code do with
> > 	your patch?  Would we see bunch of escape codes in the editor now?

Of course I tested this case :)

> > But we do disable color explicitly when we generate contents to feed the
> > editor in that case since bc5d248 (builtin-commit: do not color status
> > output shown in the message template, 2007-11-18), so that fear is
> > unfounded.
> 
> I had the same reaction, so I would like to see this reasoning in the 
> commit message.

wt_status_use_color could have already been set during git_config()
without this patch, just not with color.ui, but color.status, so this is
not really a new case. But I can understand the reaction and have no
objections to including this explanation in the commit message.

Markus

^ permalink raw reply

* [PATCH v2] t7501-commit.sh: explicitly check that -F prevents invoking the editor
From: Adeodato Simó @ 2009-01-09 17:30 UTC (permalink / raw)
  To: git, gitster; +Cc: Johannes Schindelin, Adeodato Simó
In-Reply-To: <alpine.DEB.1.00.0812301250210.30769@pacific.mpi-cbg.de>

The "--signoff" test case in t7500-commit.sh was setting VISUAL while
using -F -, which indeed tested that the editor is not spawned with -F.
However, having it there was confusing, since there was no obvious reason
to the casual reader for it to be there.

This commit removes the setting of VISUAL from the --signoff test and
adds a dedicated test case to t7501-commit.sh, where the rest of the
tests for -F are.

Signed-off-by: Adeodato Simó <dato@net.com.org.es>
---
* Johannes Schindelin [Tue, 30 Dec 2008 13:04:46 +0100]:

> Hmm.  Obviously, I failed to document properly why I tested the editor, 
> but I think it makes sense to assume that -F still triggered an 
> interactive editor at some stage in the development of builtin commit.

> I do not have anything against separating that issue into another test 
> case, but I am strongly opposed to simply removing it.

Ok, I've moved it to a separate test case, please review to see if you
approve of it.

Thanks,

 t/t7500-commit.sh |    5 +----
 t/t7501-commit.sh |   20 ++++++++++++++++++++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/t/t7500-commit.sh b/t/t7500-commit.sh
index 6e18a96..5998baf 100755
--- a/t/t7500-commit.sh
+++ b/t/t7500-commit.sh
@@ -149,10 +149,7 @@ EOF
 
 test_expect_success '--signoff' '
 	echo "yet another content *narf*" >> foo &&
-	echo "zort" | (
-		test_set_editor "$TEST_DIRECTORY"/t7500/add-content &&
-		git commit -s -F - foo
-	) &&
+	echo "zort" | git commit -s -F - foo &&
 	git cat-file commit HEAD | sed "1,/^$/d" > output &&
 	test_cmp expect output
 '
diff --git a/t/t7501-commit.sh b/t/t7501-commit.sh
index 63bfc6d..b4e2b4d 100755
--- a/t/t7501-commit.sh
+++ b/t/t7501-commit.sh
@@ -127,6 +127,26 @@ test_expect_success \
 	"showing committed revisions" \
 	"git rev-list HEAD >current"
 
+cat >editor <<\EOF
+#!/bin/sh
+sed -e "s/good/bad/g" < "$1" > "$1-"
+mv "$1-" "$1"
+EOF
+chmod 755 editor
+
+cat >msg <<EOF
+A good commit message.
+EOF
+
+test_expect_success \
+	'editor not invoked if -F is given' '
+	 echo "moo" >file &&
+	 VISUAL=./editor git commit -a -F msg &&
+	 git show -s --pretty=format:"%s" | grep -q good &&
+	 echo "quack" >file &&
+	 echo "Another good message." | VISUAL=./editor git commit -a -F - &&
+	 git show -s --pretty=format:"%s" | grep -q good
+	 '
 # We could just check the head sha1, but checking each commit makes it
 # easier to isolate bugs.
 
-- 
1.6.1.134.g55c35

^ permalink raw reply related

* git-svn: File was not found in commit
From: Morgan Christiansson @ 2009-01-09 17:19 UTC (permalink / raw)
  To: git

Hi, I'm trying to "git svn fetch" my repository from a local file:///
repo and I'm running into this problem:

$ git svn init -t tags -b branches -T trunk file:///path/to/svn/repo
$ git svn fetch
branches/rails/rails/vendor/plugins/acts_as_xapian/.git/refs/heads/master
was not found in commit a643e882c557593f36bb9fd0966490010b9dba61 (r10576)


I found another report that seems to describe the same error:
http://marc.info/?l=git&m=121537767308135&w=2
Looking at the history, the file is committed in r10577 but git-svn looks
for it in r10576, so it seems to be off by one revision number, exactly
like the other report.
I've tried the latest git version of git-svn.perl and the problem is not
fixed there.


$ svn log file:///path/to/repo -r10576:10577 -v
------------------------------------------------------------------------
r10576 | morgan | 2008-11-28 14:35:53 +0000 (Fri, 28 Nov 2008) | 3 lines
Changed paths:
   A /branches/rails/rails/app/controllers/browse_sheetmusic_controller.rb
   M /branches/rails/rails/app/controllers/scores_controller.rb
   M /branches/rails/rails/app/models/composer.rb
   M /branches/rails/rails/app/models/score.rb
   M /branches/rails/rails/config/routes.rb

Commit message.

------------------------------------------------------------------------
r10577 | morgan | 2008-11-28 18:31:00 +0000 (Fri, 28 Nov 2008) | 3 lines
Changed paths:
   A /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/FETCH_HEAD
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/config
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/index
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/logs/HEAD
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/logs/refs/heads/master  # <-- THIS FILE
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/logs/refs/remotes/origin/HEAD
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/logs/refs/remotes/origin/master
   A /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/objects/pack/pack-41ebdff27c581340ac7a71850e2e3a7d1cfea138.idx
   A /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/objects/pack/pack-41ebdff27c581340ac7a71850e2e3a7d1cfea138.pack
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/refs/heads/master
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/.git/refs/remotes/origin/master
   A /branches/rails/rails/vendor/plugins/acts_as_xapian/README.textile
   M /branches/rails/rails/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb

Switched repo to git://github.com/Overbryd/acts_as_xapian.git

------------------------------------------------------------------------




I did some digging in the perl script and managed to generate this stack
trace. It shows that gs_do_update is called with $rev_a=10576 and
$rev_b=10577; the file is in $rev_b, but git-svn complains that it is not
found in $rev_a.

SVN::Git::Fetcher::open_file('SVN::Git::Fetcher=HASH(0x25faf38)',
'branches/rails/rails/vendor/plugins/acts_as_xapian/.git/refs/heads/master', 

'HASH(0x25fdb00)', 10576, '_p_apr_pool_t=SCALAR(0x24f8978)') called at
/usr/lib/perl5/SVN/Ra.pm line 623
SVN::Ra::Reporter::AUTOLOAD('SVN::Ra::Reporter=ARRAY(0x24f8948)',
'SVN::Pool=REF(0x24f8528)') called at ../git-svn.perl line 4087
Git::SVN::Ra::gs_do_update('Git::SVN::Ra=HASH(0x24beac8)', 10576, 10577,
'Git::SVN=HASH(0x24f7d18)', 'SVN::Git::Fetcher=HASH(0x25faf38)') called
at ../git-svn.perl line 2481
Git::SVN::do_fetch('Git::SVN=HASH(0x24f7d18)', 'HASH(0x24c01f0)', 10577)
called at ../git-svn.perl line 4227
Git::SVN::Ra::gs_fetch_loop_common('Git::SVN::Ra=HASH(0x24beac8)',
10575, 10724, 'ARRAY(0x1da1c20)', 'ARRAY(0x1da1c50)') called at
../git-svn.perl line 1506
Git::SVN::fetch_all('svn', 'HASH(0x21d6440)') called at ../git-svn.perl
line 387
main::cmd_fetch at ../git-svn.perl line 268
eval {...} at ../git-svn.perl line 266
branches/rails/rails/vendor/plugins/acts_as_xapian/.git/refs/heads/master
was not found in commit a643e882c557593f36bb9fd0966490010b9dba61
(r10576) at ../git-svn.perl line 3271.


I'm not sure whether this is correct behavior or not and I'm not
familiar with SVN::Ra::Reporter... so some help would be appreciated.

Thanks,
Morgan

^ permalink raw reply

* Curious about details of optimization of object database...
From: chris @ 2009-01-09 17:46 UTC (permalink / raw)
  To: git

I'm told a commit is *not* a patch (diff), but, rather a copy of the entire
tree.

Can anyone say, in a few sentences, how git avoids needing to keep multiple
slightly different copies of entire files without just storing lots of
patches/diffs?

cs

^ permalink raw reply

* Re: Curious about details of optimization of object database...
From: David Brown @ 2009-01-09 17:56 UTC (permalink / raw)
  To: chris; +Cc: git
In-Reply-To: <20090109174623.GC12552@seberino.org>

On Fri, Jan 09, 2009 at 09:46:23AM -0800, chris@seberino.org wrote:
>I'm told a commit is *not* a patch (diff), but, rather a copy of the entire
>tree.
>
>Can anyone say, in a few sentences, how git avoids needing to keep multiple
>slightly different copies of entire files without just storing lots of
>patches/diffs?

   Documentation/technical/pack-heuristics.txt

David

^ permalink raw reply

* Re: Curious about details of optimization of object database...
From: Matthieu Moy @ 2009-01-09 17:55 UTC (permalink / raw)
  To: chris; +Cc: git
In-Reply-To: <20090109174623.GC12552@seberino.org>

chris@seberino.org writes:

> I'm told a commit is *not* a patch (diff), but, rather a copy of the entire
> tree.

Conceptually, yes. But obviously, the storage format (pack) does what
people usually call "delta-compression", which is basically storing
only the diff against another, similar object.
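
To make the idea concrete, here is a minimal sketch of a copy/insert delta
in Python. It is only an illustration of the principle: the function names
are made up for the example, difflib stands in for git's own delta search,
and the instruction list is not git's actual on-disk pack encoding.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list:
    """Describe target as 'copy from base' and 'insert new bytes' steps."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Reuse bytes that already exist in the base object.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Only bytes that are new in the target are stored literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no instruction: those base bytes are just not copied.
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the target object from the base and the delta steps."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)
```

Each commit still refers to a complete tree; the delta is purely a storage
detail, and the full object is reconstructed on demand.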

-- 
Matthieu

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Linus Torvalds @ 2009-01-09 18:09 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Adeodato Simó, Clemens Buchacher,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.DEB.1.00.0901091405460.30769@pacific.mpi-cbg.de>



On Fri, 9 Jan 2009, Johannes Schindelin wrote:
> 
> -- snipsnap --
> [TOY PATCH] Add diff option '--collapse-non-alnums'

I really don't think it should be about "alnum".

Think about languages that use "begin" and "end" instead of "{" "}".

I think we'd be better off just looking at the size, but _this_ is a 
really good area where "uniqueness" matters.

Don't combine unique lines, but lines that have the same hash as a 
thousand other lines? Go right ahead.
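
A rough sketch of that rule, in Python for illustration only: the function
name and the threshold value are hypothetical, and the stripped line text is
used as a stand-in for the per-line hash that the diff code computes.

```python
from collections import Counter


def collapsible_lines(lines, threshold=3):
    """Flag lines whose content occurs at least `threshold` times.

    Unique or rare lines are never flagged, whatever they contain.
    """
    counts = Counter(line.strip() for line in lines)
    return [counts[line.strip()] >= threshold for line in lines]
```

With this rule a lone "}" in a small file would stay significant, while
"begin"/"end" lines in a language that uses them heavily would be flagged
just like braces.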

		Linus

^ permalink raw reply

