Git development
 help / color / mirror / Atom feed
* Re: [RFC/PATCH] add diffstat information to rebase
From: Duy Nguyen @ 2016-12-25  0:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Git Mailing List, Johannes Schindelin
In-Reply-To: <20161222221327.7354-1-sbeller@google.com>

On Fri, Dec 23, 2016 at 5:13 AM, Stefan Beller <sbeller@google.com> wrote:
> *Ideally* I would rather have a different formatting, e.g. maybe:
>
>     $ git checkout remotes/origin/js/sequencer-wo-die
>     $ git rebase -i --new-magic v2.10.0-rc2^
>
> which produces:
>
>     pick d5cb9cbd64 Git 2.10-rc2                                    | Documentation/RelNo.., GIT-VERSION-GEN -..(+5, -1)
>     ...
>     pick dbfad033d4 sequencer: do not die() in do_pick_commit()     | sequencer.c - do_pick_commit              (+7, -6)
>     pick 4ef3d8f033 sequencer: lib'ify write_message()              | sequencer.c - write_message, do_pick_com..(+11, -9)
>     ...
>     ...
>     pick 88d5a271b0 sequencer: lib'ify save_opts()                  | sequencer.c - save_opts + sequencer_pick..(+20, -20)
>     pick 0e408fc347 sequencer: lib'ify fast_forward_to()            | sequencer.c - fast_forward_to             (+1, -1)
>     pick 55f5704da6 sequencer: lib'ify checkout_fast_forward()      | sequencer.c - checkout_fast_forward       (+6, -3)
>     pick 49fb937e9a sequencer: ensure to release the lock when w... | sequencer.c - read_and_refresh_cache      (+6, -3)

Instead of guessing the changes based on diffstat, you could add the
actual commit message (besides the subject line) after that '|' for
fixup! and squash! commits (and it's probably should be "#" instead of
"!"). Then you could just describe what you have changed in the commit
messages. If you don't use autosquash, you'll need some way to mark
"to-be-merged" commits though, so that you don't put all commit
messages here.
-- 
Duy

^ permalink raw reply

* Re: [PATCH] log: support 256 colors with --graph=256colors
From: Duy Nguyen @ 2016-12-25  2:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List
In-Reply-To: <xmqqtw9vg4vh.fsf@gitster.mtv.corp.google.com>

On Fri, Dec 23, 2016 at 2:06 AM, Junio C Hamano <gitster@pobox.com> wrote:
> If I were doing this, I'd just prepare a table with 32 color slots
> or so [*1*], start at a random spot (say 017:00005f) of
>
>     https://en.wikipedia.org/wiki/File:Xterm_256color_chart.svg,
>
> and pick spots by jumping southeast like a chess knight
> (i.e. 017->030->043->086->...) until the table is filled, wrapping
> around at the edge of that color chart as necessary.

If you want to play with that, [1] may help, which sorts colors in hsv
space, and you can select a few colors to see how they look, either
manually or by calculation. Yeah we probably get maybe 32 (or 48 if
you stretch it a bit) distinct colors. Not sure if it's any better
with true color terminals, probably not.

[1] https://gist.github.com/pclouds/616b67f0b5a9b20d74373286574cd0ac
-- 
Duy

^ permalink raw reply

* Corner case involving null sha1, alternates, cache misses, and submodule config API
From: Mike Hommey @ 2016-12-25  2:29 UTC (permalink / raw)
  To: hvoigt, sbeller, gitster; +Cc: git

Hi,

As you might be aware, I'm working on a mercurial remote helper for git.
The way it stores metadata for mercurial manifests abuses "commit"
references in trees, which are normally used for submodules.

Some operations in the helper use git diff-tree on those trees to find
files faster than just using ls-tree on every commit would.

Anyways, long story short, it turns out that a combination of
everything mentioned in the subject of this email causes running git
diff-tree -r --stdin with a list of 300k+ pairs of commits to take 10
minutes, when (after investigation) adding --ignore-submodules=dirty
made it take 1 minute instead, for the exact same 3GB output.

It turns out, this all starts in is_submodule_ignored(), which contains:

        if (!DIFF_OPT_TST(options, OVERRIDE_SUBMODULE_CONFIG))
                set_diffopt_flags_from_submodule_config(options, path);

And set_diffopt_flags_from_submodule_config calls:

        submodule_from_path(null_sha1, path);

And because there is no actual submodule involved, at some point that
null_sha1 ends up in the call to read_sha1_file from
submodule-config.c's config_from, which then proceeds to try to open the
null sha1 as a loose object in every alternate, doing multiple system
calls in each directory for something that is bound to fail. And to add
pain to injury, it repeats that for each and every line of input to git
diff-tree because the object cache doesn't care about storing negatives
(which makes perfect sense for most cases).

Even worse, when read_object returns NULL because the object doesn't
exist, read_sha1_file_extended calls has_loose_object which does
another set of system calls.

Now, while I realize my use case is very atypical, and that I should
just use --ignore-submodule=dirty, the fact that using the null sha1 can
trigger such behavior strikes me as a footgun that would be better
avoided. Especially when you factor the fact that
read_sha1_file_extended calls lookup_replace_object_extended, which
suggests one might interfere by creating a replace object for the null
sha1. (BTW, it's not entirely clear to me, in the context of actual
submodules, what the various --ignore-submodule options are supposed to
mean for trees that are not the current HEAD ; also, the manual page say
"all" is the default, but that doesn't appear to be true)

From a cursory look at the output of `git grep \\bnull_sha1` it doesn't
look like the null sha1 is used anywhere else in a similar fashion where
it can be attempted to be read as an object. So, one could consider this
is something the submodule config code should handle on its own by
treating the null_sha1 argument to submodule_from_path (really
config_from) specially. After all, gitmodule_sha1_from_commit already
avoids a get_sha1() call when it's given the null sha1.

OTOH, it seems submodule_from_path and submodule_from_name, the only two
public functions that end up in config_from(), are *always* called with
either the null sha1 or a literal null pointer. The *only* calls to
these functions that doesn't involve a null sha1 or a null pointer is
from test code. So all in all, I'm not entirely sure what this sha1
argument is all about in the first place.

However, an argument could be made that null_sha1 should be treated
specially at a lower level (read_sha1_file, I guess).

What would be sensible to do here?

Mike

^ permalink raw reply

* git blame while resolving merge conflicts
From: Salil Surendran @ 2016-12-25  8:44 UTC (permalink / raw)
  To: git

I often rebase a large open source project and there are merge
conflicts where I need to figure out who made the change and when it
order to decide as to which change to take. So generally what I do is
that I go to both repos and look at the file and do a git blame. Is
there a mergetool that will provide this info during conflict
resolution. I would like to know who made and this change and when for
each version. Right now I am using meld.

-- 
Thanks,
Salil
"The surest sign that intelligent life exists elsewhere in the
universe is that none of it has tried to contact us."

^ permalink raw reply

* [PATCH] gitk: use right colour for remote refs in the "Tags and heads" dialog
From: Paul Wise @ 2016-12-25 14:06 UTC (permalink / raw)
  To: git; +Cc: Paul Mackerras, Paul Wise

Makes it easier to see which refs are local and which refs are remote.
Adds consistency with the remote background colour in the graph display.

Signed-off-by: Paul Wise <pabs3@bonedaddy.net>
---
 gitk-git/gitk | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gitk-git/gitk b/gitk-git/gitk
index 805a1c703..8d2868148 100755
--- a/gitk-git/gitk
+++ b/gitk-git/gitk
@@ -3399,6 +3399,8 @@ set rectmask {
 }
 image create bitmap reficon-H -background black -foreground lime \
     -data $rectdata -maskdata $rectmask
+image create bitmap reficon-R -background black -foreground "#ffddaa" \
+    -data $rectdata -maskdata $rectmask
 image create bitmap reficon-o -background black -foreground "#ddddff" \
     -data $rectdata -maskdata $rectmask
 
@@ -9908,6 +9910,7 @@ proc sel_reflist {w x y} {
     set n [lindex $ref 0]
     switch -- [lindex $ref 1] {
 	"H" {selbyid $headids($n)}
+	"R" {selbyid $headids($n)}
 	"T" {selbyid $tagids($n)}
 	"o" {selbyid $otherrefids($n)}
     }
@@ -9937,7 +9940,11 @@ proc refill_reflist {} {
     foreach n [array names headids] {
 	if {[string match $reflistfilter $n]} {
 	    if {[commitinview $headids($n) $curview]} {
-		lappend refs [list $n H]
+		if {[string match "remotes/*" $n]} {
+		    lappend refs [list $n R]
+		} else {
+		    lappend refs [list $n H]
+		}
 	    } else {
 		interestedin $headids($n) {run refill_reflist}
 	    }
-- 
2.11.0


^ permalink raw reply related

* Re: [PATCH v2] log --graph: customize the graph lines with config log.graphColors
From: Junio C Hamano @ 2016-12-25 23:02 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
In-Reply-To: <20161224113817.18407-1-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

>  Sounds like the good first step should be something like this instead
>  of jumping straight to generating a new color palette automatically.

I like this not merely as a good first step but potentially a good
endgame.

> diff --git a/graph.c b/graph.c
> index d4e8519..9c58fd1 100644
> --- a/graph.c
> +++ b/graph.c
> @@ -79,6 +79,39 @@ static void graph_show_line_prefix(const struct diff_options *diffopt)
>  static const char **column_colors;
>  static unsigned short column_colors_max;
>  
> +static void set_column_colors_by_config(void)
> +{
> +	static char **colors;
> +	static int colors_max, colors_alloc;

I doubt you want these to be static (and allow multiple instances of
the same configuration variable to accumulate).  As you defined the
value to be comma-separated colors, users would want the usual "last
one wins" rule to override previous settings, no?

> +	char *string = NULL;
> +	const char *end, *start;
> +
> +	if (git_config_get_string("log.graphcolors", &string)) {
> +		graph_set_column_colors(column_colors_ansi,
> +					column_colors_ansi_max);

On the other hand, if you do want cumulative semantics, then you'd
need to free the previous set you held in the statics here, I would
think.

> +		return;
> +	}
> +
> +	start = string;
> +	end = string + strlen(string);
> +	while (start < end) {
> +		const char *comma = strchrnul(start, ',');
> +		char color[COLOR_MAXLEN];
> +
> +		if (!color_parse_mem(start, comma - start, color)) {
> +			ALLOC_GROW(colors, colors_max + 1, colors_alloc);
> +			colors[colors_max++] = xstrdup(color);
> +		} else
> +			warning(_("ignore invalid color '%.*s'"),
> +				(int)(comma - start), start);

"ignore invalid color"?  Who is giving the command to ignore to whom?

> +		start = comma + 1;
> +	}
> +	free(string);
> +	ALLOC_GROW(colors, colors_max + 1, colors_alloc);
> +	colors[colors_max] = xstrdup(GIT_COLOR_RESET);
> +	graph_set_column_colors((const char **)colors, colors_max);
> +}
> +
>  void graph_set_column_colors(const char **colors, unsigned short colors_max)
>  {
>  	column_colors = colors;
> @@ -239,8 +272,7 @@ struct git_graph *graph_init(struct rev_info *opt)
>  	struct git_graph *graph = xmalloc(sizeof(struct git_graph));
>  
>  	if (!column_colors)
> -		graph_set_column_colors(column_colors_ansi,
> -					column_colors_ansi_max);
> +		set_column_colors_by_config();

"by" in the function name somehow sounds funny, at least to me.

^ permalink raw reply

* git-apply: warn/fail on *changed* end of line (eol) *only*?
From: Igor Djordjevic BugA @ 2016-12-25 23:49 UTC (permalink / raw)
  To: git

Hi to all,

This is a follow-up of a message posted at git-users Google group[1],
as the topic may actually be better suited for this mailing list
instead. If needed, some elaboration on the context can be found there,
as well as a detailed example describing the motive for the question
itself (or directly here[2], second half of the message).

In short -- git-apply warns on applying the patch with CRLF line endings
(new), considered whitespace errors, even when previous hunk version
(old) has/had that very same CRLF line endings, too, so nothing actually
changed in this regards. Even worse, it happily applies a patch with LF
line endings (new) without any warning/hint, even though previous (old)
line endings were CRLF, thus effectively (and silently) breaking the
(previous) line endings.

I understand that what should be considered "correct" line endings can
be set explicitly (and should be communicated on the project level in
the first place), but would it make sense to have an additional
git-apply --whitespace option (like "warn-changed" and "error-changed",
or something) to warn/fail on *changed* end of line *only*, without any
further knowledge needed?

So not to have Git need to assume or know which line endings should be
considered "correct", nor to think/bother too much with it, but just to
warn/fail on actual line ending *change* (in comparison between old and
new part/hunk of the patch).

This seems more intuitive to me, and much more cross-platform
collaboration friendly, at least as an option, as one wouldn`t need to
think much about "correct" project/file line endings (which would still
be advised, anyway), but would be promptly warned about possibly
breaking previous/existing line endings, maybe even with an option to
keep those previous ones, or force the new ones... yet as I`m still new
around, I accept that I might be missing some bigger picture here :)

Thanks, BugA

P.S. I`m discussing git-apply and end-of-line handling here in
particular as that`s what I`m currently concerned with, and for the sake
of simplicity, but might be that whole thing could theoretically be
broadened to other Git tools (like git-diff) and whitespace (error)
handling in general, too (warn/fail only if actually introduced in new
hunk, not previously being present in the old one).
--
[1] https://groups.google.com/d/msg/git-users/9Os_48yJdn0/j9S_W7h-EAAJ
[2] https://groups.google.com/d/msg/git-users/9Os_48yJdn0/5S9Ja1fEEAAJ

^ permalink raw reply

* Re: git-apply: warn/fail on *changed* end of line (eol) *only*?
From: Igor Djordjevic BugA @ 2016-12-25 23:55 UTC (permalink / raw)
  To: git
In-Reply-To: <ac97f925-d930-0592-0a2a-66c9218b1417@gmail.com>

On 26/12/2016 00:49, Igor Djordjevic BugA wrote:
> This is a follow-up of a message posted at git-users Google group[1],
> as the topic may actually be better suited for this mailing list
> instead. If needed, some elaboration on the context can be found there,
> as well as a detailed example describing the motive for the question
> itself (or directly here[2], second half of the message).

For some reason reference links got excluded, so here they are again:

[1] https://groups.google.com/d/msg/git-users/9Os_48yJdn0/j9S_W7h-EAAJ
[2] https://groups.google.com/d/msg/git-users/9Os_48yJdn0/5S9Ja1fEEAAJ

^ permalink raw reply

* Re: git-apply: warn/fail on *changed* end of line (eol) *only*?
From: Junio C Hamano @ 2016-12-26  3:25 UTC (permalink / raw)
  To: Igor Djordjevic BugA; +Cc: git
In-Reply-To: <ac97f925-d930-0592-0a2a-66c9218b1417@gmail.com>

Igor Djordjevic BugA <igor.d.djordjevic@gmail.com> writes:

> In short -- git-apply warns on applying the patch with CRLF line endings
> (new), considered whitespace errors, even when previous hunk version
> (old) has/had that very same CRLF line endings, too, so nothing actually
> changed in this regards. Even worse, it happily applies a patch with LF
> line endings (new) without any warning/hint, even though previous (old)
> line endings were CRLF, thus effectively (and silently) breaking the
> (previous) line endings.

Let me see if I understood your problem description correctly.

Imagine that the project wants LF line endings, i.e. it considers
that a line with CRLF ending has an unwanted "whitespace" at the
end.  Now, you start from this source file:

    1 <CRLF>
    3 <CRLF>
    5 <CRLF>

and a patch like this comes in:

     1 <CRLF>
    -3 <CRLF>
    +three <CRLF>
     5 <CRLF>

You think that "3 <CRLF>" was replaced by "three <CRLF>", and the
claim is "the 'previous' contents already had <CRLF> ending, so the
change is not making things worse".

But what if the patch was like this?

     1 <CRLF>
    -3 <CRLF>
    +three <CRLF>
    +four <CRLF>
     5 <CRLF>

Do you want to warn on "four <CRLF>" because it does not have any
"previous" corresponding line?

Extending the thought further, which line do you want to warn and
which line do you not want to, if the patch were like this instead?

     1 <CRLF>
    -3 <CRLF>
    +four <CRLF>
    +three <CRLF>
     5 <CRLF>

Extending this thought experiment further, you would realize that
fundamentally the concept of "previous contents" has no sensible
definition.

Incidentally, not realizing "there is no sensible definition of
'previous contents'" is a source of another often seen confusion by
new users of Git and any version control systems regarding the
"blame" feature.  When given the final answer by "blame", they often
say "what content did the line have _before_ the commit blame gave
me?"  As there is no sensible definition of 'previous contents' that
corresponds to the line, that question does not in general have a
sensible answer.

^ permalink raw reply

* Re: [PATCH v2 02/21] config: add git_config_get_split_index()
From: Christian Couder @ 2016-12-26  8:15 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Christian Couder
In-Reply-To: <20161219111442.GA24125@ash>

On Mon, Dec 19, 2016 at 12:14 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 03:55:28PM +0100, Christian Couder wrote:
>> diff --git a/config.c b/config.c
>> index 2eaf8ad77a..c1343bbb3e 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -1709,6 +1709,16 @@ int git_config_get_untracked_cache(void)
>>       return -1; /* default value */
>>  }
>>
>> +int git_config_get_split_index(void)
>> +{
>> +     int val = -1;
>
> Is it redundant to set default value here because it's not used
> anywhere? The "return val;" will always have the new value from
> git_config_. And you don't use "val" in error case.

Yeah, it is redundant, so I will not set the default value here in the
next version.

^ permalink raw reply

* Re: [PATCH v2 04/21] read-cache: add and then use tweak_split_index()
From: Christian Couder @ 2016-12-26  8:20 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Christian Couder
In-Reply-To: <20161219111843.GB24125@ash>

On Mon, Dec 19, 2016 at 12:18 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 03:55:30PM +0100, Christian Couder wrote:
>> +static void tweak_split_index(struct index_state *istate)
>> +{
>> +     switch (git_config_get_split_index()) {
>> +     case -1: /* unset: do nothing */
>> +             break;
>> +     case 0: /* false */
>> +             remove_split_index(istate);
>> +             break;
>> +     case 1: /* true */
>> +             add_split_index(istate);
>> +             break;
>> +     default: /* unknown value: do nothing */
>
> Probably should die("BUG:") here since it looks to me like
> git_config_maybe_bool() (inside git_config_get_split_index) in this
> case has been updated to return more values than we can handle. And we
> need to know about that so we can handle it properly.

If we later add another value to the possible ones, like for example
"auto", we might not want old versions of Git to fail with a "BUG:..."
message when they read config files tweaked for newer Git versions, so
I don't think we should die() here.

A warning would be ok, but given that in tweak_untracked_cache() we
decided to do nothing, I am thinking that it would be more consistent
to just do nothing here too.

^ permalink raw reply

* Re: [PATCH v2 10/21] read-cache: regenerate shared index if necessary
From: Christian Couder @ 2016-12-26  8:33 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Christian Couder
In-Reply-To: <20161219114807.GC24125@ash>

On Mon, Dec 19, 2016 at 12:48 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 03:55:36PM +0100, Christian Couder wrote:
>> +static const int default_max_percent_split_change = 20;
>> +
>> +static int too_many_not_shared_entries(struct index_state *istate)
>> +{
>> +     int i, not_shared = 0;
>> +     int max_split = git_config_get_max_percent_split_change();
>> +
>> +     switch (max_split) {
>> +     case -1:
>> +             /* not or badly configured: use the default value */
>> +             max_split = default_max_percent_split_change;
>> +             break;
>> +     case 0:
>> +             return 1; /* 0% means always write a new shared index */
>> +     case 100:
>> +             return 0; /* 100% means never write a new shared index */
>
> I wonder if we really need to special case these here. If I read it
> correctly, the expression at the end of this function will return 1
> when max_split is 0, and 0 when max_split is 100 (not counting the
> case when cache_nr is zero).

It's better for performance if we can avoid computing the number of
unshared entries, which we can in case of 0 or 100.

> Perhaps it's good for documentation purpose.

Yeah, I think it's also good for documentation purpose.

> Though I find it hard to
> see a use case for max_split == 0. Always creating a new shared index
> sounds crazy.

Yeah, but perhaps to test shared index writing performance people
might want to use it. And I don't see any good way or any good reason
to disallow it.

^ permalink raw reply

* Re: [PATCH v2 16/21] read-cache: unlink old sharedindex files
From: Christian Couder @ 2016-12-26  9:54 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Christian Couder
In-Reply-To: <20161219115812.GD24125@ash>

On Mon, Dec 19, 2016 at 12:58 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 03:55:42PM +0100, Christian Couder wrote:
>> +static int clean_shared_index_files(const char *current_hex)
>> +{
>> +     struct dirent *de;
>> +     DIR *dir = opendir(get_git_dir());
>> +
>> +     if (!dir)
>> +             return error_errno(_("unable to open git dir: %s"), get_git_dir());
>> +
>> +     while ((de = readdir(dir)) != NULL) {
>> +             const char *sha1_hex;
>> +             if (!skip_prefix(de->d_name, "sharedindex.", &sha1_hex))
>> +                     continue;
>> +             if (!strcmp(sha1_hex, current_hex))
>
> fspathcmp since we're dealing fs paths?
>
> In theory we should be ok using strcmp, even on Windows because case
> is preserved, I think. It's only a problem when we write path 'abc'
> down and read 'ABC' back.
>
> Oh well.. your call because if you go with fspathcmp, skip_prefix can't
> be used either. A lot more changes for a very rare case.

I'd rather keep using skip_prefix() and strcmp().

I could find no place in the code where fspathcmp() is used to compare
dir entries, but a few where other *cmp() functions are used:

$ git grep '>d_name' | grep cmp
builtin/repack.c:               if (strncmp(e->d_name, buf.buf +
dirlen, prefixlen))
dir.c:  if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
sha1_name.c:                    if (memcmp(de->d_name, ds->hex_pfx +
2, ds->len - 2))

and also a few where skip_prefix() is used:

$ git grep -n '>d_name' | grep prefix
builtin/repack.c:60:            if (strncmp(e->d_name, buf.buf +
dirlen, prefixlen))
help.c:147:             if (!skip_prefix(de->d_name, prefix, &ent))

so I think it is ok to use skip_prefix() and strcmp().

>> +                     continue;
>> +             if (can_delete_shared_index(sha1_hex) > 0 &&
>
> Probably shorter to pass full d->name here so you don't have to do
> another git_path() in can_delete_

Yeah, you are right, I will do that.

^ permalink raw reply

* [PATCH v3 00/21] Add configuration options for split-index
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder

Goal
~~~~

We want to make it possible to use the split-index feature
automatically by just setting a new "core.splitIndex" configuration
variable to true.

This can be valuable as split-index can help significantly speed up
`git rebase` especially along with the work to libify `git apply`
that has been merged to master
(see https://github.com/git/git/commit/81358dc238372793b1590efa149cc1581d1fbd98)
and is now in v2.11.

Design
~~~~~~

The design is similar as the previous work that introduced
"core.untrackedCache". 

The new "core.splitIndex" configuration option can be either true,
false or undefined which is the default.

When it is true, the split index is created, if it does not already
exists, when the index is read. When it is false, the split index is
removed if it exists, when the index is read. Otherwise it is left as
is.

Along with this new configuration variable, the two following options
are also introduced:

    - splitIndex.maxPercentChange

    This is to avoid having too many changes accumulating in the split
    index while in split index mode. The git-update-index
    documentation says:

	If split-index mode is already enabled and `--split-index` is
	given again, all changes in $GIT_DIR/index are pushed back to
	the shared index file.

    but it is probably better to not expect the user to think about it
    and to have a mechanism that pushes back all changes to the shared
    index file automatically when some threshold is reached.

    The default threshold is when the number of entries in the split
    index file reaches 20% of the number of entries in the shared
    index file. The new "splitIndex.maxPercentChange" config option
    lets people tweak this value.

    - splitIndex.sharedIndexExpire

    To make sure that old sharedindex files are eventually removed
    when a new one has been created, we "touch" the shared index file
    every time a split index file using the shared index file is
    either created or read from. Then we can delete shared indexes
    with an mtime older than one week (by default), when we create a
    new shared index file. The new "splitIndex.sharedIndexExpire"
    config option lets people tweak this grace period.

    This idea was suggested by Duy in:

    https://public-inbox.org/git/CACsJy8BqMFASHf5kJgUh+bd7XG98CafNydE964VJyPXz-emEvA@mail.gmail.com/

    and after some experiments, I agree that it is much simpler than
    what I thought could be done during our discussion.

    Junio also thinks that we have to do "time-based GC" in:
 
    https://public-inbox.org/git/xmqqeg33ccjj.fsf@gitster.mtv.corp.google.com/

Note that this patch series doesn't address a leak when removing a
split-index, but Duy wrote that he has a patch to fix this leak:

https://public-inbox.org/git/CACsJy8AisF2ZVs7JdnVFp5wdskkbVQQQ=DBq5UzE1MOsCfBMtQ@mail.gmail.com/

Highlevel view of the patches in the series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Except for patch 1/21, there are 3 big steps, one for each new
configuration variable introduced.

There only a few small differences between this patch series and the
v2 patch series sent a few weeks ago. Only two commits have been
changed a little, as suggested by Duy.

    - Patch 1/21 marks a message for translation. It was a new patch
      in v2 and it can be applied separately. (The patch 1/19 in v1,
      which was a typo fix, has been merged separately into master
      already.)

Step 1 is:

    - Patches 2/21 to 5/21 introduce the functions that are reading
      the "core.splitIndex" configuration variable and tweaking the
      split index depending on its value.

    - Patch 6/21 adds a few tests for the new feature.

    - Patches 7/21 and 8/21 add some documentation for the new
      feature.

    The only change since v2 in this step is that an unnecessary
    variable initialization has been droped in 2/21, as suggested by
    Duy.

Step 2 is:

    - Patches 9/21 and 10/21 introduce the functions that are reading
      the "splitIndex.maxPercentChange" configuration variable and
      regenerating a new shared index file depending on its value.

    - Patch 11/21 adds a few tests for the new feature.

    - Patch 12/21 add some documentation for the new feature.

    There are no changes in this step since v2.

Step 3 is:

    - Patches 13/21 to 16/21 introduce the functions that are reading
      the "splitIndex.sharedIndexExpire" configuration variable and
      expiring old shared index files depending on its value.

    - Patch 17/21 adds a few tests for the new feature.

    - Patches 18/21 and 19/21 are new patches. They update the mtime
      of the shared index file when a split index based on the shared
      index file is read from. 18/21 is a refactoring to make the
      actual change in 19/21 easier.

    - Patches 20/21 and 21/21 add some documentation for the new
      feature.

    The only change since v2 in this step is that the full shared
    index path is passed to the can_delete_shared_index() function in
    16/21 as suggested by Duy.

Links
~~~~~

This patch series is also available here:

  https://github.com/chriscool/git/commits/config-split-index

The previous versions were:

  RFC: https://github.com/chriscool/git/commits/config-split-index7
  v1:  https://github.com/chriscool/git/commits/config-split-index72
  v2:  https://github.com/chriscool/git/commits/config-split-index99

On the mailing list the related patch series and discussions were:

  RFC: https://public-inbox.org/git/20160711172254.13439-1-chriscool@tuxfamily.org/
  v1:  https://public-inbox.org/git/20161023092648.12086-1-chriscool@tuxfamily.org/
  v2:  https://public-inbox.org/git/20161217145547.11748-1-chriscool@tuxfamily.org/

Christian Couder (21):
  config: mark an error message up for translation
  config: add git_config_get_split_index()
  split-index: add {add,remove}_split_index() functions
  read-cache: add and then use tweak_split_index()
  update-index: warn in case of split-index incoherency
  t1700: add tests for core.splitIndex
  Documentation/config: add information for core.splitIndex
  Documentation/git-update-index: talk about core.splitIndex config var
  config: add git_config_get_max_percent_split_change()
  read-cache: regenerate shared index if necessary
  t1700: add tests for splitIndex.maxPercentChange
  Documentation/config: add splitIndex.maxPercentChange
  sha1_file: make check_and_freshen_file() non static
  read-cache: touch shared index files when used
  config: add git_config_get_expiry() from gc.c
  read-cache: unlink old sharedindex files
  t1700: test shared index file expiration
  read-cache: refactor read_index_from()
  read-cache: use freshen_shared_index() in read_index_from()
  Documentation/config: add splitIndex.sharedIndexExpire
  Documentation/git-update-index: explain splitIndex.*

 Documentation/config.txt           |  28 +++++++
 Documentation/git-update-index.txt |  43 +++++++++--
 builtin/gc.c                       |  15 +---
 builtin/update-index.c             |  25 +++---
 cache.h                            |   8 ++
 config.c                           |  42 +++++++++-
 read-cache.c                       | 143 ++++++++++++++++++++++++++++++++--
 sha1_file.c                        |   2 +-
 split-index.c                      |  22 ++++++
 split-index.h                      |   2 +
 t/t1700-split-index.sh             | 154 +++++++++++++++++++++++++++++++++++++
 11 files changed, 442 insertions(+), 42 deletions(-)

-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply

* [PATCH v3 01/21] config: mark an error message up for translation
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 83fdecb1bc..2eaf8ad77a 100644
--- a/config.c
+++ b/config.c
@@ -1701,8 +1701,8 @@ int git_config_get_untracked_cache(void)
 		if (!strcasecmp(v, "keep"))
 			return -1;
 
-		error("unknown core.untrackedCache value '%s'; "
-		      "using 'keep' default value", v);
+		error(_("unknown core.untrackedCache value '%s'; "
+			"using 'keep' default value"), v);
 		return -1;
 	}
 
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 02/21] config: add git_config_get_split_index()
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

This new function will be used in a following commit to know
if we want to use the split index feature or not.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/cache.h b/cache.h
index a50a61a197..c126fe475e 100644
--- a/cache.h
+++ b/cache.h
@@ -1821,6 +1821,7 @@ extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
+extern int git_config_get_split_index(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 2eaf8ad77a..421e8c9da6 100644
--- a/config.c
+++ b/config.c
@@ -1709,6 +1709,16 @@ int git_config_get_untracked_cache(void)
 	return -1; /* default value */
 }
 
+int git_config_get_split_index(void)
+{
+	int val;
+
+	if (!git_config_get_maybe_bool("core.splitindex", &val))
+		return val;
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 03/21] split-index: add {add,remove}_split_index() functions
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 18 ++++++------------
 split-index.c          | 22 ++++++++++++++++++++++
 split-index.h          |  2 ++
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d530e89368..24fdadfa4b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,18 +1099,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
-		init_split_index(&the_index);
-		the_index.cache_changed |= SPLIT_INDEX_ORDERED;
-	} else if (!split_index && the_index.split_index) {
-		/*
-		 * can't discard_split_index(&the_index); because that
-		 * will destroy split_index->base->cache[], which may
-		 * be shared with the_index.cache[]. So yeah we're
-		 * leaking a bit here.
-		 */
-		the_index.split_index = NULL;
-		the_index.cache_changed |= SOMETHING_CHANGED;
-	}
+		if (the_index.split_index)
+			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
+		else
+			add_split_index(&the_index);
+	} else if (!split_index)
+		remove_split_index(&the_index);
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
diff --git a/split-index.c b/split-index.c
index 615f4cac05..f519e60f87 100644
--- a/split-index.c
+++ b/split-index.c
@@ -317,3 +317,25 @@ void replace_index_entry_in_base(struct index_state *istate,
 		istate->split_index->base->cache[new->index - 1] = new;
 	}
 }
+
+void add_split_index(struct index_state *istate)
+{
+	if (!istate->split_index) {
+		init_split_index(istate);
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
+	}
+}
+
+void remove_split_index(struct index_state *istate)
+{
+	if (istate->split_index) {
+		/*
+		 * can't discard_split_index(&the_index); because that
+		 * will destroy split_index->base->cache[], which may
+		 * be shared with the_index.cache[]. So yeah we're
+		 * leaking a bit here.
+		 */
+		istate->split_index = NULL;
+		istate->cache_changed |= SOMETHING_CHANGED;
+	}
+}
diff --git a/split-index.h b/split-index.h
index c1324f521a..df91c1bda8 100644
--- a/split-index.h
+++ b/split-index.h
@@ -31,5 +31,7 @@ void merge_base_index(struct index_state *istate);
 void prepare_to_write_split_index(struct index_state *istate);
 void finish_writing_split_index(struct index_state *istate);
 void discard_split_index(struct index_state *istate);
+void add_split_index(struct index_state *istate);
+void remove_split_index(struct index_state *istate);
 
 #endif
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 04/21] read-cache: add and then use tweak_split_index()
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

This will make us use the split-index feature or not depending
on the value of the "core.splitIndex" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index f92a912dcb..732b7fe611 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1566,10 +1566,27 @@ static void tweak_untracked_cache(struct index_state *istate)
 	}
 }
 
+static void tweak_split_index(struct index_state *istate)
+{
+	switch (git_config_get_split_index()) {
+	case -1: /* unset: do nothing */
+		break;
+	case 0: /* false */
+		remove_split_index(istate);
+		break;
+	case 1: /* true */
+		add_split_index(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
+
 static void post_read_index_from(struct index_state *istate)
 {
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
+	tweak_split_index(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 09/21] config: add git_config_get_max_percent_split_change()
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

This new function will be used in a following commit to get the
value of the "splitIndex.maxPercentChange" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/cache.h b/cache.h
index c126fe475e..e15b421b6f 100644
--- a/cache.h
+++ b/cache.h
@@ -1822,6 +1822,7 @@ extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
+extern int git_config_get_max_percent_split_change(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 421e8c9da6..cf212785bb 100644
--- a/config.c
+++ b/config.c
@@ -1719,6 +1719,21 @@ int git_config_get_split_index(void)
 	return -1; /* default value */
 }
 
+int git_config_get_max_percent_split_change(void)
+{
+	int val = -1;
+
+	if (!git_config_get_int("splitindex.maxpercentchange", &val)) {
+		if (0 <= val && val <= 100)
+			return val;
+
+		return error(_("splitIndex.maxPercentChange value '%d' "
+			       "should be between 0 and 100"), val);
+	}
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 10/21] read-cache: regenerate shared index if necessary
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

When writing a new split-index and there is a big number of cache
entries in the split-index compared to the shared index, it is a
good idea to regenerate the shared index.

By default when the ratio reaches 20%, we will push back all
the entries from the split-index into a new shared index file
instead of just creating a new split-index file.

The threshold can be configured using the
"splitIndex.maxPercentChange" config variable.

We need to adjust the existing tests in t1700 by setting
"splitIndex.maxPercentChange" to 100 at the beginning of t1700,
as the existing tests are assuming that the shared index is
regenerated only when `git update-index --split-index` is used.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           | 32 ++++++++++++++++++++++++++++++++
 t/t1700-split-index.sh |  1 +
 2 files changed, 33 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 732b7fe611..a1aaec5135 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2220,6 +2220,36 @@ static int write_shared_index(struct index_state *istate,
 	return ret;
 }
 
+static const int default_max_percent_split_change = 20;
+
+static int too_many_not_shared_entries(struct index_state *istate)
+{
+	int i, not_shared = 0;
+	int max_split = git_config_get_max_percent_split_change();
+
+	switch (max_split) {
+	case -1:
+		/* not or badly configured: use the default value */
+		max_split = default_max_percent_split_change;
+		break;
+	case 0:
+		return 1; /* 0% means always write a new shared index */
+	case 100:
+		return 0; /* 100% means never write a new shared index */
+	default:
+		; /* do nothing: just use the configured value */
+	}
+
+	/* Count not shared entries */
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+		if (!ce->index)
+			not_shared++;
+	}
+
+	return istate->cache_nr * max_split < not_shared * 100;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		       unsigned flags)
 {
@@ -2237,6 +2267,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		if ((v & 15) < 6)
 			istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	}
+	if (too_many_not_shared_entries(istate))
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index db8c39f446..507a1dd1ad 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
 sane_unset GIT_TEST_SPLIT_INDEX
 
 test_expect_success 'enable split index' '
+	git config splitIndex.maxPercentChange 100 &&
 	git update-index --split-index &&
 	test-dump-split-index .git/index >actual &&
 	indexversion=$(test-index-version <.git/index) &&
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 05/21] update-index: warn in case of split-index incoherency
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

When users are using `git update-index --(no-)split-index`, they
may expect the split-index feature to be used or not according to
the option they just used, but this might not be the case if the
new "core.splitIndex" config variable has been set. In this case
let's warn about what will happen and why.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 24fdadfa4b..d74d72cc7f 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,12 +1099,21 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
+		if (git_config_get_split_index() == 0)
+			warning(_("core.splitIndex is set to false; "
+				  "remove or change it, if you really want to "
+				  "enable split index"));
 		if (the_index.split_index)
 			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
 		else
 			add_split_index(&the_index);
-	} else if (!split_index)
+	} else if (!split_index) {
+		if (git_config_get_split_index() == 1)
+			warning(_("core.splitIndex is set to true; "
+				  "remove or change it, if you really want to "
+				  "disable split index"));
 		remove_split_index(&the_index);
+	}
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 11/21] t1700: add tests for splitIndex.maxPercentChange
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 507a1dd1ad..f03addf654 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -238,4 +238,76 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	: >three &&
+	git update-index --add three &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >four &&
+	git update-index --add four &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	four
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check behavior with splitIndex.maxPercentChange unset' '
+	git config --unset splitIndex.maxPercentChange &&
+	: >five &&
+	git update-index --add five &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >six &&
+	git update-index --add six &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	six
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check splitIndex.maxPercentChange set to 0' '
+	git config splitIndex.maxPercentChange 0 &&
+	: >seven &&
+	git update-index --add seven &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >eight &&
+	git update-index --add eight &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 15/21] config: add git_config_get_expiry() from gc.c
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

This function will be used in a following commit to get the expiration
time of the shared index files from the config, and it is generic
enough to be put in "config.c".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/gc.c | 15 ++-------------
 cache.h      |  3 +++
 config.c     | 13 +++++++++++++
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 069950d0b4..1e40d45aa2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,17 +62,6 @@ static void report_pack_garbage(unsigned seen_bits, const char *path)
 		string_list_append(&pack_garbage, path);
 }
 
-static void git_config_date_string(const char *key, const char **output)
-{
-	if (git_config_get_string_const(key, output))
-		return;
-	if (strcmp(*output, "now")) {
-		unsigned long now = approxidate("now");
-		if (approxidate(*output) >= now)
-			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
-	}
-}
-
 static void process_log_file(void)
 {
 	struct stat st;
@@ -111,8 +100,8 @@ static void gc_config(void)
 	git_config_get_int("gc.auto", &gc_auto_threshold);
 	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
 	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_date_string("gc.pruneexpire", &prune_expire);
-	git_config_date_string("gc.worktreepruneexpire", &prune_worktrees_expire);
+	git_config_get_expiry("gc.pruneexpire", &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
 	git_config(git_default_config, NULL);
 }
 
diff --git a/cache.h b/cache.h
index f442f28189..279415afbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1827,6 +1827,9 @@ extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
 
+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index cf212785bb..d6c8f8f3ba 100644
--- a/config.c
+++ b/config.c
@@ -1685,6 +1685,19 @@ int git_config_get_pathname(const char *key, const char **dest)
 	return ret;
 }
 
+int git_config_get_expiry(const char *key, const char **output)
+{
+	int ret = git_config_get_string_const(key, output);
+	if (ret)
+		return ret;
+	if (strcmp(*output, "now")) {
+		unsigned long now = approxidate("now");
+		if (approxidate(*output) >= now)
+			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
+	}
+	return ret;
+}
+
 int git_config_get_untracked_cache(void)
 {
 	int val = -1;
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 12/21] Documentation/config: add splitIndex.maxPercentChange
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 221c5982c0..e0f5a77980 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2773,6 +2773,19 @@ showbranch.default::
 	The default set of branches for linkgit:git-show-branch[1].
 	See linkgit:git-show-branch[1].
 
+splitIndex.maxPercentChange::
+	When the split index feature is used, this specifies the
+	percent of entries the split index can contain compared to the
+	whole number of entries in both the split index and the shared
+	index before a new shared index is written.
+	The value should be between 0 and 100. If the value is 0 then
+	a new shared index is always written, if it is 100 a new
+	shared index is never written.
+	By default the value is 20, so a new shared index is written
+	if the number of entries in the split index would be greater
+	than 20 percent of the total number of entries.
+	See linkgit:git-update-index[1].
+
 status.relativePaths::
 	By default, linkgit:git-status[1] shows paths relative to the
 	current directory. Setting this variable to `false` shows paths
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related

* [PATCH v3 19/21] read-cache: use freshen_shared_index() in read_index_from()
From: Christian Couder @ 2016-12-26 10:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161226102222.17150-1-chriscool@tuxfamily.org>

This way a share index file will not be garbage collected if
we still read from an index it is based from.

As we need to read the current index before creating a new
one, the tests have to be adjusted, so that we don't expect
an old shared index file to be deleted right away when we
create a new one.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           |  1 +
 t/t1700-split-index.sh | 14 +++++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 98ef1323d6..bf0ac1ce61 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1727,6 +1727,7 @@ int read_index_from(struct index_state *istate, const char *path)
 		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
 
+	freshen_shared_index(base_sha1_hex);
 	merge_base_index(istate);
 	post_read_index_from(istate);
 	return ret;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f448fc13cd..800f84a593 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -313,17 +313,17 @@ EOF
 test_expect_success 'shared index files expire after 7 days by default' '
 	: >ten &&
 	git update-index --add ten &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_under_7_days_ago=$((1-7*86400)) &&
 	test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
 	: >eleven &&
 	git update-index --add eleven &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_7_days_ago=$((-1-7*86400)) &&
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >twelve &&
 	git update-index --add twelve &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
@@ -331,12 +331,12 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >thirteen &&
 	git update-index --add thirteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_8_days_ago=$((-1-8*86400)) &&
 	test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
 	: >fourteen &&
 	git update-index --add fourteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
@@ -345,13 +345,13 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"
 	test-chmtime =$just_10_years_ago .git/sharedindex.* &&
 	: >fifteen &&
 	git update-index --add fifteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	git config splitIndex.sharedIndexExpire now &&
 	just_1_second_ago=-1 &&
 	test-chmtime =$just_1_second_ago .git/sharedindex.* &&
 	: >sixteen &&
 	git update-index --add sixteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_done
-- 
2.11.0.209.gda91e66374.dirty


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox