Git development
 help / color / mirror / Atom feed
* Re: [PATCH v15 08/27] bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C
From: Stephan Beyer @ 2016-12-17 19:55 UTC (permalink / raw)
  To: Pranit Bauva; +Cc: Git List
In-Reply-To: <CAFZEwPMwviof5jNvQnxFZYdw324XpbLSQ9mzEQjrCW-K4A+GCA@mail.gmail.com>

Hi Pranit,

On 12/16/2016 08:35 PM, Pranit Bauva wrote:
> On Thu, Nov 17, 2016 at 5:17 AM, Stephan Beyer <s-beyer@gmx.net> wrote:
>> On 10/14/2016 04:14 PM, Pranit Bauva wrote:
>>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>>> index d84ba86..c542e8b 100644
>>> --- a/builtin/bisect--helper.c
>>> +++ b/builtin/bisect--helper.c
>>> @@ -123,13 +123,40 @@ static int bisect_reset(const char *commit)
>>>       return bisect_clean_state();
>>>  }
>>>
>>> +static int is_expected_rev(const char *expected_hex)
>>> +{
>>> +     struct strbuf actual_hex = STRBUF_INIT;
>>> +     int res = 0;
>>> +     if (strbuf_read_file(&actual_hex, git_path_bisect_expected_rev(), 0) >= 40) {
>>> +             strbuf_trim(&actual_hex);
>>> +             res = !strcmp(actual_hex.buf, expected_hex);
>>> +     }
>>> +     strbuf_release(&actual_hex);
>>> +     return res;
>>> +}
>>
>> I am not sure it does what it should.
>>
>> I would expect the following behavior from this function:
>>  - file does not exist (or is "broken") => return 0
>>  - actual_hex != expected_hex => return 0
>>  - otherwise return 1
>>
>> If I am not wrong, the code does the following instead:
>>  - file does not exist (or is "broken") => return 0
>>  - actual_hex != expected_hex => return 1
>>  - otherwise => return 0
> 
> It seems that I didn't carefully see what the shell code is (or
> apparently did a mistake in understanding it ;)). I think the C
> version does exactly what the shell version does. Can you confirm it
> once again, please?

I check again...

The shell code does the following:
 - file does not exist or is not a regular file => return false
 - actual_hex != expected_hex => return false
 - otherwise => return true

(false means a value != 0, true means 0)

Your code does the following:
 - file cannot be read or is broken => return 0
 - actual_hex != expected_hex => return 0
 - otherwise => return 1

So you are very right, it seems I had a weird thinko (I probably missed
the "!" in front of strcmp or something) :)

Sorry for bothering,
  Stephan

^ permalink raw reply

* [PATCH] mailinfo.c: move side-effects outside of assert
From: Kyle J. McKay @ 2016-12-17 19:54 UTC (permalink / raw)
  To: Jonathan Tan, Junio C Hamano; +Cc: Git mailing list

Since 6b4b013f18 (mailinfo: handle in-body header continuations,
2016-09-20, v2.11.0) mailinfo.c has contained new code with an
assert of the form:

	assert(call_a_function(...))

The function in question, check_header, has side effects.  This
means that when NDEBUG is defined during a release build the
function call is omitted entirely, the side effects do not
take place and tests (fortunately) start failing.

Move the function call outside of the assert and assert on
the result of the function call instead so that the code
still works properly in a release build and passes the tests.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---

Notes:
    Please include this PATCH in 2.11.x maint

 mailinfo.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mailinfo.c b/mailinfo.c
index 2fb3877e..47442fb5 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -708,9 +708,12 @@ static int is_scissors_line(const char *line)
 
 static void flush_inbody_header_accum(struct mailinfo *mi)
 {
+	int okay;
+
 	if (!mi->inbody_header_accum.len)
 		return;
-	assert(check_header(mi, &mi->inbody_header_accum, mi->s_hdr_data, 0));
+	okay = check_header(mi, &mi->inbody_header_accum, mi->s_hdr_data, 0);
+	assert(okay);
 	strbuf_reset(&mi->inbody_header_accum);
 }
 
---

^ permalink raw reply related

* Re: [PATCH v15 08/27] bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C
From: Stephan Beyer @ 2016-12-17 19:42 UTC (permalink / raw)
  To: Pranit Bauva; +Cc: Git List
In-Reply-To: <CAFZEwPO2WgBjOnmvu1VOiz3PMYYx2mxircCWk+BWxmuunC=VQA@mail.gmail.com>

Hi Pranit,

On 12/16/2016 08:00 PM, Pranit Bauva wrote:
> On Wed, Dec 7, 2016 at 1:03 AM, Pranit Bauva <pranit.bauva@gmail.com> wrote:
>>> I don't understand why the return value is int and not void. To avoid a
>>> "return 0;" line when calling this function?
>>
>> Initially I thought I would be using the return value but now I
>> realize that it is meaningless to do so. Using void seems better. :)
> 
> I just recollected when I was creating the next iteration of this
> series that I will need that int.
> 
>>>> +     case CHECK_EXPECTED_REVS:
>>>> +             return check_expected_revs(argv, argc);
> 
> See this.

This does not show that you need the "int", it just shows that you use
the return value of the function. But this return value is (in the
original shell code as well as in your v15) always 0. That is a sign
that the "void" return value makes more sense. Of course, then the line
above must be changed to

+     case CHECK_EXPECTED_REVS:
+             check_expected_revs(argv, argc);
+             return 0;

By the way, it also seems that the original function name
"check_expected_revs" was not the best choice (because the function
always returns 0, but the "check" implies that some semantically true or
false is returned).
On the other hand, it might also be useful (I cannot tell) to return
different values in check_expected_revs() — to signal the caller that
something changed or something did not change — but then ignore the
return value.

Best
  Stephan

^ permalink raw reply

* Re: test failure
From: Lars Schneider @ 2016-12-17 16:11 UTC (permalink / raw)
  To: Ramsay Jones, Jeff King; +Cc: Junio C Hamano, GIT Mailing-list
In-Reply-To: <50C75781-FE3B-410F-9866-63342607707B@gmail.com>


> On 17 Dec 2016, at 15:28, Lars Schneider <larsxschneider@gmail.com> wrote:
> 
> 
>> On 16 Dec 2016, at 21:32, Ramsay Jones <ramsay@ramsayjones.plus.com> wrote:
>> 
>> Hi Lars,
>> 
>> For the last two days, I've noticed t0021.15 on the 'pu' branch has been failing intermittently (well it fails with: 'make test >ptest-out', but
>> when run by hand, it fails only say 1-in-6, 1-in-18, etc.).
>> 
>> [yes, it's a bit strange; this hasn't changed in a couple of weeks!]
>> 
>> I don't have time to investigate further tonight and, since I had not
>> heard anyone else complain, I thought I should let you know.
>> 
>> See below for the output from a failing run. [Note: this is on Linux
>> Mint 18, tonight's pu branch @7c7984401].
> 
> Thanks Ramsay! 
> 
> I was able to reproduce the problem with this test:
> 
> 	test_expect_success 'ramsay-report' '
> 		test_config_global filter.protocol.clean cat &&
> 		git init &&
> 		echo "*.r filter=protocol" >.gitattributes &&
> 		echo "bla" >test.r &&
> 		git add . &&
> 		GIT_TRACE=1 git commit -m "test commit 2" > trace 2>&1 &&
> 		grep "run_command" trace
> 	'
> 
> It looks like as if Git occasionally forgets to run the clean filter.
> I bisected the problem and I think the problem starts with "diff: do not 
> reuse worktree files that need "clean" conversion" (06dec439a3) which
> definitively sounds related.
> 
> Back in June I reported that Git invokes the clean process 4 times if a
> single file is added. Peff took a closer look and suggested the patch
> mentioned above to remove one unnecessary invocation. I re-read his comments
> and everything sounds still reasonable to me:
> http://public-inbox.org/git/1469134747-26785-1-git-send-email-larsxschneider@gmail.com/#t
> 
> Does anyone have a clue what is going on? 
> I keep digging...

Ugh. I stopped coding, started cleaning the house, and it hit me:
"git commit" shouldn't call the filter anyways. I suspect it is called in
my tests because I add and commit the file to the index right after its 
creation. All this usually happens within 1 second and therefore Git cannot
know if the file was modified between "add" and "commit". That's why it needs
to run "clean" again.

I will adjust the tests.

Cheers,
Lars

^ permalink raw reply

* Re: [PATCH v2 00/21] Add configuration options for split-index
From: Christian Couder @ 2016-12-17 15:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

> The previous versions were:
>
>   RFC: https://github.com/chriscool/git/commits/config-split-index7
>   v1:  https://github.com/chriscool/git/commits/config-split-index72

The diff since v1 is:

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8fbef25cb1..52a3cac4ff 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2782,9 +2782,9 @@ splitIndex.sharedIndexExpire::
     index file is created. The value "now" expires all entries
     immediately, and "never" suppresses expiration altogether.
     The default value is "one.week.ago".
-    Note that each time a new split-index file is created, the
-    mtime of the related shared index file is updated to the
-    current time.
+    Note that each time a split index based on a shared index file
+    is either created or read from, the mtime of the shared index
+    file is updated to the current time.
     See linkgit:git-update-index[1].

 status.relativePaths::
diff --git a/Documentation/git-update-index.txt
b/Documentation/git-update-index.txt
index 635d1574b2..46c953b2f2 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -407,10 +407,14 @@ specified by the splitIndex.maxPercentChange
config variable (see
 linkgit:git-config[1]).

 Each time a new shared index file is created, the old shared index
-files are deleted if they are older than what is specified by the
-splitIndex.sharedIndexExpire config variable (see
+files are deleted if their mtime is older than what is specified by
+the splitIndex.sharedIndexExpire config variable (see
 linkgit:git-config[1]).

+To avoid deleting a shared index file that is still used, its mtime is
+updated to the current time everytime a new split index based on the
+shared index file is either created or read from.
+
 Untracked cache
 ---------------

diff --git a/builtin/gc.c b/builtin/gc.c
index c1e9602892..1e40d45aa2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -100,8 +100,8 @@ static void gc_config(void)
     git_config_get_int("gc.auto", &gc_auto_threshold);
     git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
     git_config_get_bool("gc.autodetach", &detach_auto);
-    git_config_get_date_string("gc.pruneexpire", &prune_expire);
-    git_config_get_date_string("gc.worktreepruneexpire",
&prune_worktrees_expire);
+    git_config_get_expiry("gc.pruneexpire", &prune_expire);
+    git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
     git_config(git_default_config, NULL);
 }

diff --git a/builtin/update-index.c b/builtin/update-index.c
index a14dbf2612..dc1fd0d44d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1099,18 +1099,18 @@ int cmd_update_index(int argc, const char
**argv, const char *prefix)

     if (split_index > 0) {
         if (git_config_get_split_index() == 0)
-            warning("core.splitIndex is set to false; "
-                "remove or change it, if you really want to "
-                "enable split index");
+            warning(_("core.splitIndex is set to false; "
+                  "remove or change it, if you really want to "
+                  "enable split index"));
         if (the_index.split_index)
             the_index.cache_changed |= SPLIT_INDEX_ORDERED;
         else
             add_split_index(&the_index);
     } else if (!split_index) {
         if (git_config_get_split_index() == 1)
-            warning("core.splitIndex is set to true; "
-                "remove or change it, if you really want to "
-                "disable split index");
+            warning(_("core.splitIndex is set to true; "
+                  "remove or change it, if you really want to "
+                  "disable split index"));
         remove_split_index(&the_index);
     }

diff --git a/cache.h b/cache.h
index 8e26aaf05e..279415afbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1823,11 +1823,13 @@ extern int git_config_get_bool(const char
*key, int *dest);
 extern int git_config_get_bool_or_int(const char *key, int *is_bool,
int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
-extern int git_config_get_date_string(const char *key, const char **output);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);

+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index f88c61bb30..5c52cefd78 100644
--- a/config.c
+++ b/config.c
@@ -1685,7 +1685,7 @@ int git_config_get_pathname(const char *key,
const char **dest)
     return ret;
 }

-int git_config_get_date_string(const char *key, const char **output)
+int git_config_get_expiry(const char *key, const char **output)
 {
     int ret = git_config_get_string_const(key, output);
     if (ret)
@@ -1714,8 +1714,8 @@ int git_config_get_untracked_cache(void)
         if (!strcasecmp(v, "keep"))
             return -1;

-        error("unknown core.untrackedCache value '%s'; "
-              "using 'keep' default value", v);
+        error(_("unknown core.untrackedCache value '%s'; "
+            "using 'keep' default value"), v);
         return -1;
     }

@@ -1740,9 +1740,8 @@ int git_config_get_max_percent_split_change(void)
         if (0 <= val && val <= 100)
             return val;

-        error("splitindex.maxpercentchange value '%d' "
-              "should be between 0 and 100", val);
-        return -1;
+        return error(_("splitIndex.maxPercentChange value '%d' "
+                   "should be between 0 and 100"), val);
     }

     return -1; /* default value */
diff --git a/read-cache.c b/read-cache.c
index 9727efcb5b..35377f0a3e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1685,10 +1685,25 @@ int do_read_index(struct index_state *istate,
const char *path, int must_exist)
     die("index file corrupt");
 }

+/*
+ * Signal that the shared index is used by updating its mtime.
+ *
+ * This way, shared index can be removed if they have not been used
+ * for some time. It's ok to fail to update the mtime if we are on a
+ * read only file system.
+ */
+void freshen_shared_index(char *base_sha1_hex)
+{
+    const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+    check_and_freshen_file(shared_index, 1);
+}
+
 int read_index_from(struct index_state *istate, const char *path)
 {
     struct split_index *split_index;
     int ret;
+    char *base_sha1_hex;
+    const char *base_path;

     /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
     if (istate->initialized)
@@ -1706,15 +1721,16 @@ int read_index_from(struct index_state
*istate, const char *path)
         discard_index(split_index->base);
     else
         split_index->base = xcalloc(1, sizeof(*split_index->base));
-    ret = do_read_index(split_index->base,
-                git_path("sharedindex.%s",
-                     sha1_to_hex(split_index->base_sha1)), 1);
+
+    base_sha1_hex = sha1_to_hex(split_index->base_sha1);
+    base_path = git_path("sharedindex.%s", base_sha1_hex);
+    ret = do_read_index(split_index->base, base_path, 1);
     if (hashcmp(split_index->base_sha1, split_index->base->sha1))
         die("broken index, expect %s in %s, got %s",
-            sha1_to_hex(split_index->base_sha1),
-            git_path("sharedindex.%s",
-                 sha1_to_hex(split_index->base_sha1)),
+            base_sha1_hex, base_path,
             sha1_to_hex(split_index->base->sha1));
+
+    freshen_shared_index(base_sha1_hex);
     merge_base_index(istate);
     post_read_index_from(istate);
     return ret;
@@ -2205,8 +2221,8 @@ static unsigned long get_shared_index_expire_date(void)
     static int shared_index_expire_date_prepared;

     if (!shared_index_expire_date_prepared) {
-        git_config_get_date_string("splitindex.sharedindexexpire",
-                       &shared_index_expire);
+        git_config_get_expiry("splitindex.sharedindexexpire",
+                      &shared_index_expire);
         shared_index_expire_date = approxidate(shared_index_expire);
         shared_index_expire_date_prepared = 1;
     }
@@ -2225,22 +2241,20 @@ static int can_delete_shared_index(const char
*shared_sha1_hex)
     if (!expiration)
         return 0;
     if (stat(shared_index, &st))
-        return error_errno("could not stat '%s", shared_index);
+        return error_errno(_("could not stat '%s"), shared_index);
     if (st.st_mtime > expiration)
         return 0;

     return 1;
 }

-static void clean_shared_index_files(const char *current_hex)
+static int clean_shared_index_files(const char *current_hex)
 {
     struct dirent *de;
     DIR *dir = opendir(get_git_dir());

-    if (!dir) {
-        error_errno("unable to open git dir: %s", get_git_dir());
-        return;
-    }
+    if (!dir)
+        return error_errno(_("unable to open git dir: %s"), get_git_dir());

     while ((de = readdir(dir)) != NULL) {
         const char *sha1_hex;
@@ -2250,9 +2264,11 @@ static void clean_shared_index_files(const char
*current_hex)
             continue;
         if (can_delete_shared_index(sha1_hex) > 0 &&
             unlink(git_path("%s", de->d_name)))
-            error_errno("unable to unlink: %s", git_path("%s", de->d_name));
+            error_errno(_("unable to unlink: %s"), git_path("%s", de->d_name));
     }
     closedir(dir);
+
+    return 0;
 }

 static struct tempfile temporary_sharedindex;
@@ -2286,7 +2302,7 @@ static int write_shared_index(struct index_state *istate,

 static const int default_max_percent_split_change = 20;

-int too_many_not_shared_entries(struct index_state *istate)
+static int too_many_not_shared_entries(struct index_state *istate)
 {
     int i, not_shared = 0;
     int max_split = git_config_get_max_percent_split_change();
@@ -2331,17 +2347,14 @@ int write_locked_index(struct index_state
*istate, struct lock_file *lock,
         if ((v & 15) < 6)
             istate->cache_changed |= SPLIT_INDEX_ORDERED;
     }
-    if (istate->cache_changed & SPLIT_INDEX_ORDERED ||
-        too_many_not_shared_entries(istate)) {
+    if (too_many_not_shared_entries(istate))
+        istate->cache_changed |= SPLIT_INDEX_ORDERED;
+    if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
         int ret = write_shared_index(istate, lock, flags);
         if (ret)
             return ret;
     } else {
-        /* Signal that the shared index is used */
-        const char *shared_index = git_path("sharedindex.%s",
-                            sha1_to_hex(si->base_sha1));
-        if (!check_and_freshen_file(shared_index, 1))
-            warning("could not freshen '%s'", shared_index);
+        freshen_shared_index(sha1_to_hex(si->base_sha1));
     }

     return write_split_index(istate, lock, flags);
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f448fc13cd..800f84a593 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -313,17 +313,17 @@ EOF
 test_expect_success 'shared index files expire after 7 days by default' '
     : >ten &&
     git update-index --add ten &&
-    test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+    test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
     just_under_7_days_ago=$((1-7*86400)) &&
     test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
     : >eleven &&
     git update-index --add eleven &&
-    test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+    test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
     just_over_7_days_ago=$((-1-7*86400)) &&
     test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
     : >twelve &&
     git update-index --add twelve &&
-    test $(ls .git/sharedindex.* | wc -l) = 1
+    test $(ls .git/sharedindex.* | wc -l) -le 2
 '

 test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
@@ -331,12 +331,12 @@ test_expect_success 'check
splitIndex.sharedIndexExpire set to 8 days' '
     test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
     : >thirteen &&
     git update-index --add thirteen &&
-    test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+    test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
     just_over_8_days_ago=$((-1-8*86400)) &&
     test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
     : >fourteen &&
     git update-index --add fourteen &&
-    test $(ls .git/sharedindex.* | wc -l) = 1
+    test $(ls .git/sharedindex.* | wc -l) -le 2
 '

 test_expect_success 'check splitIndex.sharedIndexExpire set to
"never" and "now"' '
@@ -345,13 +345,13 @@ test_expect_success 'check
splitIndex.sharedIndexExpire set to "never" and "now"
     test-chmtime =$just_10_years_ago .git/sharedindex.* &&
     : >fifteen &&
     git update-index --add fifteen &&
-    test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+    test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
     git config splitIndex.sharedIndexExpire now &&
     just_1_second_ago=-1 &&
     test-chmtime =$just_1_second_ago .git/sharedindex.* &&
     : >sixteen &&
     git update-index --add sixteen &&
-    test $(ls .git/sharedindex.* | wc -l) = 1
+    test $(ls .git/sharedindex.* | wc -l) -le 2
 '

 test_done

^ permalink raw reply related

* [PATCH v2 07/21] Documentation/config: add information for core.splitIndex
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a0ab66aae7..dc44d8a417 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -331,6 +331,10 @@ core.trustctime::
 	crawlers and some backup systems).
 	See linkgit:git-update-index[1]. True by default.
 
+core.splitIndex::
+	If true, the split-index feature of the index will be used.
+	See linkgit:git-update-index[1]. False by default.
+
 core.untrackedCache::
 	Determines what to do about the untracked cache feature of the
 	index. It will be kept, if this variable is unset or set to
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 03/21] split-index: add {add,remove}_split_index() functions
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 18 ++++++------------
 split-index.c          | 22 ++++++++++++++++++++++
 split-index.h          |  2 ++
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index f3f07e7f1c..b75ea037dd 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1098,18 +1098,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
-		init_split_index(&the_index);
-		the_index.cache_changed |= SPLIT_INDEX_ORDERED;
-	} else if (!split_index && the_index.split_index) {
-		/*
-		 * can't discard_split_index(&the_index); because that
-		 * will destroy split_index->base->cache[], which may
-		 * be shared with the_index.cache[]. So yeah we're
-		 * leaking a bit here.
-		 */
-		the_index.split_index = NULL;
-		the_index.cache_changed |= SOMETHING_CHANGED;
-	}
+		if (the_index.split_index)
+			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
+		else
+			add_split_index(&the_index);
+	} else if (!split_index)
+		remove_split_index(&the_index);
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
diff --git a/split-index.c b/split-index.c
index 615f4cac05..f519e60f87 100644
--- a/split-index.c
+++ b/split-index.c
@@ -317,3 +317,25 @@ void replace_index_entry_in_base(struct index_state *istate,
 		istate->split_index->base->cache[new->index - 1] = new;
 	}
 }
+
+void add_split_index(struct index_state *istate)
+{
+	if (!istate->split_index) {
+		init_split_index(istate);
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
+	}
+}
+
+void remove_split_index(struct index_state *istate)
+{
+	if (istate->split_index) {
+		/*
+		 * can't discard_split_index(&the_index); because that
+		 * will destroy split_index->base->cache[], which may
+		 * be shared with the_index.cache[]. So yeah we're
+		 * leaking a bit here.
+		 */
+		istate->split_index = NULL;
+		istate->cache_changed |= SOMETHING_CHANGED;
+	}
+}
diff --git a/split-index.h b/split-index.h
index c1324f521a..df91c1bda8 100644
--- a/split-index.h
+++ b/split-index.h
@@ -31,5 +31,7 @@ void merge_base_index(struct index_state *istate);
 void prepare_to_write_split_index(struct index_state *istate);
 void finish_writing_split_index(struct index_state *istate);
 void discard_split_index(struct index_state *istate);
+void add_split_index(struct index_state *istate);
+void remove_split_index(struct index_state *istate);
 
 #endif
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 04/21] read-cache: add and then use tweak_split_index()
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This will make us use the split-index feature or not depending
on the value of the "core.splitIndex" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index db5d910642..79aae6bd20 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1569,10 +1569,27 @@ static void tweak_untracked_cache(struct index_state *istate)
 	}
 }
 
+static void tweak_split_index(struct index_state *istate)
+{
+	switch (git_config_get_split_index()) {
+	case -1: /* unset: do nothing */
+		break;
+	case 0: /* false */
+		remove_split_index(istate);
+		break;
+	case 1: /* true */
+		add_split_index(istate);
+		break;
+	default: /* unknown value: do nothing */
+		break;
+	}
+}
+
 static void post_read_index_from(struct index_state *istate)
 {
 	check_ce_order(istate);
 	tweak_untracked_cache(istate);
+	tweak_split_index(istate);
 }
 
 /* remember to discard_cache() before reading a different cache! */
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 10/21] read-cache: regenerate shared index if necessary
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

When writing a new split-index and there is a big number of cache
entries in the split-index compared to the shared index, it is a
good idea to regenerate the shared index.

By default when the ratio reaches 20%, we will push back all
the entries from the split-index into a new shared index file
instead of just creating a new split-index file.

The threshold can be configured using the
"splitIndex.maxPercentChange" config variable.

We need to adjust the existing tests in t1700 by setting
"splitIndex.maxPercentChange" to 100 at the beginning of t1700,
as the existing tests are assuming that the shared index is
regenerated only when `git update-index --split-index` is used.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           | 32 ++++++++++++++++++++++++++++++++
 t/t1700-split-index.sh |  1 +
 2 files changed, 33 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 79aae6bd20..280b01bd6c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2223,6 +2223,36 @@ static int write_shared_index(struct index_state *istate,
 	return ret;
 }
 
+static const int default_max_percent_split_change = 20;
+
+static int too_many_not_shared_entries(struct index_state *istate)
+{
+	int i, not_shared = 0;
+	int max_split = git_config_get_max_percent_split_change();
+
+	switch (max_split) {
+	case -1:
+		/* not or badly configured: use the default value */
+		max_split = default_max_percent_split_change;
+		break;
+	case 0:
+		return 1; /* 0% means always write a new shared index */
+	case 100:
+		return 0; /* 100% means never write a new shared index */
+	default:
+		; /* do nothing: just use the configured value */
+	}
+
+	/* Count not shared entries */
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+		if (!ce->index)
+			not_shared++;
+	}
+
+	return istate->cache_nr * max_split < not_shared * 100;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		       unsigned flags)
 {
@@ -2240,6 +2270,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		if ((v & 15) < 6)
 			istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	}
+	if (too_many_not_shared_entries(istate))
+		istate->cache_changed |= SPLIT_INDEX_ORDERED;
 	if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index db8c39f446..507a1dd1ad 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
 sane_unset GIT_TEST_SPLIT_INDEX
 
 test_expect_success 'enable split index' '
+	git config splitIndex.maxPercentChange 100 &&
 	git update-index --split-index &&
 	test-dump-split-index .git/index >actual &&
 	indexversion=$(test-index-version <.git/index) &&
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 15/21] config: add git_config_get_expiry() from gc.c
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This function will be used in a following commit to get the expiration
time of the shared index files from the config, and it is generic
enough to be put in "config.c".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/gc.c | 15 ++-------------
 cache.h      |  3 +++
 config.c     | 13 +++++++++++++
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 069950d0b4..1e40d45aa2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,17 +62,6 @@ static void report_pack_garbage(unsigned seen_bits, const char *path)
 		string_list_append(&pack_garbage, path);
 }
 
-static void git_config_date_string(const char *key, const char **output)
-{
-	if (git_config_get_string_const(key, output))
-		return;
-	if (strcmp(*output, "now")) {
-		unsigned long now = approxidate("now");
-		if (approxidate(*output) >= now)
-			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
-	}
-}
-
 static void process_log_file(void)
 {
 	struct stat st;
@@ -111,8 +100,8 @@ static void gc_config(void)
 	git_config_get_int("gc.auto", &gc_auto_threshold);
 	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
 	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_date_string("gc.pruneexpire", &prune_expire);
-	git_config_date_string("gc.worktreepruneexpire", &prune_worktrees_expire);
+	git_config_get_expiry("gc.pruneexpire", &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
 	git_config(git_default_config, NULL);
 }
 
diff --git a/cache.h b/cache.h
index f442f28189..279415afbd 100644
--- a/cache.h
+++ b/cache.h
@@ -1827,6 +1827,9 @@ extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
 extern int git_config_get_max_percent_split_change(void);
 
+/* This dies if the configured or default date is in the future */
+extern int git_config_get_expiry(const char *key, const char **output);
+
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
  * ensure that they do not modify the untracked cache when reading it.
diff --git a/config.c b/config.c
index 3e96c223f5..5c52cefd78 100644
--- a/config.c
+++ b/config.c
@@ -1685,6 +1685,19 @@ int git_config_get_pathname(const char *key, const char **dest)
 	return ret;
 }
 
+int git_config_get_expiry(const char *key, const char **output)
+{
+	int ret = git_config_get_string_const(key, output);
+	if (ret)
+		return ret;
+	if (strcmp(*output, "now")) {
+		unsigned long now = approxidate("now");
+		if (approxidate(*output) >= now)
+			git_die_config(key, _("Invalid %s: '%s'"), key, *output);
+	}
+	return ret;
+}
+
 int git_config_get_untracked_cache(void)
 {
 	int val = -1;
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 18/21] read-cache: refactor read_index_from()
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

It looks better and is simpler to review when we don't compute
the same things many times in the function.

It will also help make the following commit simpler.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 8e99a7c325..96e097e341 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1702,6 +1702,8 @@ int read_index_from(struct index_state *istate, const char *path)
 {
 	struct split_index *split_index;
 	int ret;
+	char *base_sha1_hex;
+	const char *base_path;
 
 	/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
 	if (istate->initialized)
@@ -1719,15 +1721,15 @@ int read_index_from(struct index_state *istate, const char *path)
 		discard_index(split_index->base);
 	else
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
-	ret = do_read_index(split_index->base,
-			    git_path("sharedindex.%s",
-				     sha1_to_hex(split_index->base_sha1)), 1);
+
+	base_sha1_hex = sha1_to_hex(split_index->base_sha1);
+	base_path = git_path("sharedindex.%s", base_sha1_hex);
+	ret = do_read_index(split_index->base, base_path, 1);
 	if (hashcmp(split_index->base_sha1, split_index->base->sha1))
 		die("broken index, expect %s in %s, got %s",
-		    sha1_to_hex(split_index->base_sha1),
-		    git_path("sharedindex.%s",
-			     sha1_to_hex(split_index->base_sha1)),
+		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
+
 	merge_base_index(istate);
 	post_read_index_from(istate);
 	return ret;
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 13/21] sha1_file: make check_and_freshen_file() non static
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This function will be used in a commit soon, so let's make
it available globally.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h     | 3 +++
 sha1_file.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index e15b421b6f..f442f28189 100644
--- a/cache.h
+++ b/cache.h
@@ -1170,6 +1170,9 @@ extern int has_pack_index(const unsigned char *sha1);
 
 extern void assert_sha1_type(const unsigned char *sha1, enum object_type expect);
 
+/* Helper to check and "touch" a file */
+extern int check_and_freshen_file(const char *fn, int freshen);
+
 extern const signed char hexval_table[256];
 static inline unsigned int hexval(unsigned char c)
 {
diff --git a/sha1_file.c b/sha1_file.c
index 9c86d1924a..7e5846d4f9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -576,7 +576,7 @@ static int freshen_file(const char *fn)
  * either does not exist on disk, or has a stale mtime and may be subject to
  * pruning).
  */
-static int check_and_freshen_file(const char *fn, int freshen)
+int check_and_freshen_file(const char *fn, int freshen)
 {
 	if (access(fn, F_OK))
 		return 0;
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 05/21] update-index: warn in case of split-index incoherency
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

When users are using `git update-index --(no-)split-index`, they
may expect the split-index feature to be used or not according to
the option they just used, but this might not be the case if the
new "core.splitIndex" config variable has been set. In this case
let's warn about what will happen and why.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/update-index.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index b75ea037dd..dc1fd0d44d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1098,12 +1098,21 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (split_index > 0) {
+		if (git_config_get_split_index() == 0)
+			warning(_("core.splitIndex is set to false; "
+				  "remove or change it, if you really want to "
+				  "enable split index"));
 		if (the_index.split_index)
 			the_index.cache_changed |= SPLIT_INDEX_ORDERED;
 		else
 			add_split_index(&the_index);
-	} else if (!split_index)
+	} else if (!split_index) {
+		if (git_config_get_split_index() == 1)
+			warning(_("core.splitIndex is set to true; "
+				  "remove or change it, if you really want to "
+				  "disable split index"));
 		remove_split_index(&the_index);
+	}
 
 	switch (untracked_cache) {
 	case UC_UNSPECIFIED:
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 11/21] t1700: add tests for splitIndex.maxPercentChange
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 507a1dd1ad..f03addf654 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -238,4 +238,76 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	: >three &&
+	git update-index --add three &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >four &&
+	git update-index --add four &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	four
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check behavior with splitIndex.maxPercentChange unset' '
+	git config --unset splitIndex.maxPercentChange &&
+	: >five &&
+	git update-index --add five &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >six &&
+	git update-index --add six &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	six
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'check splitIndex.maxPercentChange set to 0' '
+	git config splitIndex.maxPercentChange 0 &&
+	: >seven &&
+	git update-index --add seven &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual &&
+	: >eight &&
+	git update-index --add eight &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 20/21] Documentation/config: add splitIndex.sharedIndexExpire
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 08f638c65c..8fbef25cb1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2776,6 +2776,17 @@ splitIndex.maxPercentChange::
 	than 20 percent of the total number of entries.
 	See linkgit:git-update-index[1].
 
+splitIndex.sharedIndexExpire::
+	When the split index feature is used, shared index files with
+	a mtime older than this time will be removed when a new shared
+	index file is created. The value "now" expires all entries
+	immediately, and "never" suppresses expiration altogether.
+	The default value is "one.week.ago".
+	Note that each time a new split-index file is created, the
+	mtime of the related shared index file is updated to the
+	current time.
+	See linkgit:git-update-index[1].
+
 status.relativePaths::
 	By default, linkgit:git-status[1] shows paths relative to the
 	current directory. Setting this variable to `false` shows paths
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 21/21] Documentation/git-update-index: explain splitIndex.*
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt           |  6 +++---
 Documentation/git-update-index.txt | 37 +++++++++++++++++++++++++++++--------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8fbef25cb1..52a3cac4ff 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2782,9 +2782,9 @@ splitIndex.sharedIndexExpire::
 	index file is created. The value "now" expires all entries
 	immediately, and "never" suppresses expiration altogether.
 	The default value is "one.week.ago".
-	Note that each time a new split-index file is created, the
-	mtime of the related shared index file is updated to the
-	current time.
+	Note that each time a split index based on a shared index file
+	is either created or read from, the mtime of the shared index
+	file is updated to the current time.
 	See linkgit:git-update-index[1].
 
 status.relativePaths::
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index e091b2a409..46c953b2f2 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -163,14 +163,10 @@ may not support it yet.
 
 --split-index::
 --no-split-index::
-	Enable or disable split index mode. If enabled, the index is
-	split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>.
-	Changes are accumulated in $GIT_DIR/index while the shared
-	index file contains all index entries stays unchanged. If
-	split-index mode is already enabled and `--split-index` is
-	given again, all changes in $GIT_DIR/index are pushed back to
-	the shared index file. This mode is designed for very large
-	indexes that take a significant amount of time to read or write.
+	Enable or disable split index mode. If split-index mode is
+	already enabled and `--split-index` is given again, all
+	changes in $GIT_DIR/index are pushed back to the shared index
+	file.
 +
 These options take effect whatever the value of the `core.splitIndex`
 configuration variable (see linkgit:git-config[1]). But a warning is
@@ -394,6 +390,31 @@ Although this bit looks similar to assume-unchanged bit, its goal is
 different from assume-unchanged bit's. Skip-worktree also takes
 precedence over assume-unchanged bit when both are set.
 
+Split index
+-----------
+
+This mode is designed for very large indexes that take a significant
+amount of time to read or write.
+
+In this mode, the index is split into two files, $GIT_DIR/index and
+$GIT_DIR/sharedindex.<SHA-1>. Changes are accumulated in
+$GIT_DIR/index, the split index, while the shared index file contains
+all index entries and stays unchanged.
+
+All changes in the split index are pushed back to the shared index
+file when the number of entries in the split index reaches a level
+specified by the splitIndex.maxPercentChange config variable (see
+linkgit:git-config[1]).
+
+Each time a new shared index file is created, the old shared index
+files are deleted if their mtime is older than what is specified by
+the splitIndex.sharedIndexExpire config variable (see
+linkgit:git-config[1]).
+
+To avoid deleting a shared index file that is still used, its mtime is
+updated to the current time everytime a new split index based on the
+shared index file is either created or read from.
+
 Untracked cache
 ---------------
 
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 19/21] read-cache: use freshen_shared_index() in read_index_from()
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This way a share index file will not be garbage collected if
we still read from an index it is based from.

As we need to read the current index before creating a new
one, the tests have to be adjusted, so that we don't expect
an old shared index file to be deleted right away when we
create a new one.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c           |  1 +
 t/t1700-split-index.sh | 14 +++++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 96e097e341..35377f0a3e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1730,6 +1730,7 @@ int read_index_from(struct index_state *istate, const char *path)
 		    base_sha1_hex, base_path,
 		    sha1_to_hex(split_index->base->sha1));
 
+	freshen_shared_index(base_sha1_hex);
 	merge_base_index(istate);
 	post_read_index_from(istate);
 	return ret;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f448fc13cd..800f84a593 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -313,17 +313,17 @@ EOF
 test_expect_success 'shared index files expire after 7 days by default' '
 	: >ten &&
 	git update-index --add ten &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_under_7_days_ago=$((1-7*86400)) &&
 	test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
 	: >eleven &&
 	git update-index --add eleven &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_7_days_ago=$((-1-7*86400)) &&
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >twelve &&
 	git update-index --add twelve &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
@@ -331,12 +331,12 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
 	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
 	: >thirteen &&
 	git update-index --add thirteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	just_over_8_days_ago=$((-1-8*86400)) &&
 	test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
 	: >fourteen &&
 	git update-index --add fourteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
@@ -345,13 +345,13 @@ test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"
 	test-chmtime =$just_10_years_ago .git/sharedindex.* &&
 	: >fifteen &&
 	git update-index --add fifteen &&
-	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 2 &&
 	git config splitIndex.sharedIndexExpire now &&
 	just_1_second_ago=-1 &&
 	test-chmtime =$just_1_second_ago .git/sharedindex.* &&
 	: >sixteen &&
 	git update-index --add sixteen &&
-	test $(ls .git/sharedindex.* | wc -l) = 1
+	test $(ls .git/sharedindex.* | wc -l) -le 2
 '
 
 test_done
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 17/21] t1700: test shared index file expiration
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f03addf654..f448fc13cd 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -310,4 +310,48 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'shared index files expire after 7 days by default' '
+	: >ten &&
+	git update-index --add ten &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_under_7_days_ago=$((1-7*86400)) &&
+	test-chmtime =$just_under_7_days_ago .git/sharedindex.* &&
+	: >eleven &&
+	git update-index --add eleven &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_over_7_days_ago=$((-1-7*86400)) &&
+	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
+	: >twelve &&
+	git update-index --add twelve &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to 8 days' '
+	git config splitIndex.sharedIndexExpire "8.days.ago" &&
+	test-chmtime =$just_over_7_days_ago .git/sharedindex.* &&
+	: >thirteen &&
+	git update-index --add thirteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	just_over_8_days_ago=$((-1-8*86400)) &&
+	test-chmtime =$just_over_8_days_ago .git/sharedindex.* &&
+	: >fourteen &&
+	git update-index --add fourteen &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
+test_expect_success 'check splitIndex.sharedIndexExpire set to "never" and "now"' '
+	git config splitIndex.sharedIndexExpire never &&
+	just_10_years_ago=$((-365*10*86400)) &&
+	test-chmtime =$just_10_years_ago .git/sharedindex.* &&
+	: >fifteen &&
+	git update-index --add fifteen &&
+	test $(ls .git/sharedindex.* | wc -l) -gt 1 &&
+	git config splitIndex.sharedIndexExpire now &&
+	just_1_second_ago=-1 &&
+	test-chmtime =$just_1_second_ago .git/sharedindex.* &&
+	: >sixteen &&
+	git update-index --add sixteen &&
+	test $(ls .git/sharedindex.* | wc -l) = 1
+'
+
 test_done
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 16/21] read-cache: unlink old sharedindex files
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Everytime split index is turned on, it creates a "sharedindex.XXXX"
file in the git directory. This change makes sure that shared index
files that haven't been used for a long time are removed when a new
shared index file is created.

The new "splitIndex.sharedIndexExpire" config variable is created
to tell the delay after which an unused shared index file can be
deleted. It defaults to "1.week.ago".

A previous commit made sure that each time a split index file is
created the mtime of the shared index file it references is updated.
This makes sure that recently used shared index file will not be
deleted.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 42688261f7..8e99a7c325 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2210,6 +2210,64 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
+static const char *shared_index_expire = "1.week.ago";
+
+static unsigned long get_shared_index_expire_date(void)
+{
+	static unsigned long shared_index_expire_date;
+	static int shared_index_expire_date_prepared;
+
+	if (!shared_index_expire_date_prepared) {
+		git_config_get_expiry("splitindex.sharedindexexpire",
+				      &shared_index_expire);
+		shared_index_expire_date = approxidate(shared_index_expire);
+		shared_index_expire_date_prepared = 1;
+	}
+
+	return shared_index_expire_date;
+}
+
+static int can_delete_shared_index(const char *shared_sha1_hex)
+{
+	struct stat st;
+	unsigned long expiration;
+	const char *shared_index = git_path("sharedindex.%s", shared_sha1_hex);
+
+	/* Check timestamp */
+	expiration = get_shared_index_expire_date();
+	if (!expiration)
+		return 0;
+	if (stat(shared_index, &st))
+		return error_errno(_("could not stat '%s"), shared_index);
+	if (st.st_mtime > expiration)
+		return 0;
+
+	return 1;
+}
+
+static int clean_shared_index_files(const char *current_hex)
+{
+	struct dirent *de;
+	DIR *dir = opendir(get_git_dir());
+
+	if (!dir)
+		return error_errno(_("unable to open git dir: %s"), get_git_dir());
+
+	while ((de = readdir(dir)) != NULL) {
+		const char *sha1_hex;
+		if (!skip_prefix(de->d_name, "sharedindex.", &sha1_hex))
+			continue;
+		if (!strcmp(sha1_hex, current_hex))
+			continue;
+		if (can_delete_shared_index(sha1_hex) > 0 &&
+		    unlink(git_path("%s", de->d_name)))
+			error_errno(_("unable to unlink: %s"), git_path("%s", de->d_name));
+	}
+	closedir(dir);
+
+	return 0;
+}
+
 static struct tempfile temporary_sharedindex;
 
 static int write_shared_index(struct index_state *istate,
@@ -2231,8 +2289,11 @@ static int write_shared_index(struct index_state *istate,
 	}
 	ret = rename_tempfile(&temporary_sharedindex,
 			      git_path("sharedindex.%s", sha1_to_hex(si->base->sha1)));
-	if (!ret)
+	if (!ret) {
 		hashcpy(si->base_sha1, si->base->sha1);
+		clean_shared_index_files(sha1_to_hex(si->base->sha1));
+	}
+
 	return ret;
 }
 
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 14/21] read-cache: touch shared index files when used
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

When a split-index file is created, let's update the mtime of the
shared index file that the split-index file is referencing.

In a following commit we will make shared index file expire
depending on their mtime, so updating the mtime makes sure that
the shared index file will not be deleted soon.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 read-cache.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 280b01bd6c..42688261f7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1685,6 +1685,19 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	die("index file corrupt");
 }
 
+/*
+ * Signal that the shared index is used by updating its mtime.
+ *
+ * This way, shared index can be removed if they have not been used
+ * for some time. It's ok to fail to update the mtime if we are on a
+ * read only file system.
+ */
+void freshen_shared_index(char *base_sha1_hex)
+{
+	const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+	check_and_freshen_file(shared_index, 1);
+}
+
 int read_index_from(struct index_state *istate, const char *path)
 {
 	struct split_index *split_index;
@@ -2276,6 +2289,8 @@ int write_locked_index(struct index_state *istate, struct lock_file *lock,
 		int ret = write_shared_index(istate, lock, flags);
 		if (ret)
 			return ret;
+	} else {
+		freshen_shared_index(sha1_to_hex(si->base_sha1));
 	}
 
 	return write_split_index(istate, lock, flags);
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 12/21] Documentation/config: add splitIndex.maxPercentChange
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc44d8a417..08f638c65c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2763,6 +2763,19 @@ showbranch.default::
 	The default set of branches for linkgit:git-show-branch[1].
 	See linkgit:git-show-branch[1].
 
+splitIndex.maxPercentChange::
+	When the split index feature is used, this specifies the
+	percent of entries the split index can contain compared to the
+	whole number of entries in both the split index and the shared
+	index before a new shared index is written.
+	The value should be between 0 and 100. If the value is 0 then
+	a new shared index is always written, if it is 100 a new
+	shared index is never written.
+	By default the value is 20, so a new shared index is written
+	if the number of entries in the split index would be greater
+	than 20 percent of the total number of entries.
+	See linkgit:git-update-index[1].
+
 status.relativePaths::
 	By default, linkgit:git-status[1] shows paths relative to the
 	current directory. Setting this variable to `false` shows paths
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 08/21] Documentation/git-update-index: talk about core.splitIndex config var
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-update-index.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 7386c93162..e091b2a409 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -171,6 +171,12 @@ may not support it yet.
 	given again, all changes in $GIT_DIR/index are pushed back to
 	the shared index file. This mode is designed for very large
 	indexes that take a significant amount of time to read or write.
++
+These options take effect whatever the value of the `core.splitIndex`
+configuration variable (see linkgit:git-config[1]). But a warning is
+emitted when the change goes against the configured value, as the
+configured value will take effect next time the index is read and this
+will remove the intended effect of the option.
 
 --untracked-cache::
 --no-untracked-cache::
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 09/21] config: add git_config_get_max_percent_split_change()
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This new function will be used in a following commit to get the
value of the "splitIndex.maxPercentChange" config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/cache.h b/cache.h
index c126fe475e..e15b421b6f 100644
--- a/cache.h
+++ b/cache.h
@@ -1822,6 +1822,7 @@ extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
 extern int git_config_get_split_index(void);
+extern int git_config_get_max_percent_split_change(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index c1343bbb3e..3e96c223f5 100644
--- a/config.c
+++ b/config.c
@@ -1719,6 +1719,21 @@ int git_config_get_split_index(void)
 	return -1; /* default value */
 }
 
+int git_config_get_max_percent_split_change(void)
+{
+	int val = -1;
+
+	if (!git_config_get_int("splitindex.maxpercentchange", &val)) {
+		if (0 <= val && val <= 100)
+			return val;
+
+		return error(_("splitIndex.maxPercentChange value '%d' "
+			       "should be between 0 and 100"), val);
+	}
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 06/21] t1700: add tests for core.splitIndex
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t1700-split-index.sh | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 292a0720fc..db8c39f446 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -200,4 +200,41 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'set core.splitIndex config variable to true' '
+	git config core.splitIndex true &&
+	: >three &&
+	git update-index --add three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	three
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	BASE=$(test-dump-split-index .git/index | grep "^base") &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+$BASE
+replacements:
+deletions:
+EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'set core.splitIndex config variable to false' '
+	git config core.splitIndex false &&
+	git update-index --force-remove three &&
+	git ls-files --stage >ls-files.actual &&
+	cat >ls-files.expect <<EOF &&
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	one
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	two
+EOF
+	test_cmp ls-files.expect ls-files.actual &&
+	test-dump-split-index .git/index | sed "/^own/d" >actual &&
+	cat >expect <<EOF &&
+not a split index
+EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related

* [PATCH v2 02/21] config: add git_config_get_split_index()
From: Christian Couder @ 2016-12-17 14:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyen Thai Ngoc Duy,
	Ævar Arnfjörð Bjarmason, Christian Couder
In-Reply-To: <20161217145547.11748-1-chriscool@tuxfamily.org>

This new function will be used in a following commit to know
if we want to use the split index feature or not.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 cache.h  |  1 +
 config.c | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/cache.h b/cache.h
index a50a61a197..c126fe475e 100644
--- a/cache.h
+++ b/cache.h
@@ -1821,6 +1821,7 @@ extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest);
 extern int git_config_get_maybe_bool(const char *key, int *dest);
 extern int git_config_get_pathname(const char *key, const char **dest);
 extern int git_config_get_untracked_cache(void);
+extern int git_config_get_split_index(void);
 
 /*
  * This is a hack for test programs like test-dump-untracked-cache to
diff --git a/config.c b/config.c
index 2eaf8ad77a..c1343bbb3e 100644
--- a/config.c
+++ b/config.c
@@ -1709,6 +1709,16 @@ int git_config_get_untracked_cache(void)
 	return -1; /* default value */
 }
 
+int git_config_get_split_index(void)
+{
+	int val = -1;
+
+	if (!git_config_get_maybe_bool("core.splitindex", &val))
+		return val;
+
+	return -1; /* default value */
+}
+
 NORETURN
 void git_die_config_linenr(const char *key, const char *filename, int linenr)
 {
-- 
2.11.0.49.g2414764.dirty


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox