Git development
 help / color / mirror / Atom feed
* [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area
From: Johannes Schindelin @ 2017-02-14 11:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff Hostetler

On Windows, calls to memihash() and maintaining the istate.name_hash and
istate.dir_hash HashMaps take significant time on very large
repositories. This series of changes reduces the overall time taken for
various operations by reducing the number calls to memihash(), moving
some of them into multi-threaded code, and etc.

Note: one commenter in https://github.com/git-for-windows/git/pull/964
pointed out that memihash() only handles ASCII correctly. That is true.
And fixing this is outside the purview of this patch series.

[jes: renamed and reformatted a few places, and replaced global
constants by 1-bit fields, in the hopes to make the contribution a
smoother ride.]


Jeff Hostetler (5):
  name-hash: eliminate duplicate memihash call
  hashmap: allow memihash computation to be continued
  name-hash: precompute hash values during preload-index
  name-hash: specify initial size for istate.dir_hash table
  name-hash: remember previous dir_entry during lazy_init_name_hash

 cache.h         |   6 +++
 hashmap.c       |  14 +++++++
 hashmap.h       |   2 +
 name-hash.c     | 116 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 preload-index.c |   2 +
 read-cache.c    |   3 ++
 6 files changed, 127 insertions(+), 16 deletions(-)


base-commit: 5588dbffbd61e4906e453808c6ad32f792fea521
Published-As: https://github.com/dscho/git/releases/tag/memihash-perf-v1
Fetch-It-Via: git fetch https://github.com/dscho/git memihash-perf-v1

-- 
2.11.1.windows.1


^ permalink raw reply

* [PATCH 1/5] name-hash: eliminate duplicate memihash call
From: Johannes Schindelin @ 2017-02-14 11:31 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>

From: Jeff Hostetler <jeffhost@microsoft.com>

The existing code called memihash() to pass to the find_dir_entry()
function, and if not found, called memihash() again to pass to the
hashmap_add() function. Remove that duplicate memihash() call in
hash_dir_entry().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 name-hash.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/name-hash.c b/name-hash.c
index 6d9f23e9325..ad0bc0cef73 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -23,15 +23,21 @@ static int dir_entry_cmp(const struct dir_entry *e1,
 			name ? name : e2->name, e1->namelen);
 }
 
-static struct dir_entry *find_dir_entry(struct index_state *istate,
-		const char *name, unsigned int namelen)
+static struct dir_entry *find_dir_entry_1(struct index_state *istate,
+		const char *name, unsigned int namelen, unsigned int hash)
 {
 	struct dir_entry key;
-	hashmap_entry_init(&key, memihash(name, namelen));
+	hashmap_entry_init(&key, hash);
 	key.namelen = namelen;
 	return hashmap_get(&istate->dir_hash, &key, name);
 }
 
+static struct dir_entry *find_dir_entry(struct index_state *istate,
+		const char *name, unsigned int namelen)
+{
+	return find_dir_entry_1(istate, name, namelen, memihash(name, namelen));
+}
+
 static struct dir_entry *hash_dir_entry(struct index_state *istate,
 		struct cache_entry *ce, int namelen)
 {
@@ -43,6 +49,7 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 	 * in index_state.name_hash (as ordinary cache_entries).
 	 */
 	struct dir_entry *dir;
+	unsigned int hash;
 
 	/* get length of parent directory */
 	while (namelen > 0 && !is_dir_sep(ce->name[namelen - 1]))
@@ -52,11 +59,12 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 	namelen--;
 
 	/* lookup existing entry for that directory */
-	dir = find_dir_entry(istate, ce->name, namelen);
+	hash = memihash(ce->name, namelen);
+	dir = find_dir_entry_1(istate, ce->name, namelen, hash);
 	if (!dir) {
 		/* not found, create it and add to hash table */
 		FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
-		hashmap_entry_init(dir, memihash(ce->name, namelen));
+		hashmap_entry_init(dir, hash);
 		dir->namelen = namelen;
 		hashmap_add(&istate->dir_hash, dir);
 
-- 
2.11.1.windows.1



^ permalink raw reply related

* [PATCH 2/5] hashmap: allow memihash computation to be continued
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>

From: Jeff Hostetler <jeffhost@microsoft.com>

There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory. This can be expensive on large repositories.

With the new memihash_continue() function, we can hash the parent
directory first, and reuse that computed hash for all directory
entries.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 hashmap.c | 14 ++++++++++++++
 hashmap.h |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/hashmap.c b/hashmap.c
index b10b642229c..061b7d61da6 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -50,6 +50,20 @@ unsigned int memihash(const void *buf, size_t len)
 	return hash;
 }
 
+/* Incoporate another chunk of data into a memihash computation. */
+unsigned int memihash_continue(unsigned int hash,
+			       const void *buf, size_t len)
+{
+	const unsigned char *p = buf;
+	while (len--) {
+		unsigned int c = *p++;
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
 #define HASHMAP_INITIAL_SIZE 64
 /* grow / shrink by 2^2 */
 #define HASHMAP_RESIZE_BITS 2
diff --git a/hashmap.h b/hashmap.h
index ab7958ae333..78e14dfde71 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -12,6 +12,8 @@ extern unsigned int strhash(const char *buf);
 extern unsigned int strihash(const char *buf);
 extern unsigned int memhash(const void *buf, size_t len);
 extern unsigned int memihash(const void *buf, size_t len);
+extern unsigned int memihash_continue(unsigned int hash_seed,
+				      const void *buf, size_t len);
 
 static inline unsigned int sha1hash(const unsigned char *sha1)
 {
-- 
2.11.1.windows.1



^ permalink raw reply related

* [PATCH 4/5] name-hash: specify initial size for istate.dir_hash table
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>

From: Jeff Hostetler <jeffhost@microsoft.com>

Specify an initial size for the istate.dir_hash hash map matching the
size of the istate.name_hash.

Previously hashmap_init() was given 0, causing a 64 bucket hashmap to be
created. When working with very large repositories, this would cause
numerous rehash() calls to realloc and rebalance the hashmap. This is
especially true when the worktree is deep, with many directories
containing a only a few files each.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 name-hash.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/name-hash.c b/name-hash.c
index 49eb84846df..8f8336cc868 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -143,7 +143,8 @@ static void lazy_init_name_hash(struct index_state *istate)
 		return;
 	hashmap_init(&istate->name_hash, (hashmap_cmp_fn) cache_entry_cmp,
 			istate->cache_nr);
-	hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp, 0);
+	hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp,
+			istate->cache_nr);
 	for (nr = 0; nr < istate->cache_nr; nr++)
 		hash_index_entry(istate, istate->cache[nr]);
 	istate->name_hash_initialized = 1;
-- 
2.11.1.windows.1



^ permalink raw reply related

* [BUG] Memory corruption crash with "git bisect start"
From: Maxim Kuvyrkov @ 2017-02-14 11:56 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

I'm seeing the following memory corruption crash on a script-constructed repo when starting git bisect.  I'm seeing this crash both with system git of Ubuntu Xenial and with freshly-compiled git master from the official repo.

The log of the crash is attached.

It is possible that the bug is in Xenial glibc, in which case -- please let me know and I will take it up with the glibc developers.

To reproduce the crash:

<snip>
# Create a repo with 42 files, and change all of them.
rm -rf .git
git init
for i in `seq 1 42`; do echo 0 > f$i; git add f$i; echo 1 > f$i; done
git commit -m 0

# Download script to generate bisect history:
pushd ../
wget https://git.linaro.org/people/maxim.kuvyrkov/tree-bisect.git/plain/construct-tree.sh
popd

# Construct tree for bisect.  This script creates a history graph for all permutations of changed files up to "2" in depth.
# The goal of the script is to assist in reducing testcases for compiler/toolchain development.
../construct-tree.sh -d 2

# Trigger crash
git bisect start bottom top
</snip>

It is interesting that "42" number files is boundary.  The crash does not occur with history generated from 41 or fewer files, and only occurs with 42 or more files.  It appears that the contents of the files is not relevant for the crash.

--
Maxim Kuvyrkov
www.linaro.org



[-- Attachment #2: git-bisect-crash.log --]
[-- Type: application/octet-stream, Size: 10480 bytes --]

maxim.kuvyrkov@2492b3a161d0:~/tmp/test-bisect$ git bisect start bottom top
Previous HEAD position was 0520f90... 9
Switched to branch '42/1_2_3_4_5_6_7_8_9_10_11_12_13_14_15_16_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31_32_33_34_35_36_37_38_39_40_41_42'
*** buffer overflow detected ***: git terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f86df8017e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f86df8a256c]
/lib/x86_64-linux-gnu/libc.so.6(+0x116570)[0x7f86df8a0570]
git[0x474d21]
git[0x406365]
git[0x4066a8]
git[0x40584d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f86df7aa830]
git[0x405899]
======= Memory map: ========
00400000-005e1000 r-xp 00000000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e0000-007e1000 r--p 001e0000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e1000-007ea000 rw-p 001e1000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007ea000-00830000 rw-p 00000000 00:00 0 
00b0c000-00b2d000 rw-p 00000000 00:00 0                                  [heap]
7f86df2f0000-7f86df306000 r-xp 00000000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df306000-7f86df505000 ---p 00016000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df505000-7f86df506000 rw-p 00015000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df506000-7f86df586000 rw-p 00000000 00:00 0 
7f86df586000-7f86df589000 r-xp 00000000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df589000-7f86df788000 ---p 00003000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df788000-7f86df789000 r--p 00002000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df789000-7f86df78a000 rw-p 00003000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df78a000-7f86df949000 r-xp 00000000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f86df949000-7f86dfb49000 ---p 001bf000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb49000-7f86dfb4d000 r--p 001bf000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb4d000-7f86dfb4f000 rw-p 001c3000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb4f000-7f86dfb53000 rw-p 00000000 00:00 0 
7f86dfb53000-7f86dfb5a000 r-xp 00000000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfb5a000-7f86dfd59000 ---p 00007000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd59000-7f86dfd5a000 r--p 00006000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd5a000-7f86dfd5b000 rw-p 00007000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd5b000-7f86dfd73000 r-xp 00000000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dfd73000-7f86dff72000 ---p 00018000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff72000-7f86dff73000 r--p 00017000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff73000-7f86dff74000 rw-p 00018000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff74000-7f86dff78000 rw-p 00000000 00:00 0 
7f86dff78000-7f86e0192000 r-xp 00000000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e0192000-7f86e0391000 ---p 0021a000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e0391000-7f86e03ad000 r--p 00219000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e03ad000-7f86e03b9000 rw-p 00235000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e03b9000-7f86e03bc000 rw-p 00000000 00:00 0 
7f86e03bc000-7f86e03d5000 r-xp 00000000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e03d5000-7f86e05d4000 ---p 00019000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d4000-7f86e05d5000 r--p 00018000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d5000-7f86e05d6000 rw-p 00019000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d6000-7f86e05fc000 r-xp 00000000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f86e0641000-7f86e07d9000 r--p 00000000 00:25 156                        /usr/lib/locale/locale-archive
7f86e07d9000-7f86e07de000 rw-p 00000000 00:00 0 
7f86e07f8000-7f86e07fb000 rw-p 00000000 00:00 0 
7f86e07fb000-7f86e07fc000 r--p 00025000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f86e07fc000-7f86e07fd000 rw-p 00026000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f86e07fd000-7f86e07fe000 rw-p 00000000 00:00 0 
7ffd4fa11000-7ffd4fa32000 rw-p 00000000 00:00 0                          [stack]
7ffd4fbed000-7ffd4fbef000 r-xp 00000000 00:00 0                          [vdso]
7ffd4fbef000-7ffd4fbf1000 r--p 00000000 00:00 0                          [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted (core dumped)
*** buffer overflow detected ***: git terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f9515a267e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f9515ac756c]
/lib/x86_64-linux-gnu/libc.so.6(+0x116570)[0x7f9515ac5570]
git[0x474d21]
git[0x406365]
git[0x4066a8]
git[0x40584d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f95159cf830]
git[0x405899]
======= Memory map: ========
00400000-005e1000 r-xp 00000000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e0000-007e1000 r--p 001e0000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e1000-007ea000 rw-p 001e1000 08:05 2192298551                         /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007ea000-00830000 rw-p 00000000 00:00 0 
01059000-0107a000 rw-p 00000000 00:00 0                                  [heap]
7f9515515000-7f951552b000 r-xp 00000000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951552b000-7f951572a000 ---p 00016000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951572a000-7f951572b000 rw-p 00015000 00:25 3394                       /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951572b000-7f95157ab000 rw-p 00000000 00:00 0 
7f95157ab000-7f95157ae000 r-xp 00000000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f95157ae000-7f95159ad000 ---p 00003000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159ad000-7f95159ae000 r--p 00002000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159ae000-7f95159af000 rw-p 00003000 00:25 36                         /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159af000-7f9515b6e000 r-xp 00000000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f9515b6e000-7f9515d6e000 ---p 001bf000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d6e000-7f9515d72000 r--p 001bf000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d72000-7f9515d74000 rw-p 001c3000 00:25 38                         /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d74000-7f9515d78000 rw-p 00000000 00:00 0 
7f9515d78000-7f9515d7f000 r-xp 00000000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f9515d7f000-7f9515f7e000 ---p 00007000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f7e000-7f9515f7f000 r--p 00006000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f7f000-7f9515f80000 rw-p 00007000 00:25 147                        /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f80000-7f9515f98000 r-xp 00000000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9515f98000-7f9516197000 ---p 00018000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516197000-7f9516198000 r--p 00017000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516198000-7f9516199000 rw-p 00018000 00:25 80                         /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516199000-7f951619d000 rw-p 00000000 00:00 0 
7f951619d000-7f95163b7000 r-xp 00000000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95163b7000-7f95165b6000 ---p 0021a000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165b6000-7f95165d2000 r--p 00219000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165d2000-7f95165de000 rw-p 00235000 00:25 172                        /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165de000-7f95165e1000 rw-p 00000000 00:00 0 
7f95165e1000-7f95165fa000 r-xp 00000000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95165fa000-7f95167f9000 ---p 00019000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167f9000-7f95167fa000 r--p 00018000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167fa000-7f95167fb000 rw-p 00019000 00:25 174                        /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167fb000-7f9516821000 r-xp 00000000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f9516866000-7f95169fe000 r--p 00000000 00:25 156                        /usr/lib/locale/locale-archive
7f95169fe000-7f9516a03000 rw-p 00000000 00:00 0 
7f9516a1d000-7f9516a20000 rw-p 00000000 00:00 0 
7f9516a20000-7f9516a21000 r--p 00025000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f9516a21000-7f9516a22000 rw-p 00026000 00:25 31                         /lib/x86_64-linux-gnu/ld-2.23.so
7f9516a22000-7f9516a23000 rw-p 00000000 00:00 0 
7fff47216000-7fff47237000 rw-p 00000000 00:00 0                          [stack]
7fff4731e000-7fff47320000 r-xp 00000000 00:00 0                          [vdso]
7fff47320000-7fff47322000 r--p 00000000 00:00 0                          [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted (core dumped)
Bisecting: 41 revisions left to test after this (roughly 5 steps)
[0520f903823d04d139111c3d379806d6c4895586] 9
maxim.kuvyrkov@2492b3a161d0:~/tmp/test-bisect$ echo $?
0

^ permalink raw reply

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
From: Stefan Hajnoczi @ 2017-02-14  9:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Brandon Williams, git
In-Reply-To: <20170209052034.ksoupjcj4qs7x4hz@sigill.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 418 bytes --]

On Thu, Feb 09, 2017 at 12:20:34AM -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:
> > Jeff King <peff@peff.net> writes:
> (I _do_ think Stefan's proposed direction is worth it simply because the
> result is easier to read, but I agree the whole thing can be avoided by
> using pathspecs, as you've noted).

I won't be pushing this series further due to limited time.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply

* Re: [PATCH] mingw: make stderr unbuffered again
From: Johannes Schindelin @ 2017-02-14 14:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Sixt, Jeff Hostetler
In-Reply-To: <xmqqlgt9btrv.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Mon, 13 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > When removing the hack for isatty(), we actually removed more than just
> > an isatty() hack: we removed the hack where internal data structures of
> > the MSVC runtime are modified in order to redirect stdout/stderr.
> >
> > Instead of using that hack (that does not work with newer versions of
> > the runtime, anyway), we replaced it by reopening the respective file
> > descriptors.
> >
> > What we forgot was to mark stderr as unbuffered again.
> >
> > Reported by Hannes Sixt. Fixed with Jeff Hostetler's assistance.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > Published-As: https://github.com/dscho/git/releases/tag/mingw-unbuffered-stderr-v1
> > Fetch-It-Via: git fetch https://github.com/dscho/git mingw-unbuffered-stderr-v1
> 
> OK.  Should this go directly to 'master', as the isatty thing is
> already in?

From my point of view, it is not crucial. The next Git for Windows version
will have it, of course, and Hannes is always running with his set of
patches, he can easily cherry-pick this one, too.

Ciao,
Johannes

P.S.: Could you please cut the remainder of the mail that you are not
responding to? Thanks.

^ permalink raw reply

* Re: [BUG] Memory corruption crash with "git bisect start"
From: Christian Couder @ 2017-02-14 14:52 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: git
In-Reply-To: <A7AEC719-6E15-4622-8E21-3E11BAF74A3C@linaro.org>

On Tue, Feb 14, 2017 at 12:56 PM, Maxim Kuvyrkov
<maxim.kuvyrkov@linaro.org> wrote:
> I'm seeing the following memory corruption crash on a script-constructed repo when starting git bisect.  I'm seeing this crash both with system git of Ubuntu Xenial and with freshly-compiled git master from the official repo.
>
> The log of the crash is attached.

Yeah, thanks for the report.

I took a look and it looks like git bisect crashes when it runs git
show-branch and it can be reproduced with just `git show-branch
bottom` instead of `git bisect start bottom top`.

> It is possible that the bug is in Xenial glibc, in which case -- please let me know and I will take it up with the glibc developers.

I have Ubuntu GLIBC 2.23-0ubuntu5 and I get a crash too.

^ permalink raw reply

* [PATCH] show-branch: fix crash with long ref name
From: Christian Couder @ 2017-02-14 15:48 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Maxim Kuvyrkov, Christian Couder

This is a minimum fix for a crash with a long ref name.

Note that if the refname is too long (for example if you
use 100 instead of 50 in the test script) you could get
an error like:

fatal: cannot lock ref 'refs/heads/1_2_3_4_ ... _98_99_100_':
Unable to create '... /.git/refs/heads/1_2_3_4_ ... _98_99_100_.lock':
File name too long

when creating the ref instead of a crash when using
show-branch and that would be ok.

So a simpler fix could have been just something like:

-       char head[128];
+       char head[1024];

But if the filesystem ever allows filenames longer than 1024
characters then the crash could appear again.

Reported by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/show-branch.c          | 14 +++++++-------
 t/t3204-show-branch-refname.sh | 19 +++++++++++++++++++
 2 files changed, 26 insertions(+), 7 deletions(-)
 create mode 100755 t/t3204-show-branch-refname.sh

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403ab..3c0fe55eec 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -620,7 +620,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
 	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
-	char head[128];
+	char *head_cpy;
 	const char *head_p;
 	int head_len;
 	struct object_id head_oid;
@@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 				    head_oid.hash, NULL);
 	if (head_p) {
 		head_len = strlen(head_p);
-		memcpy(head, head_p, head_len + 1);
+		head_cpy = xstrdup(head_p);
 	}
 	else {
 		head_len = 0;
-		head[0] = 0;
+		head_cpy = xstrdup("");
 	}
 
 	if (with_current_branch && head_p) {
@@ -804,15 +804,15 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			/* We are only interested in adding the branch
 			 * HEAD points at.
 			 */
-			if (rev_is_head(head,
+			if (rev_is_head(head_cpy,
 					head_len,
 					ref_name[i],
 					head_oid.hash, NULL))
 				has_head++;
 		}
 		if (!has_head) {
-			int offset = starts_with(head, "refs/heads/") ? 11 : 0;
-			append_one_rev(head + offset);
+			int offset = starts_with(head_cpy, "refs/heads/") ? 11 : 0;
+			append_one_rev(head_cpy + offset);
 		}
 	}
 
@@ -865,7 +865,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	if (1 < num_rev || extra < 0) {
 		for (i = 0; i < num_rev; i++) {
 			int j;
-			int is_head = rev_is_head(head,
+			int is_head = rev_is_head(head_cpy,
 						  head_len,
 						  ref_name[i],
 						  head_oid.hash,
diff --git a/t/t3204-show-branch-refname.sh b/t/t3204-show-branch-refname.sh
new file mode 100755
index 0000000000..83b64e3032
--- /dev/null
+++ b/t/t3204-show-branch-refname.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='test show-branch with long refname'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+
+	test_commit first &&
+	long_refname=$(printf "%s_" $(seq 1 50)) &&
+	git checkout -b "$long_refname"
+'
+
+test_expect_success 'show-branch with long refname' '
+
+	git show-branch first
+'
+
+test_done
-- 
2.12.0.rc0


^ permalink raw reply related

* Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
From: Jonathan Tan @ 2017-02-14 16:53 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster
In-Reply-To: <20170214060729.v4r24y5tuaov3jrh@sigill.intra.peff.net>

On 02/13/2017 10:07 PM, Jeff King wrote:
> diff --git a/builtin/grep.c b/builtin/grep.c
> index e83b33bda..c4c632594 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  			break;
>  		}
>
> +		if (!use_index) {
> +			if (seen_dashdash)
> +				die(_("--no-index cannot be used with revs"));

There is a subsequent check that prints "--no-index or --untracked 
cannot be used with revs." - maybe we should just expand this part to 
incorporate that case. (That is, write `if (!use_index || untracked)` 
instead of `if (!use_index)`.) This also allows us to preserve the error 
message, which might be useful for someone using a translated version of 
Git.

> +			break;
> +		}
> +
>  		if (get_sha1_with_context(arg, 0, sha1, &oc)) {
>  			if (seen_dashdash)
>  				die(_("unable to resolve revision: %s"), arg);

^ permalink raw reply

* Re: [PATCH 0/7] grep rev/path parsing fixes
From: Jonathan Tan @ 2017-02-14 16:58 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster
In-Reply-To: <20170214060021.einv7372exbxa23z@sigill.intra.peff.net>

On 02/13/2017 10:00 PM, Jeff King wrote:
> I've fixed that, along with a few other bugs and cleanups. The complete
> series is below. Patch 2 is your (untouched) patch. My suggestions for
> your test are in patch 3, which can either remain on its own or be
> squashed in.
>
>   [1/7]: grep: move thread initialization a little lower
>   [2/7]: grep: do not unnecessarily query repo for "--"
>   [3/7]: t7810: make "--no-index --" test more robust
>   [4/7]: grep: re-order rev-parsing loop
>   [5/7]: grep: fix "--" rev/pathspec disambiguation
>   [6/7]: grep: avoid resolving revision names in --no-index case
>   [7/7]: grep: do not diagnose misspelt revs with --no-index

Thanks - these look good to me. I replied to 6/7 with a comment, but I 
also think that these are good as-is. Also, 3/7 can probably be squashed in.

^ permalink raw reply

* Re: [PATCH] show-branch: fix crash with long ref name
From: Jeff King @ 2017-02-14 17:25 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214154816.12625-1-chriscool@tuxfamily.org>

On Tue, Feb 14, 2017 at 04:48:16PM +0100, Christian Couder wrote:

> @@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
>  				    head_oid.hash, NULL);
>  	if (head_p) {
>  		head_len = strlen(head_p);
> -		memcpy(head, head_p, head_len + 1);
> +		head_cpy = xstrdup(head_p);
>  	}
>  	else {
>  		head_len = 0;
> -		head[0] = 0;
> +		head_cpy = xstrdup("");
>  	}

This fixes the problem, but I think we can simplify it quite a bit by
using resolve_refdup(). Here's the patch series I ended up with:

  [1/3]: show-branch: drop head_len variable
  [2/3]: show-branch: store resolved head in heap buffer
  [3/3]: show-branch: use skip_prefix to drop magic numbers

 builtin/show-branch.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

-Peff

^ permalink raw reply

* [PATCH 1/3] show-branch: drop head_len variable
From: Jeff King @ 2017-02-14 17:26 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>

We copy the result of resolving HEAD into a buffer and keep
track of its length.  But we never actually use the length
for anything besides the copy. Let's stop passing it around.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/show-branch.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403a..e4c488b8c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,7 +470,7 @@ static void snarf_refs(int head, int remotes)
 	}
 }
 
-static int rev_is_head(char *head, int headlen, char *name,
+static int rev_is_head(char *head, char *name,
 		       unsigned char *head_sha1, unsigned char *sha1)
 {
 	if ((!head[0]) ||
@@ -622,7 +622,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
 	char head[128];
 	const char *head_p;
-	int head_len;
 	struct object_id head_oid;
 	int merge_base = 0;
 	int independent = 0;
@@ -790,11 +789,10 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
 				    head_oid.hash, NULL);
 	if (head_p) {
-		head_len = strlen(head_p);
+		size_t head_len = strlen(head_p);
 		memcpy(head, head_p, head_len + 1);
 	}
 	else {
-		head_len = 0;
 		head[0] = 0;
 	}
 
@@ -805,7 +803,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			 * HEAD points at.
 			 */
 			if (rev_is_head(head,
-					head_len,
 					ref_name[i],
 					head_oid.hash, NULL))
 				has_head++;
@@ -866,7 +863,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		for (i = 0; i < num_rev; i++) {
 			int j;
 			int is_head = rev_is_head(head,
-						  head_len,
 						  ref_name[i],
 						  head_oid.hash,
 						  rev[i]->object.oid.hash);
-- 
2.12.0.rc1.479.g59880b11e


^ permalink raw reply related

* [PATCH 2/3] show-branch: store resolved head in heap buffer
From: Jeff King @ 2017-02-14 17:27 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>

We resolve HEAD and copy the result to a fixed-size buffer
with memcpy, never checking that it actually fits. This bug
dates back to 8098a178b (Add git-symbolic-ref, 2005-09-30).
Before that we used readlink(), which took a maximum buffer
size.

We can fix this by using resolve_refdup(), which duplicates
the buffer on the heap. That also lets us just check
for a NULL pointer to see if we have resolved HEAD, and
drop the extra head_p variable.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/show-branch.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e4c488b8c..404c4d09a 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -473,8 +473,7 @@ static void snarf_refs(int head, int remotes)
 static int rev_is_head(char *head, char *name,
 		       unsigned char *head_sha1, unsigned char *sha1)
 {
-	if ((!head[0]) ||
-	    (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
+	if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
 		return 0;
 	if (starts_with(head, "refs/heads/"))
 		head += 11;
@@ -620,8 +619,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
 	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
-	char head[128];
-	const char *head_p;
+	char *head;
 	struct object_id head_oid;
 	int merge_base = 0;
 	int independent = 0;
@@ -786,17 +784,10 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			snarf_refs(all_heads, all_remotes);
 	}
 
-	head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
-				    head_oid.hash, NULL);
-	if (head_p) {
-		size_t head_len = strlen(head_p);
-		memcpy(head, head_p, head_len + 1);
-	}
-	else {
-		head[0] = 0;
-	}
+	head = resolve_refdup("HEAD", RESOLVE_REF_READING,
+			      head_oid.hash, NULL);
 
-	if (with_current_branch && head_p) {
+	if (with_current_branch && head) {
 		int has_head = 0;
 		for (i = 0; !has_head && i < ref_name_cnt; i++) {
 			/* We are only interested in adding the branch
-- 
2.12.0.rc1.479.g59880b11e


^ permalink raw reply related

* [PATCH 3/3] show-branch: use skip_prefix to drop magic numbers
From: Jeff King @ 2017-02-14 17:28 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>

We make several starts_with() calls, only to advance
pointers. This is exactly what skip_prefix() is for, which
lets us avoid manually-counted magic numbers.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/show-branch.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 404c4d09a..c03d3ec7c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,17 +470,14 @@ static void snarf_refs(int head, int remotes)
 	}
 }
 
-static int rev_is_head(char *head, char *name,
+static int rev_is_head(const char *head, const char *name,
 		       unsigned char *head_sha1, unsigned char *sha1)
 {
 	if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
 		return 0;
-	if (starts_with(head, "refs/heads/"))
-		head += 11;
-	if (starts_with(name, "refs/heads/"))
-		name += 11;
-	else if (starts_with(name, "heads/"))
-		name += 6;
+	skip_prefix(head, "refs/heads/", &head);
+	if (!skip_prefix(name, "refs/heads/", &name))
+		skip_prefix(name, "heads/", &name);
 	return !strcmp(head, name);
 }
 
@@ -799,8 +796,9 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 				has_head++;
 		}
 		if (!has_head) {
-			int offset = starts_with(head, "refs/heads/") ? 11 : 0;
-			append_one_rev(head + offset);
+			const char *name = head;
+			skip_prefix(name, "refs/heads/", &name);
+			append_one_rev(name);
 		}
 	}
 
-- 
2.12.0.rc1.479.g59880b11e

^ permalink raw reply related

* Re: [PATCH 06/11] refs-internal.h: correct is_per_worktree_ref()
From: Stefan Beller @ 2017-02-14 17:40 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git@vger.kernel.org, Junio C Hamano, Michael Haggerty,
	Johannes Schindelin, David Turner
In-Reply-To: <CACsJy8D=qFQ2_62e4oO1pSBz4JfZV4Zcoai=Ghjw5DTaNxrwog@mail.gmail.com>

On Tue, Feb 14, 2017 at 1:40 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Tue, Feb 14, 2017 at 5:37 AM, Stefan Beller <sbeller@google.com> wrote:
>> On Mon, Feb 13, 2017 at 7:20 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>>> All refs outside refs/ directory is per-worktree, not just HEAD.
>>>
>>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>> ---
>>>  refs/refs-internal.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>>> index f4aed49f5..69d02b6ba 100644
>>> --- a/refs/refs-internal.h
>>> +++ b/refs/refs-internal.h
>>> @@ -653,7 +653,7 @@ const char *resolve_ref_recursively(struct ref_store *refs,
>>>
>>>  static inline int is_per_worktree_ref(const char *refname)
>>>  {
>>> -       return !strcmp(refname, "HEAD") ||
>>> +       return !starts_with(refname, "refs/") ||
>>>                 starts_with(refname, "refs/bisect/");
>>
>> you're loosing HEAD here? (assuming we get HEAD in
>> short form here, as well as long form refs/HEAD)
>
> I don't understand. if refname is HEAD then both !strcmp(...) and
> !starts_with(refname, "refs/") return 1. If it's refs/HEAD, both
> return 0. In other words, there's no functional changes?

eh, my bad. You're right.

^ permalink raw reply

* Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
From: Jeff King @ 2017-02-14 18:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster
In-Reply-To: <eef97cc4-d616-b298-bc99-b2772b757190@google.com>

On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote:

> On 02/13/2017 10:07 PM, Jeff King wrote:
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index e83b33bda..c4c632594 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >  			break;
> >  		}
> > 
> > +		if (!use_index) {
> > +			if (seen_dashdash)
> > +				die(_("--no-index cannot be used with revs"));
> 
> There is a subsequent check that prints "--no-index or --untracked cannot be
> used with revs." - maybe we should just expand this part to incorporate that
> case. (That is, write `if (!use_index || untracked)` instead of `if
> (!use_index)`.) This also allows us to preserve the error message, which
> might be useful for someone using a translated version of Git.

I wasn't sure if we wanted to treat "untracked" in the same way.
Certainly we can catch the error here for the seen_dashdash case, but is
it also the case that:

  echo content >master
  git grep --untracked pattern master

should treat "master" as a path?

-Peff

^ permalink raw reply

* Re: [PATCH v2] clean: use warning_errno() when appropriate
From: Junio C Hamano @ 2017-02-14 18:13 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
In-Reply-To: <20170214095449.15585-1-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> All these warning() calls are preceded by a system call. Report the
> actual error to help the user understand why we fail to remove
> something.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  v2 dances with errno

Thanks.

>
>  builtin/clean.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index d6bc3aaae..3569736f6 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -154,6 +154,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
>  	struct strbuf quoted = STRBUF_INIT;
>  	struct dirent *e;
>  	int res = 0, ret = 0, gone = 1, original_len = path->len, len;
> +	int saved_errno;
>  	struct string_list dels = STRING_LIST_INIT_DUP;
>  
>  	*dir_gone = 1;
> @@ -173,9 +174,11 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
>  	if (!dir) {
>  		/* an empty dir could be removed even if it is unreadble */
>  		res = dry_run ? 0 : rmdir(path->buf);
> +		saved_errno = errno;
>  		if (res) {
>  			quote_path_relative(path->buf, prefix, &quoted);

I think this part should be more like

		res = ... : rmdir(...);
		if (res) {
			int saved_errno = errno;
			... do other things that can touch errno ...
			errno = saved_errno;
			... now we know what the original error was ...

The reason to store the errno in saved_errno here is not because we
want to help code after "if (res) {...}", but the patch sent as-is
gives that impression and is confusing to the readers.  

Perhaps all hunks of this patch share the same issue?  I could
locally amend, of course, but I'd like to double check before doing
so myself---perhaps you did it this way for a good reason that I am
missing?

^ permalink raw reply

* git-p4.py caching
From: Martin-Louis Bright @ 2017-02-14 18:16 UTC (permalink / raw)
  To: git

hi!

I am using git-p4.py to migrate a lot of medium and large Perforce
depots into git. I almost exclusively go one way: from Perforce to
git. I also frequently re-clone/re-migrate as the Perforce migration
client spec is refined.

For this, I have added rudimentary caching in git-p4.py so that
Perforce requests are not repeated over the network.

I have it working and tested on a ~150MB repo and the migration time was halved.

Is this something that would be of interest to the larger community?

--Martin

^ permalink raw reply

* Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
From: Jonathan Tan @ 2017-02-14 18:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster
In-Reply-To: <20170214180453.tpze2hdv3eytxfju@sigill.intra.peff.net>

On 02/14/2017 10:04 AM, Jeff King wrote:
> On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote:
>
>> On 02/13/2017 10:07 PM, Jeff King wrote:
>>> diff --git a/builtin/grep.c b/builtin/grep.c
>>> index e83b33bda..c4c632594 100644
>>> --- a/builtin/grep.c
>>> +++ b/builtin/grep.c
>>> @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>>  			break;
>>>  		}
>>>
>>> +		if (!use_index) {
>>> +			if (seen_dashdash)
>>> +				die(_("--no-index cannot be used with revs"));
>>
>> There is a subsequent check that prints "--no-index or --untracked cannot be
>> used with revs." - maybe we should just expand this part to incorporate that
>> case. (That is, write `if (!use_index || untracked)` instead of `if
>> (!use_index)`.) This also allows us to preserve the error message, which
>> might be useful for someone using a translated version of Git.
>
> I wasn't sure if we wanted to treat "untracked" in the same way.
> Certainly we can catch the error here for the seen_dashdash case, but is
> it also the case that:
>
>   echo content >master
>   git grep --untracked pattern master
>
> should treat "master" as a path?
>
> -Peff

It is already the case that it cannot be treated as a rev:

   $ git grep --untracked pattern master
   fatal: --no-index or --untracked cannot be used with revs.

So I think it would be better if it was treated as a path, for 
consistency with --no-index. I'm OK either way, though.

^ permalink raw reply

* Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API
From: Junio C Hamano @ 2017-02-14 18:24 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Stefan Beller, git@vger.kernel.org, Michael Haggerty,
	Johannes Schindelin, David Turner
In-Reply-To: <CACsJy8ChQqUd4poeeKQruQSwdys=ydzxGDC8fU6ZgfrFEEm7NQ@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> Direct call sites are just middle men though, e.g.
> for_each_ref_submodule(). I'll attempt to clean this up at some point
> in future (probably at the same time I attempt to kill *_submodule ref
> api). But I think for now I'll just put a TODO or FIXME comment here.

So we'd have get_*_ref_store() for various cases and then will have
a unifying for_each_ref_in_refstore() that takes a ref-store, which
supersedes all the ad-hoc iterators like for_each_ref_submodule(),
for_each_namespaced_ref(), etc?

That's a very sensible longer-term goal, methinks.

I am wondering what we should do to for_each_{tag,branch,...}_ref().  
Would that become

	ref_store = get_ref_main_store();
	tag_ref_store = filter_ref_store(ref_store, "refs/tags/");
	for_each_ref_in_refstore(tag_ref_store, ...);

or do you plan to have some other pattern?

^ permalink raw reply

* Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API
From: Stefan Beller @ 2017-02-14 18:43 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git@vger.kernel.org, Junio C Hamano, Michael Haggerty,
	Johannes Schindelin, David Turner
In-Reply-To: <CACsJy8ChQqUd4poeeKQruQSwdys=ydzxGDC8fU6ZgfrFEEm7NQ@mail.gmail.com>

On Tue, Feb 14, 2017 at 2:04 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Tue, Feb 14, 2017 at 6:55 AM, Stefan Beller <sbeller@google.com> wrote:
>>>
>>> +/*
>>> + * Return the ref_store instance for the specified submodule. For the
>>> + * main repository, use submodule==NULL; such a call cannot fail.
>>
>> So now we have both a get_main as well as a get_submodule function,
>> but the submodule function can return the main as well?
>>
>> I'd rather see this as a BUG; or asking another way:
>> What is the difference between get_submodule_ref_store(NULL)
>> and get_main_ref_store() ?
>
> Technical debts :) In order to do that, we need to make sure
> _submodule() never takes NULL as main repo. But current code does. On
> example is revision.c where submodule==NULL is the main repo. In the
> end, I agree that get_submodule_ref_store(NULL) should be a bug.
>
>> As you went through all call sites (by renaming the function), we'd
>> be able to tell that there is no caller with NULL, or is it?
>
> Direct call sites are just middle men though, e.g.
> for_each_ref_submodule(). I'll attempt to clean this up at some point
> in future (probably at the same time I attempt to kill *_submodule ref
> api). But I think for now I'll just put a TODO or FIXME comment here.

ok, fine with me.

My line of thinking when originally responding was to put the FIXME
comment at the caller site, and there we'd differentiate between the
two cases and call the appropriate function.

> --
> Duy

^ permalink raw reply

* Re: [PATCH] mingw: make stderr unbuffered again
From: Johannes Sixt @ 2017-02-14 18:45 UTC (permalink / raw)
  To: Johannes Schindelin, Junio C Hamano; +Cc: git, Jeff Hostetler
In-Reply-To: <alpine.DEB.2.20.1702141545380.3496@virtualbox>

Am 14.02.2017 um 15:47 schrieb Johannes Schindelin:
> On Mon, 13 Feb 2017, Junio C Hamano wrote:
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>>> When removing the hack for isatty(), we actually removed more than just
>>> an isatty() hack: we removed the hack where internal data structures of
>>> the MSVC runtime are modified in order to redirect stdout/stderr.
>>>
>>> Instead of using that hack (that does not work with newer versions of
>>> the runtime, anyway), we replaced it by reopening the respective file
>>> descriptors.
>>>
>>> What we forgot was to mark stderr as unbuffered again.

I do not see how the earlier patch turned stderr from unbuffered to 
buffered, as it did not add or remove any setvbuf() call. Can you explain?

>> OK.  Should this go directly to 'master', as the isatty thing is
>> already in?
>
>>From my point of view, it is not crucial. The next Git for Windows version
> will have it, of course, and Hannes is always running with his set of
> patches, he can easily cherry-pick this one, too.

The patch fixes the problem for me here.

I am a bit hesitant to call it "not crucial": The symptom I observed is 
certainly not a show stopper; but who knows where else it is important 
that stderr goes to the terminal earlier than other output. As it is a 
new regression, it should be included in the next release.

Thanks,
-- Hannes


^ permalink raw reply

* Re: [PATCH 1/7] grep: move thread initialization a little lower
From: Brandon Williams @ 2017-02-14 18:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git, gitster
In-Reply-To: <20170214060237.gvzzbmjzerlqnzjv@sigill.intra.peff.net>

On 02/14, Jeff King wrote:
> Originally, we set up the threads for grep before parsing
> the non-option arguments. In 53b8d931b (grep: disable
> threading in non-worktree case, 2011-12-12), the thread code
> got bumped lower in the function because it now needed to
> know whether we got any revision arguments.
> 
> That put a big block of code in between the parsing of revs
> and the parsing of pathspecs, both of which share some loop
> variables. That makes it harder to read the code than the
> original, where the shared loops were right next to each
> other.
> 
> Let's bump the thread initialization until after all of the
> parsing is done.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I double-checked to make sure no other code was relying on
> the thread setup having happened. I think we could actually
> bump it quite a bit lower (to right before we actually start
> grepping), but I doubt it matters much in practice.

Looks good.  And yes I don't believe anything needs the thread
initialization to happen earlier.

-- 
Brandon Williams

^ permalink raw reply

* Re: [PATCH 4/7] grep: re-order rev-parsing loop
From: Brandon Williams @ 2017-02-14 18:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git, gitster
In-Reply-To: <20170214060417.yq27zdcgmrckjmua@sigill.intra.peff.net>

On 02/14, Jeff King wrote:
> -		/* Is it a rev? */
> -		if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
> -			struct object *object = parse_object_or_die(sha1, arg);
> -			if (!seen_dashdash)
> -				verify_non_filename(prefix, arg);
> -			add_object_array_with_path(object, arg, &list, oc.mode, oc.path);
> -			continue;
> -		}
> -		break;
> +
> +		/* Stop at the first non-rev */
> +		if (get_sha1_with_context(arg, 0, sha1, &oc))
> +			break;
> +
> +		object = parse_object_or_die(sha1, arg);
> +		if (!seen_dashdash)
> +			verify_non_filename(prefix, arg);
> +		add_object_array_with_path(object, arg, &list, oc.mode, oc.path);

This is much more readable!

-- 
Brandon Williams

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox