* Re: [PATCH 06/11] refs-internal.h: correct is_per_worktree_ref()
From: Duy Nguyen @ 2017-02-14 9:40 UTC (permalink / raw)
To: Stefan Beller
Cc: git@vger.kernel.org, Junio C Hamano, Michael Haggerty,
Johannes Schindelin, David Turner
In-Reply-To: <CAGZ79kYkc-_=RiK1uJ+ndhQu=B8u=UDVusXZu-dYe7KnGNye3Q@mail.gmail.com>
On Tue, Feb 14, 2017 at 5:37 AM, Stefan Beller <sbeller@google.com> wrote:
> On Mon, Feb 13, 2017 at 7:20 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>> All refs outside refs/ directory is per-worktree, not just HEAD.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>> refs/refs-internal.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>> index f4aed49f5..69d02b6ba 100644
>> --- a/refs/refs-internal.h
>> +++ b/refs/refs-internal.h
>> @@ -653,7 +653,7 @@ const char *resolve_ref_recursively(struct ref_store *refs,
>>
>> static inline int is_per_worktree_ref(const char *refname)
>> {
>> - return !strcmp(refname, "HEAD") ||
>> + return !starts_with(refname, "refs/") ||
>> starts_with(refname, "refs/bisect/");
>
> you're loosing HEAD here? (assuming we get HEAD in
> short form here, as well as long form refs/HEAD)
I don't understand. if refname is HEAD then both !strcmp(...) and
!starts_with(refname, "refs/") return 1. If it's refs/HEAD, both
return 0. In other words, there's no functional changes?
--
Duy
^ permalink raw reply
* Re: [PATCH 04/11] files-backend: replace *git_path*() with files_path()
From: Duy Nguyen @ 2017-02-14 9:43 UTC (permalink / raw)
To: Ramsay Jones
Cc: Git Mailing List, Junio C Hamano, Michael Haggerty,
Johannes Schindelin, Stefan Beller, David Turner
In-Reply-To: <8f6e09aa-f578-6faf-6acc-e35be71ee990@ramsayjones.plus.com>
On Tue, Feb 14, 2017 at 3:58 AM, Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
>> -void files_path(struct files_ref_store *refs, struct strbuf *sb,
>> - const char *fmt, ...) __attribute__((format (printf, 3, 4)));
>
> You only added this in the last commit, so maybe mark it static in
> the previous patch! Also, just in case you were wondering, the 'Why?'
> of the previous email was, "Why do you need this forward declaration?"
> (hint: you don't ;-)
Yeah, my compiler considers unused static functions an error. But I
guess I should adjust those million compiler flags instead of doing it
like this. I'll need to check if it's an error with DEVELOPER_CFLAGS
because people may bisect with that.
--
Duy
^ permalink raw reply
* [PATCH v2] clean: use warning_errno() when appropriate
From: Nguyễn Thái Ngọc Duy @ 2017-02-14 9:54 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jeff King, Nguyễn Thái Ngọc Duy
In-Reply-To: <20170213092702.10462-1-pclouds@gmail.com>
All these warning() calls are preceded by a system call. Report the
actual error to help the user understand why we fail to remove
something.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
v2 dances with errno
builtin/clean.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/builtin/clean.c b/builtin/clean.c
index d6bc3aaae..3569736f6 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -154,6 +154,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
struct strbuf quoted = STRBUF_INIT;
struct dirent *e;
int res = 0, ret = 0, gone = 1, original_len = path->len, len;
+ int saved_errno;
struct string_list dels = STRING_LIST_INIT_DUP;
*dir_gone = 1;
@@ -173,9 +174,11 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
if (!dir) {
/* an empty dir could be removed even if it is unreadble */
res = dry_run ? 0 : rmdir(path->buf);
+ saved_errno = errno;
if (res) {
quote_path_relative(path->buf, prefix, "ed);
- warning(_(msg_warn_remove_failed), quoted.buf);
+ errno = saved_errno;
+ warning_errno(_(msg_warn_remove_failed), quoted.buf);
*dir_gone = 0;
}
return res;
@@ -204,12 +207,14 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
continue;
} else {
res = dry_run ? 0 : unlink(path->buf);
+ saved_errno = errno;
if (!res) {
quote_path_relative(path->buf, prefix, "ed);
string_list_append(&dels, quoted.buf);
} else {
quote_path_relative(path->buf, prefix, "ed);
- warning(_(msg_warn_remove_failed), quoted.buf);
+ errno = saved_errno;
+ warning_errno(_(msg_warn_remove_failed), quoted.buf);
*dir_gone = 0;
ret = 1;
}
@@ -227,11 +232,13 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
if (*dir_gone) {
res = dry_run ? 0 : rmdir(path->buf);
+ saved_errno = errno;
if (!res)
*dir_gone = 1;
else {
quote_path_relative(path->buf, prefix, "ed);
- warning(_(msg_warn_remove_failed), quoted.buf);
+ errno = saved_errno;
+ warning_errno(_(msg_warn_remove_failed), quoted.buf);
*dir_gone = 0;
ret = 1;
}
@@ -853,7 +860,7 @@ static void interactive_main_loop(void)
int cmd_clean(int argc, const char **argv, const char *prefix)
{
- int i, res;
+ int i, res, saved_errno;
int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -980,9 +987,11 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
}
} else {
res = dry_run ? 0 : unlink(abs_path.buf);
+ saved_errno = errno;
if (res) {
qname = quote_path_relative(item->string, NULL, &buf);
- warning(_(msg_warn_remove_failed), qname);
+ errno = saved_errno;
+ warning_errno(_(msg_warn_remove_failed), qname);
errors++;
} else if (!quiet) {
qname = quote_path_relative(item->string, NULL, &buf);
--
2.11.0.157.gd943d85
^ permalink raw reply related
* Re: [PATCH 11/11] refs: split and make get_*_ref_store() public API
From: Duy Nguyen @ 2017-02-14 10:04 UTC (permalink / raw)
To: Stefan Beller
Cc: git@vger.kernel.org, Junio C Hamano, Michael Haggerty,
Johannes Schindelin, David Turner
In-Reply-To: <CAGZ79kZC6TntQrW7MF6-h5z5En-u6rwNX=zuaHRNDpbO80ALHA@mail.gmail.com>
On Tue, Feb 14, 2017 at 6:55 AM, Stefan Beller <sbeller@google.com> wrote:
>>
>> +/*
>> + * Return the ref_store instance for the specified submodule. For the
>> + * main repository, use submodule==NULL; such a call cannot fail.
>
> So now we have both a get_main as well as a get_submodule function,
> but the submodule function can return the main as well?
>
> I'd rather see this as a BUG; or asking another way:
> What is the difference between get_submodule_ref_store(NULL)
> and get_main_ref_store() ?
Technical debts :) In order to do that, we need to make sure
_submodule() never takes NULL as main repo. But current code does. On
example is revision.c where submodule==NULL is the main repo. In the
end, I agree that get_submodule_ref_store(NULL) should be a bug.
> As you went through all call sites (by renaming the function), we'd
> be able to tell that there is no caller with NULL, or is it?
Direct call sites are just middle men though, e.g.
for_each_ref_submodule(). I'll attempt to clean this up at some point
in future (probably at the same time I attempt to kill *_submodule ref
api). But I think for now I'll just put a TODO or FIXME comment here.
--
Duy
^ permalink raw reply
* [PATCH 5/5] name-hash: remember previous dir_entry during lazy_init_name_hash
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>
From: Jeff Hostetler <jeffhost@microsoft.com>
Teach hash_dir_entry() to remember the previously found dir_entry during
lazy_init_name_hash() iteration. This is a performance optimization.
Since items in the index array are sorted by full pathname, adjacent
items are likely to be in the same directory. This can save memihash()
computations and hash map lookups.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
name-hash.c | 50 ++++++++++++++++++++++++++++++++++++--------------
1 file changed, 36 insertions(+), 14 deletions(-)
diff --git a/name-hash.c b/name-hash.c
index 8f8336cc868..f95054f44cb 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -39,7 +39,8 @@ static struct dir_entry *find_dir_entry(struct index_state *istate,
}
static struct dir_entry *hash_dir_entry(struct index_state *istate,
- struct cache_entry *ce, int namelen)
+ struct cache_entry *ce, int namelen,
+ struct dir_entry **previous_dir)
{
/*
* Throw each directory component in the hash for quick lookup
@@ -63,11 +64,24 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
namelen--;
/* lookup existing entry for that directory */
- if (ce->precomputed_hash.initialized && orig_namelen == ce_namelen(ce))
- hash = ce->precomputed_hash.dir;
- else
- hash = memihash(ce->name, namelen);
- dir = find_dir_entry_1(istate, ce->name, namelen, hash);
+ if (previous_dir && *previous_dir
+ && namelen == (*previous_dir)->namelen
+ && memcmp(ce->name, (*previous_dir)->name, namelen) == 0) {
+ /*
+ * When our caller is sequentially iterating through the index,
+ * items in the same directory will be sequential, and therefore
+ * refer to the same dir_entry.
+ */
+ dir = *previous_dir;
+ } else {
+ if (ce->precomputed_hash.initialized &&
+ orig_namelen == ce_namelen(ce))
+ hash = ce->precomputed_hash.dir;
+ else
+ hash = memihash(ce->name, namelen);
+ dir = find_dir_entry_1(istate, ce->name, namelen, hash);
+ }
+
if (!dir) {
/* not found, create it and add to hash table */
FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
@@ -76,15 +90,21 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
hashmap_add(&istate->dir_hash, dir);
/* recursively add missing parent directories */
- dir->parent = hash_dir_entry(istate, ce, namelen);
+ dir->parent = hash_dir_entry(istate, ce, namelen, NULL);
}
+
+ if (previous_dir)
+ *previous_dir = dir;
+
return dir;
}
-static void add_dir_entry(struct index_state *istate, struct cache_entry *ce)
+static void add_dir_entry(struct index_state *istate, struct cache_entry *ce,
+ struct dir_entry **previous_dir)
{
/* Add reference to the directory entry (and parents if 0). */
- struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce));
+ struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce),
+ previous_dir);
while (dir && !(dir->nr++))
dir = dir->parent;
}
@@ -95,7 +115,7 @@ static void remove_dir_entry(struct index_state *istate, struct cache_entry *ce)
* Release reference to the directory entry. If 0, remove and continue
* with parent directory.
*/
- struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce));
+ struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce), NULL);
while (dir && !(--dir->nr)) {
struct dir_entry *parent = dir->parent;
hashmap_remove(&istate->dir_hash, dir, NULL);
@@ -104,7 +124,8 @@ static void remove_dir_entry(struct index_state *istate, struct cache_entry *ce)
}
}
-static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
+static void hash_index_entry(struct index_state *istate, struct cache_entry *ce,
+ struct dir_entry **previous_dir)
{
unsigned int h;
@@ -121,7 +142,7 @@ static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
hashmap_add(&istate->name_hash, ce);
if (ignore_case)
- add_dir_entry(istate, ce);
+ add_dir_entry(istate, ce, previous_dir);
}
static int cache_entry_cmp(const struct cache_entry *ce1,
@@ -137,6 +158,7 @@ static int cache_entry_cmp(const struct cache_entry *ce1,
static void lazy_init_name_hash(struct index_state *istate)
{
+ struct dir_entry *previous_dir = NULL;
int nr;
if (istate->name_hash_initialized)
@@ -146,14 +168,14 @@ static void lazy_init_name_hash(struct index_state *istate)
hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp,
istate->cache_nr);
for (nr = 0; nr < istate->cache_nr; nr++)
- hash_index_entry(istate, istate->cache[nr]);
+ hash_index_entry(istate, istate->cache[nr], &previous_dir);
istate->name_hash_initialized = 1;
}
void add_name_hash(struct index_state *istate, struct cache_entry *ce)
{
if (istate->name_hash_initialized)
- hash_index_entry(istate, ce);
+ hash_index_entry(istate, ce, NULL);
}
void remove_name_hash(struct index_state *istate, struct cache_entry *ce)
--
2.11.1.windows.1
^ permalink raw reply related
* [PATCH 3/5] name-hash: precompute hash values during preload-index
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>
From: Jeff Hostetler <jeffhost@microsoft.com>
Precompute the istate.name_hash and istate.dir_hash values
for each cache-entry during the preload-index phase.
Move the expensive memihash() calculations from lazy_init_name_hash()
to the multi-threaded preload-index phase.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
cache.h | 6 ++++++
name-hash.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
preload-index.c | 2 ++
read-cache.c | 3 +++
4 files changed, 66 insertions(+), 2 deletions(-)
diff --git a/cache.h b/cache.h
index 61fc86e6d71..56f7c97cdbe 100644
--- a/cache.h
+++ b/cache.h
@@ -173,6 +173,10 @@ struct cache_entry {
unsigned int ce_flags;
unsigned int ce_namelen;
unsigned int index; /* for link extension */
+ struct {
+ unsigned initialized:1, root_entry:1;
+ unsigned int name, dir;
+ } precomputed_hash;
struct object_id oid;
char name[FLEX_ARRAY]; /* more */
};
@@ -229,6 +233,8 @@ struct cache_entry {
#error "CE_EXTENDED_FLAGS out of range"
#endif
+void precompute_istate_hashes(struct cache_entry *ce);
+
/* Forward structure decls */
struct pathspec;
struct child_process;
diff --git a/name-hash.c b/name-hash.c
index ad0bc0cef73..49eb84846df 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -50,6 +50,10 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
*/
struct dir_entry *dir;
unsigned int hash;
+ int orig_namelen = namelen;
+
+ if (ce->precomputed_hash.initialized && ce->precomputed_hash.root_entry)
+ return NULL; /* item does not have a parent directory */
/* get length of parent directory */
while (namelen > 0 && !is_dir_sep(ce->name[namelen - 1]))
@@ -59,7 +63,10 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
namelen--;
/* lookup existing entry for that directory */
- hash = memihash(ce->name, namelen);
+ if (ce->precomputed_hash.initialized && orig_namelen == ce_namelen(ce))
+ hash = ce->precomputed_hash.dir;
+ else
+ hash = memihash(ce->name, namelen);
dir = find_dir_entry_1(istate, ce->name, namelen, hash);
if (!dir) {
/* not found, create it and add to hash table */
@@ -99,10 +106,18 @@ static void remove_dir_entry(struct index_state *istate, struct cache_entry *ce)
static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
{
+ unsigned int h;
+
if (ce->ce_flags & CE_HASHED)
return;
ce->ce_flags |= CE_HASHED;
- hashmap_entry_init(ce, memihash(ce->name, ce_namelen(ce)));
+
+ if (ce->precomputed_hash.initialized)
+ h = ce->precomputed_hash.name;
+ else
+ h = memihash(ce->name, ce_namelen(ce));
+
+ hashmap_entry_init(ce, h);
hashmap_add(&istate->name_hash, ce);
if (ignore_case)
@@ -244,3 +259,41 @@ void free_name_hash(struct index_state *istate)
hashmap_free(&istate->name_hash, 0);
hashmap_free(&istate->dir_hash, 1);
}
+
+/*
+ * Precompute the hash values for this cache_entry
+ * for use in the istate.name_hash and istate.dir_hash.
+ *
+ * If the item is in the root directory, just compute the hash value (for
+ * istate.name_hash) on the full path.
+ *
+ * If the item is in a subdirectory, first compute the hash value for the
+ * immediate parent directory (for istate.dir_hash) and then the hash value for
+ * the full path by continuing the computation.
+ *
+ * Note that these hashes will be used by wt_status_collect_untracked() as it
+ * scans the worktree and maps observed paths back to the index (optionally
+ * ignoring case). Technically, we only *NEED* to precompute this for
+ * non-skip-worktree items (since status should not observe skipped items), but
+ * because lazy_init_name_hash() hashes everything, we force it here.
+ */
+void precompute_istate_hashes(struct cache_entry *ce)
+{
+ int namelen = ce_namelen(ce);
+
+ while (namelen > 0 && !is_dir_sep(ce->name[namelen - 1]))
+ namelen--;
+
+ if (namelen <= 0) {
+ ce->precomputed_hash.name = memihash(ce->name, ce_namelen(ce));
+ ce->precomputed_hash.root_entry = 1;
+ } else {
+ namelen--;
+ ce->precomputed_hash.dir = memihash(ce->name, namelen);
+ ce->precomputed_hash.name = memihash_continue(
+ ce->precomputed_hash.dir, ce->name + namelen,
+ ce_namelen(ce) - namelen);
+ ce->precomputed_hash.root_entry = 0;
+ }
+ ce->precomputed_hash.initialized = 1;
+}
diff --git a/preload-index.c b/preload-index.c
index c1fe3a3ef9c..602737f9d0f 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -47,6 +47,8 @@ static void *preload_thread(void *_data)
struct cache_entry *ce = *cep++;
struct stat st;
+ precompute_istate_hashes(ce);
+
if (ce_stage(ce))
continue;
if (S_ISGITLINK(ce->ce_mode))
diff --git a/read-cache.c b/read-cache.c
index 9054369dd0c..1a25c345441 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -73,6 +73,7 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
copy_cache_entry(new, old);
new->ce_flags &= ~CE_HASHED;
new->ce_namelen = namelen;
+ new->precomputed_hash.initialized = 0;
new->index = 0;
memcpy(new->name, new_name, namelen + 1);
@@ -614,6 +615,7 @@ static struct cache_entry *create_alias_ce(struct index_state *istate,
new = xcalloc(1, cache_entry_size(len));
memcpy(new->name, alias->name, len);
copy_cache_entry(new, ce);
+ new->precomputed_hash.initialized = 0;
save_or_free_index_entry(istate, ce);
return new;
}
@@ -1446,6 +1448,7 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
ce->ce_stat_data.sd_size = get_be32(&ondisk->size);
ce->ce_flags = flags & ~CE_NAMEMASK;
ce->ce_namelen = len;
+ ce->precomputed_hash.initialized = 0;
ce->index = 0;
hashcpy(ce->oid.hash, ondisk->sha1);
memcpy(ce->name, name, len);
--
2.11.1.windows.1
^ permalink raw reply related
* [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area
From: Johannes Schindelin @ 2017-02-14 11:31 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jeff Hostetler
On Windows, calls to memihash() and maintaining the istate.name_hash and
istate.dir_hash HashMaps take significant time on very large
repositories. This series of changes reduces the overall time taken for
various operations by reducing the number calls to memihash(), moving
some of them into multi-threaded code, and etc.
Note: one commenter in https://github.com/git-for-windows/git/pull/964
pointed out that memihash() only handles ASCII correctly. That is true.
And fixing this is outside the purview of this patch series.
[jes: renamed and reformatted a few places, and replaced global
constants by 1-bit fields, in the hopes to make the contribution a
smoother ride.]
Jeff Hostetler (5):
name-hash: eliminate duplicate memihash call
hashmap: allow memihash computation to be continued
name-hash: precompute hash values during preload-index
name-hash: specify initial size for istate.dir_hash table
name-hash: remember previous dir_entry during lazy_init_name_hash
cache.h | 6 +++
hashmap.c | 14 +++++++
hashmap.h | 2 +
name-hash.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++--------
preload-index.c | 2 +
read-cache.c | 3 ++
6 files changed, 127 insertions(+), 16 deletions(-)
base-commit: 5588dbffbd61e4906e453808c6ad32f792fea521
Published-As: https://github.com/dscho/git/releases/tag/memihash-perf-v1
Fetch-It-Via: git fetch https://github.com/dscho/git memihash-perf-v1
--
2.11.1.windows.1
^ permalink raw reply
* [PATCH 1/5] name-hash: eliminate duplicate memihash call
From: Johannes Schindelin @ 2017-02-14 11:31 UTC (permalink / raw)
To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>
From: Jeff Hostetler <jeffhost@microsoft.com>
The existing code called memihash() to pass to the find_dir_entry()
function, and if not found, called memihash() again to pass to the
hashmap_add() function. Remove that duplicate memihash() call in
hash_dir_entry().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
name-hash.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/name-hash.c b/name-hash.c
index 6d9f23e9325..ad0bc0cef73 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -23,15 +23,21 @@ static int dir_entry_cmp(const struct dir_entry *e1,
name ? name : e2->name, e1->namelen);
}
-static struct dir_entry *find_dir_entry(struct index_state *istate,
- const char *name, unsigned int namelen)
+static struct dir_entry *find_dir_entry_1(struct index_state *istate,
+ const char *name, unsigned int namelen, unsigned int hash)
{
struct dir_entry key;
- hashmap_entry_init(&key, memihash(name, namelen));
+ hashmap_entry_init(&key, hash);
key.namelen = namelen;
return hashmap_get(&istate->dir_hash, &key, name);
}
+static struct dir_entry *find_dir_entry(struct index_state *istate,
+ const char *name, unsigned int namelen)
+{
+ return find_dir_entry_1(istate, name, namelen, memihash(name, namelen));
+}
+
static struct dir_entry *hash_dir_entry(struct index_state *istate,
struct cache_entry *ce, int namelen)
{
@@ -43,6 +49,7 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
* in index_state.name_hash (as ordinary cache_entries).
*/
struct dir_entry *dir;
+ unsigned int hash;
/* get length of parent directory */
while (namelen > 0 && !is_dir_sep(ce->name[namelen - 1]))
@@ -52,11 +59,12 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
namelen--;
/* lookup existing entry for that directory */
- dir = find_dir_entry(istate, ce->name, namelen);
+ hash = memihash(ce->name, namelen);
+ dir = find_dir_entry_1(istate, ce->name, namelen, hash);
if (!dir) {
/* not found, create it and add to hash table */
FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
- hashmap_entry_init(dir, memihash(ce->name, namelen));
+ hashmap_entry_init(dir, hash);
dir->namelen = namelen;
hashmap_add(&istate->dir_hash, dir);
--
2.11.1.windows.1
^ permalink raw reply related
* [PATCH 2/5] hashmap: allow memihash computation to be continued
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>
From: Jeff Hostetler <jeffhost@microsoft.com>
There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory. This can be expensive on large repositories.
With the new memihash_continue() function, we can hash the parent
directory first, and reuse that computed hash for all directory
entries.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
hashmap.c | 14 ++++++++++++++
hashmap.h | 2 ++
2 files changed, 16 insertions(+)
diff --git a/hashmap.c b/hashmap.c
index b10b642229c..061b7d61da6 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -50,6 +50,20 @@ unsigned int memihash(const void *buf, size_t len)
return hash;
}
+/* Incoporate another chunk of data into a memihash computation. */
+unsigned int memihash_continue(unsigned int hash,
+ const void *buf, size_t len)
+{
+ const unsigned char *p = buf;
+ while (len--) {
+ unsigned int c = *p++;
+ if (c >= 'a' && c <= 'z')
+ c -= 'a' - 'A';
+ hash = (hash * FNV32_PRIME) ^ c;
+ }
+ return hash;
+}
+
#define HASHMAP_INITIAL_SIZE 64
/* grow / shrink by 2^2 */
#define HASHMAP_RESIZE_BITS 2
diff --git a/hashmap.h b/hashmap.h
index ab7958ae333..78e14dfde71 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -12,6 +12,8 @@ extern unsigned int strhash(const char *buf);
extern unsigned int strihash(const char *buf);
extern unsigned int memhash(const void *buf, size_t len);
extern unsigned int memihash(const void *buf, size_t len);
+extern unsigned int memihash_continue(unsigned int hash_seed,
+ const void *buf, size_t len);
static inline unsigned int sha1hash(const unsigned char *sha1)
{
--
2.11.1.windows.1
^ permalink raw reply related
* [PATCH 4/5] name-hash: specify initial size for istate.dir_hash table
From: Johannes Schindelin @ 2017-02-14 11:32 UTC (permalink / raw)
To: git; +Cc: Jeff Hostetler, Junio C Hamano
In-Reply-To: <cover.1487071883.git.johannes.schindelin@gmx.de>
From: Jeff Hostetler <jeffhost@microsoft.com>
Specify an initial size for the istate.dir_hash hash map matching the
size of the istate.name_hash.
Previously hashmap_init() was given 0, causing a 64 bucket hashmap to be
created. When working with very large repositories, this would cause
numerous rehash() calls to realloc and rebalance the hashmap. This is
especially true when the worktree is deep, with many directories
containing a only a few files each.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
name-hash.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/name-hash.c b/name-hash.c
index 49eb84846df..8f8336cc868 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -143,7 +143,8 @@ static void lazy_init_name_hash(struct index_state *istate)
return;
hashmap_init(&istate->name_hash, (hashmap_cmp_fn) cache_entry_cmp,
istate->cache_nr);
- hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp, 0);
+ hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp,
+ istate->cache_nr);
for (nr = 0; nr < istate->cache_nr; nr++)
hash_index_entry(istate, istate->cache[nr]);
istate->name_hash_initialized = 1;
--
2.11.1.windows.1
^ permalink raw reply related
* [BUG] Memory corruption crash with "git bisect start"
From: Maxim Kuvyrkov @ 2017-02-14 11:56 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]
I'm seeing the following memory corruption crash on a script-constructed repo when starting git bisect. I'm seeing this crash both with system git of Ubuntu Xenial and with freshly-compiled git master from the official repo.
The log of the crash is attached.
It is possible that the bug is in Xenial glibc, in which case -- please let me know and I will take it up with the glibc developers.
To reproduce the crash:
<snip>
# Create a repo with 42 files, and change all of them.
rm -rf .git
git init
for i in `seq 1 42`; do echo 0 > f$i; git add f$i; echo 1 > f$i; done
git commit -m 0
# Download script to generate bisect history:
pushd ../
wget https://git.linaro.org/people/maxim.kuvyrkov/tree-bisect.git/plain/construct-tree.sh
popd
# Construct tree for bisect. This script creates a history graph for all permutations of changed files up to "2" in depth.
# The goal of the script is to assist in reducing testcases for compiler/toolchain development.
../construct-tree.sh -d 2
# Trigger crash
git bisect start bottom top
</snip>
It is interesting that "42" number files is boundary. The crash does not occur with history generated from 41 or fewer files, and only occurs with 42 or more files. It appears that the contents of the files is not relevant for the crash.
--
Maxim Kuvyrkov
www.linaro.org
[-- Attachment #2: git-bisect-crash.log --]
[-- Type: application/octet-stream, Size: 10480 bytes --]
maxim.kuvyrkov@2492b3a161d0:~/tmp/test-bisect$ git bisect start bottom top
Previous HEAD position was 0520f90... 9
Switched to branch '42/1_2_3_4_5_6_7_8_9_10_11_12_13_14_15_16_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31_32_33_34_35_36_37_38_39_40_41_42'
*** buffer overflow detected ***: git terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f86df8017e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f86df8a256c]
/lib/x86_64-linux-gnu/libc.so.6(+0x116570)[0x7f86df8a0570]
git[0x474d21]
git[0x406365]
git[0x4066a8]
git[0x40584d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f86df7aa830]
git[0x405899]
======= Memory map: ========
00400000-005e1000 r-xp 00000000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e0000-007e1000 r--p 001e0000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e1000-007ea000 rw-p 001e1000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007ea000-00830000 rw-p 00000000 00:00 0
00b0c000-00b2d000 rw-p 00000000 00:00 0 [heap]
7f86df2f0000-7f86df306000 r-xp 00000000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df306000-7f86df505000 ---p 00016000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df505000-7f86df506000 rw-p 00015000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f86df506000-7f86df586000 rw-p 00000000 00:00 0
7f86df586000-7f86df589000 r-xp 00000000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df589000-7f86df788000 ---p 00003000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df788000-7f86df789000 r--p 00002000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df789000-7f86df78a000 rw-p 00003000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f86df78a000-7f86df949000 r-xp 00000000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f86df949000-7f86dfb49000 ---p 001bf000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb49000-7f86dfb4d000 r--p 001bf000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb4d000-7f86dfb4f000 rw-p 001c3000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f86dfb4f000-7f86dfb53000 rw-p 00000000 00:00 0
7f86dfb53000-7f86dfb5a000 r-xp 00000000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfb5a000-7f86dfd59000 ---p 00007000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd59000-7f86dfd5a000 r--p 00006000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd5a000-7f86dfd5b000 rw-p 00007000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f86dfd5b000-7f86dfd73000 r-xp 00000000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dfd73000-7f86dff72000 ---p 00018000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff72000-7f86dff73000 r--p 00017000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff73000-7f86dff74000 rw-p 00018000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f86dff74000-7f86dff78000 rw-p 00000000 00:00 0
7f86dff78000-7f86e0192000 r-xp 00000000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e0192000-7f86e0391000 ---p 0021a000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e0391000-7f86e03ad000 r--p 00219000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e03ad000-7f86e03b9000 rw-p 00235000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f86e03b9000-7f86e03bc000 rw-p 00000000 00:00 0
7f86e03bc000-7f86e03d5000 r-xp 00000000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e03d5000-7f86e05d4000 ---p 00019000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d4000-7f86e05d5000 r--p 00018000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d5000-7f86e05d6000 rw-p 00019000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f86e05d6000-7f86e05fc000 r-xp 00000000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f86e0641000-7f86e07d9000 r--p 00000000 00:25 156 /usr/lib/locale/locale-archive
7f86e07d9000-7f86e07de000 rw-p 00000000 00:00 0
7f86e07f8000-7f86e07fb000 rw-p 00000000 00:00 0
7f86e07fb000-7f86e07fc000 r--p 00025000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f86e07fc000-7f86e07fd000 rw-p 00026000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f86e07fd000-7f86e07fe000 rw-p 00000000 00:00 0
7ffd4fa11000-7ffd4fa32000 rw-p 00000000 00:00 0 [stack]
7ffd4fbed000-7ffd4fbef000 r-xp 00000000 00:00 0 [vdso]
7ffd4fbef000-7ffd4fbf1000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
*** buffer overflow detected ***: git terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f9515a267e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f9515ac756c]
/lib/x86_64-linux-gnu/libc.so.6(+0x116570)[0x7f9515ac5570]
git[0x474d21]
git[0x406365]
git[0x4066a8]
git[0x40584d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f95159cf830]
git[0x405899]
======= Memory map: ========
00400000-005e1000 r-xp 00000000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e0000-007e1000 r--p 001e0000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007e1000-007ea000 rw-p 001e1000 08:05 2192298551 /home/maxim.kuvyrkov/git-install/libexec/git-core/git
007ea000-00830000 rw-p 00000000 00:00 0
01059000-0107a000 rw-p 00000000 00:00 0 [heap]
7f9515515000-7f951552b000 r-xp 00000000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951552b000-7f951572a000 ---p 00016000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951572a000-7f951572b000 rw-p 00015000 00:25 3394 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f951572b000-7f95157ab000 rw-p 00000000 00:00 0
7f95157ab000-7f95157ae000 r-xp 00000000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f95157ae000-7f95159ad000 ---p 00003000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159ad000-7f95159ae000 r--p 00002000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159ae000-7f95159af000 rw-p 00003000 00:25 36 /lib/x86_64-linux-gnu/libdl-2.23.so
7f95159af000-7f9515b6e000 r-xp 00000000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f9515b6e000-7f9515d6e000 ---p 001bf000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d6e000-7f9515d72000 r--p 001bf000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d72000-7f9515d74000 rw-p 001c3000 00:25 38 /lib/x86_64-linux-gnu/libc-2.23.so
7f9515d74000-7f9515d78000 rw-p 00000000 00:00 0
7f9515d78000-7f9515d7f000 r-xp 00000000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f9515d7f000-7f9515f7e000 ---p 00007000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f7e000-7f9515f7f000 r--p 00006000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f7f000-7f9515f80000 rw-p 00007000 00:25 147 /lib/x86_64-linux-gnu/librt-2.23.so
7f9515f80000-7f9515f98000 r-xp 00000000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9515f98000-7f9516197000 ---p 00018000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516197000-7f9516198000 r--p 00017000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516198000-7f9516199000 rw-p 00018000 00:25 80 /lib/x86_64-linux-gnu/libpthread-2.23.so
7f9516199000-7f951619d000 rw-p 00000000 00:00 0
7f951619d000-7f95163b7000 r-xp 00000000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95163b7000-7f95165b6000 ---p 0021a000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165b6000-7f95165d2000 r--p 00219000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165d2000-7f95165de000 rw-p 00235000 00:25 172 /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f95165de000-7f95165e1000 rw-p 00000000 00:00 0
7f95165e1000-7f95165fa000 r-xp 00000000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95165fa000-7f95167f9000 ---p 00019000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167f9000-7f95167fa000 r--p 00018000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167fa000-7f95167fb000 rw-p 00019000 00:25 174 /lib/x86_64-linux-gnu/libz.so.1.2.8
7f95167fb000-7f9516821000 r-xp 00000000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f9516866000-7f95169fe000 r--p 00000000 00:25 156 /usr/lib/locale/locale-archive
7f95169fe000-7f9516a03000 rw-p 00000000 00:00 0
7f9516a1d000-7f9516a20000 rw-p 00000000 00:00 0
7f9516a20000-7f9516a21000 r--p 00025000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f9516a21000-7f9516a22000 rw-p 00026000 00:25 31 /lib/x86_64-linux-gnu/ld-2.23.so
7f9516a22000-7f9516a23000 rw-p 00000000 00:00 0
7fff47216000-7fff47237000 rw-p 00000000 00:00 0 [stack]
7fff4731e000-7fff47320000 r-xp 00000000 00:00 0 [vdso]
7fff47320000-7fff47322000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
Bisecting: 41 revisions left to test after this (roughly 5 steps)
[0520f903823d04d139111c3d379806d6c4895586] 9
maxim.kuvyrkov@2492b3a161d0:~/tmp/test-bisect$ echo $?
0
^ permalink raw reply
* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
From: Stefan Hajnoczi @ 2017-02-14 9:43 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, Brandon Williams, git
In-Reply-To: <20170209052034.ksoupjcj4qs7x4hz@sigill.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 418 bytes --]
On Thu, Feb 09, 2017 at 12:20:34AM -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:
> > Jeff King <peff@peff.net> writes:
> (I _do_ think Stefan's proposed direction is worth it simply because the
> result is easier to read, but I agree the whole thing can be avoided by
> using pathspecs, as you've noted).
I won't be pushing this series further due to limited time.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]
^ permalink raw reply
* Re: [PATCH] mingw: make stderr unbuffered again
From: Johannes Schindelin @ 2017-02-14 14:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Sixt, Jeff Hostetler
In-Reply-To: <xmqqlgt9btrv.fsf@gitster.mtv.corp.google.com>
Hi Junio,
On Mon, 13 Feb 2017, Junio C Hamano wrote:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>
> > When removing the hack for isatty(), we actually removed more than just
> > an isatty() hack: we removed the hack where internal data structures of
> > the MSVC runtime are modified in order to redirect stdout/stderr.
> >
> > Instead of using that hack (that does not work with newer versions of
> > the runtime, anyway), we replaced it by reopening the respective file
> > descriptors.
> >
> > What we forgot was to mark stderr as unbuffered again.
> >
> > Reported by Hannes Sixt. Fixed with Jeff Hostetler's assistance.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > Published-As: https://github.com/dscho/git/releases/tag/mingw-unbuffered-stderr-v1
> > Fetch-It-Via: git fetch https://github.com/dscho/git mingw-unbuffered-stderr-v1
>
> OK. Should this go directly to 'master', as the isatty thing is
> already in?
From my point of view, it is not crucial. The next Git for Windows version
will have it, of course, and Hannes is always running with his set of
patches, he can easily cherry-pick this one, too.
Ciao,
Johannes
P.S.: Could you please cut the remainder of the mail that you are not
responding to? Thanks.
^ permalink raw reply
* Re: [BUG] Memory corruption crash with "git bisect start"
From: Christian Couder @ 2017-02-14 14:52 UTC (permalink / raw)
To: Maxim Kuvyrkov; +Cc: git
In-Reply-To: <A7AEC719-6E15-4622-8E21-3E11BAF74A3C@linaro.org>
On Tue, Feb 14, 2017 at 12:56 PM, Maxim Kuvyrkov
<maxim.kuvyrkov@linaro.org> wrote:
> I'm seeing the following memory corruption crash on a script-constructed repo when starting git bisect. I'm seeing this crash both with system git of Ubuntu Xenial and with freshly-compiled git master from the official repo.
>
> The log of the crash is attached.
Yeah, thanks for the report.
I took a look and it looks like git bisect crashes when it runs git
show-branch and it can be reproduced with just `git show-branch
bottom` instead of `git bisect start bottom top`.
> It is possible that the bug is in Xenial glibc, in which case -- please let me know and I will take it up with the glibc developers.
I have Ubuntu GLIBC 2.23-0ubuntu5 and I get a crash too.
^ permalink raw reply
* [PATCH] show-branch: fix crash with long ref name
From: Christian Couder @ 2017-02-14 15:48 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Maxim Kuvyrkov, Christian Couder
This is a minimum fix for a crash with a long ref name.
Note that if the refname is too long (for example if you
use 100 instead of 50 in the test script) you could get
an error like:
fatal: cannot lock ref 'refs/heads/1_2_3_4_ ... _98_99_100_':
Unable to create '... /.git/refs/heads/1_2_3_4_ ... _98_99_100_.lock':
File name too long
when creating the ref instead of a crash when using
show-branch and that would be ok.
So a simpler fix could have been just something like:
- char head[128];
+ char head[1024];
But if the filesystem ever allows filenames longer than 1024
characters then the crash could appear again.
Reported by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/show-branch.c | 14 +++++++-------
t/t3204-show-branch-refname.sh | 19 +++++++++++++++++++
2 files changed, 26 insertions(+), 7 deletions(-)
create mode 100755 t/t3204-show-branch-refname.sh
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403ab..3c0fe55eec 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -620,7 +620,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
int all_heads = 0, all_remotes = 0;
int all_mask, all_revs;
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
- char head[128];
+ char *head_cpy;
const char *head_p;
int head_len;
struct object_id head_oid;
@@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
head_oid.hash, NULL);
if (head_p) {
head_len = strlen(head_p);
- memcpy(head, head_p, head_len + 1);
+ head_cpy = xstrdup(head_p);
}
else {
head_len = 0;
- head[0] = 0;
+ head_cpy = xstrdup("");
}
if (with_current_branch && head_p) {
@@ -804,15 +804,15 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
/* We are only interested in adding the branch
* HEAD points at.
*/
- if (rev_is_head(head,
+ if (rev_is_head(head_cpy,
head_len,
ref_name[i],
head_oid.hash, NULL))
has_head++;
}
if (!has_head) {
- int offset = starts_with(head, "refs/heads/") ? 11 : 0;
- append_one_rev(head + offset);
+ int offset = starts_with(head_cpy, "refs/heads/") ? 11 : 0;
+ append_one_rev(head_cpy + offset);
}
}
@@ -865,7 +865,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
if (1 < num_rev || extra < 0) {
for (i = 0; i < num_rev; i++) {
int j;
- int is_head = rev_is_head(head,
+ int is_head = rev_is_head(head_cpy,
head_len,
ref_name[i],
head_oid.hash,
diff --git a/t/t3204-show-branch-refname.sh b/t/t3204-show-branch-refname.sh
new file mode 100755
index 0000000000..83b64e3032
--- /dev/null
+++ b/t/t3204-show-branch-refname.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='test show-branch with long refname'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+
+ test_commit first &&
+ long_refname=$(printf "%s_" $(seq 1 50)) &&
+ git checkout -b "$long_refname"
+'
+
+test_expect_success 'show-branch with long refname' '
+
+ git show-branch first
+'
+
+test_done
--
2.12.0.rc0
^ permalink raw reply related
* Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
From: Jonathan Tan @ 2017-02-14 16:53 UTC (permalink / raw)
To: Jeff King; +Cc: git, gitster
In-Reply-To: <20170214060729.v4r24y5tuaov3jrh@sigill.intra.peff.net>
On 02/13/2017 10:07 PM, Jeff King wrote:
> diff --git a/builtin/grep.c b/builtin/grep.c
> index e83b33bda..c4c632594 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> break;
> }
>
> + if (!use_index) {
> + if (seen_dashdash)
> + die(_("--no-index cannot be used with revs"));
There is a subsequent check that prints "--no-index or --untracked
cannot be used with revs." - maybe we should just expand this part to
incorporate that case. (That is, write `if (!use_index || untracked)`
instead of `if (!use_index)`.) This also allows us to preserve the error
message, which might be useful for someone using a translated version of
Git.
> + break;
> + }
> +
> if (get_sha1_with_context(arg, 0, sha1, &oc)) {
> if (seen_dashdash)
> die(_("unable to resolve revision: %s"), arg);
^ permalink raw reply
* Re: [PATCH 0/7] grep rev/path parsing fixes
From: Jonathan Tan @ 2017-02-14 16:58 UTC (permalink / raw)
To: Jeff King; +Cc: git, gitster
In-Reply-To: <20170214060021.einv7372exbxa23z@sigill.intra.peff.net>
On 02/13/2017 10:00 PM, Jeff King wrote:
> I've fixed that, along with a few other bugs and cleanups. The complete
> series is below. Patch 2 is your (untouched) patch. My suggestions for
> your test are in patch 3, which can either remain on its own or be
> squashed in.
>
> [1/7]: grep: move thread initialization a little lower
> [2/7]: grep: do not unnecessarily query repo for "--"
> [3/7]: t7810: make "--no-index --" test more robust
> [4/7]: grep: re-order rev-parsing loop
> [5/7]: grep: fix "--" rev/pathspec disambiguation
> [6/7]: grep: avoid resolving revision names in --no-index case
> [7/7]: grep: do not diagnose misspelt revs with --no-index
Thanks - these look good to me. I replied to 6/7 with a comment, but I
also think that these are good as-is. Also, 3/7 can probably be squashed in.
^ permalink raw reply
* Re: [PATCH] show-branch: fix crash with long ref name
From: Jeff King @ 2017-02-14 17:25 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214154816.12625-1-chriscool@tuxfamily.org>
On Tue, Feb 14, 2017 at 04:48:16PM +0100, Christian Couder wrote:
> @@ -791,11 +791,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
> head_oid.hash, NULL);
> if (head_p) {
> head_len = strlen(head_p);
> - memcpy(head, head_p, head_len + 1);
> + head_cpy = xstrdup(head_p);
> }
> else {
> head_len = 0;
> - head[0] = 0;
> + head_cpy = xstrdup("");
> }
This fixes the problem, but I think we can simplify it quite a bit by
using resolve_refdup(). Here's the patch series I ended up with:
[1/3]: show-branch: drop head_len variable
[2/3]: show-branch: store resolved head in heap buffer
[3/3]: show-branch: use skip_prefix to drop magic numbers
builtin/show-branch.c | 39 ++++++++++++---------------------------
1 file changed, 12 insertions(+), 27 deletions(-)
-Peff
^ permalink raw reply
* [PATCH 1/3] show-branch: drop head_len variable
From: Jeff King @ 2017-02-14 17:26 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>
We copy the result of resolving HEAD into a buffer and keep
track of its length. But we never actually use the length
for anything besides the copy. Let's stop passing it around.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/show-branch.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 974f3403a..e4c488b8c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,7 +470,7 @@ static void snarf_refs(int head, int remotes)
}
}
-static int rev_is_head(char *head, int headlen, char *name,
+static int rev_is_head(char *head, char *name,
unsigned char *head_sha1, unsigned char *sha1)
{
if ((!head[0]) ||
@@ -622,7 +622,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
char head[128];
const char *head_p;
- int head_len;
struct object_id head_oid;
int merge_base = 0;
int independent = 0;
@@ -790,11 +789,10 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
head_oid.hash, NULL);
if (head_p) {
- head_len = strlen(head_p);
+ size_t head_len = strlen(head_p);
memcpy(head, head_p, head_len + 1);
}
else {
- head_len = 0;
head[0] = 0;
}
@@ -805,7 +803,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
* HEAD points at.
*/
if (rev_is_head(head,
- head_len,
ref_name[i],
head_oid.hash, NULL))
has_head++;
@@ -866,7 +863,6 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
for (i = 0; i < num_rev; i++) {
int j;
int is_head = rev_is_head(head,
- head_len,
ref_name[i],
head_oid.hash,
rev[i]->object.oid.hash);
--
2.12.0.rc1.479.g59880b11e
^ permalink raw reply related
* [PATCH 2/3] show-branch: store resolved head in heap buffer
From: Jeff King @ 2017-02-14 17:27 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>
We resolve HEAD and copy the result to a fixed-size buffer
with memcpy, never checking that it actually fits. This bug
dates back to 8098a178b (Add git-symbolic-ref, 2005-09-30).
Before that we used readlink(), which took a maximum buffer
size.
We can fix this by using resolve_refdup(), which duplicates
the buffer on the heap. That also lets us just check
for a NULL pointer to see if we have resolved HEAD, and
drop the extra head_p variable.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/show-branch.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e4c488b8c..404c4d09a 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -473,8 +473,7 @@ static void snarf_refs(int head, int remotes)
static int rev_is_head(char *head, char *name,
unsigned char *head_sha1, unsigned char *sha1)
{
- if ((!head[0]) ||
- (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
+ if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
return 0;
if (starts_with(head, "refs/heads/"))
head += 11;
@@ -620,8 +619,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
int all_heads = 0, all_remotes = 0;
int all_mask, all_revs;
enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
- char head[128];
- const char *head_p;
+ char *head;
struct object_id head_oid;
int merge_base = 0;
int independent = 0;
@@ -786,17 +784,10 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
snarf_refs(all_heads, all_remotes);
}
- head_p = resolve_ref_unsafe("HEAD", RESOLVE_REF_READING,
- head_oid.hash, NULL);
- if (head_p) {
- size_t head_len = strlen(head_p);
- memcpy(head, head_p, head_len + 1);
- }
- else {
- head[0] = 0;
- }
+ head = resolve_refdup("HEAD", RESOLVE_REF_READING,
+ head_oid.hash, NULL);
- if (with_current_branch && head_p) {
+ if (with_current_branch && head) {
int has_head = 0;
for (i = 0; !has_head && i < ref_name_cnt; i++) {
/* We are only interested in adding the branch
--
2.12.0.rc1.479.g59880b11e
^ permalink raw reply related
* [PATCH 3/3] show-branch: use skip_prefix to drop magic numbers
From: Jeff King @ 2017-02-14 17:28 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Junio C Hamano, Maxim Kuvyrkov, Christian Couder
In-Reply-To: <20170214172526.hzpm3d3ubd3vjnzr@sigill.intra.peff.net>
We make several starts_with() calls, only to advance
pointers. This is exactly what skip_prefix() is for, which
lets us avoid manually-counted magic numbers.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/show-branch.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 404c4d09a..c03d3ec7c 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -470,17 +470,14 @@ static void snarf_refs(int head, int remotes)
}
}
-static int rev_is_head(char *head, char *name,
+static int rev_is_head(const char *head, const char *name,
unsigned char *head_sha1, unsigned char *sha1)
{
if (!head || (head_sha1 && sha1 && hashcmp(head_sha1, sha1)))
return 0;
- if (starts_with(head, "refs/heads/"))
- head += 11;
- if (starts_with(name, "refs/heads/"))
- name += 11;
- else if (starts_with(name, "heads/"))
- name += 6;
+ skip_prefix(head, "refs/heads/", &head);
+ if (!skip_prefix(name, "refs/heads/", &name))
+ skip_prefix(name, "heads/", &name);
return !strcmp(head, name);
}
@@ -799,8 +796,9 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
has_head++;
}
if (!has_head) {
- int offset = starts_with(head, "refs/heads/") ? 11 : 0;
- append_one_rev(head + offset);
+ const char *name = head;
+ skip_prefix(name, "refs/heads/", &name);
+ append_one_rev(name);
}
}
--
2.12.0.rc1.479.g59880b11e
^ permalink raw reply related
* Re: [PATCH 06/11] refs-internal.h: correct is_per_worktree_ref()
From: Stefan Beller @ 2017-02-14 17:40 UTC (permalink / raw)
To: Duy Nguyen
Cc: git@vger.kernel.org, Junio C Hamano, Michael Haggerty,
Johannes Schindelin, David Turner
In-Reply-To: <CACsJy8D=qFQ2_62e4oO1pSBz4JfZV4Zcoai=Ghjw5DTaNxrwog@mail.gmail.com>
On Tue, Feb 14, 2017 at 1:40 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Tue, Feb 14, 2017 at 5:37 AM, Stefan Beller <sbeller@google.com> wrote:
>> On Mon, Feb 13, 2017 at 7:20 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>>> All refs outside refs/ directory is per-worktree, not just HEAD.
>>>
>>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>> ---
>>> refs/refs-internal.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>>> index f4aed49f5..69d02b6ba 100644
>>> --- a/refs/refs-internal.h
>>> +++ b/refs/refs-internal.h
>>> @@ -653,7 +653,7 @@ const char *resolve_ref_recursively(struct ref_store *refs,
>>>
>>> static inline int is_per_worktree_ref(const char *refname)
>>> {
>>> - return !strcmp(refname, "HEAD") ||
>>> + return !starts_with(refname, "refs/") ||
>>> starts_with(refname, "refs/bisect/");
>>
>> you're loosing HEAD here? (assuming we get HEAD in
>> short form here, as well as long form refs/HEAD)
>
> I don't understand. if refname is HEAD then both !strcmp(...) and
> !starts_with(refname, "refs/") return 1. If it's refs/HEAD, both
> return 0. In other words, there's no functional changes?
eh, my bad. You're right.
^ permalink raw reply
* Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
From: Jeff King @ 2017-02-14 18:04 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, gitster
In-Reply-To: <eef97cc4-d616-b298-bc99-b2772b757190@google.com>
On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote:
> On 02/13/2017 10:07 PM, Jeff King wrote:
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index e83b33bda..c4c632594 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> > break;
> > }
> >
> > + if (!use_index) {
> > + if (seen_dashdash)
> > + die(_("--no-index cannot be used with revs"));
>
> There is a subsequent check that prints "--no-index or --untracked cannot be
> used with revs." - maybe we should just expand this part to incorporate that
> case. (That is, write `if (!use_index || untracked)` instead of `if
> (!use_index)`.) This also allows us to preserve the error message, which
> might be useful for someone using a translated version of Git.
I wasn't sure if we wanted to treat "untracked" in the same way.
Certainly we can catch the error here for the seen_dashdash case, but is
it also the case that:
echo content >master
git grep --untracked pattern master
should treat "master" as a path?
-Peff
^ permalink raw reply
* Re: [PATCH v2] clean: use warning_errno() when appropriate
From: Junio C Hamano @ 2017-02-14 18:13 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
In-Reply-To: <20170214095449.15585-1-pclouds@gmail.com>
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> All these warning() calls are preceded by a system call. Report the
> actual error to help the user understand why we fail to remove
> something.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> v2 dances with errno
Thanks.
>
> builtin/clean.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index d6bc3aaae..3569736f6 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -154,6 +154,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
> struct strbuf quoted = STRBUF_INIT;
> struct dirent *e;
> int res = 0, ret = 0, gone = 1, original_len = path->len, len;
> + int saved_errno;
> struct string_list dels = STRING_LIST_INIT_DUP;
>
> *dir_gone = 1;
> @@ -173,9 +174,11 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
> if (!dir) {
> /* an empty dir could be removed even if it is unreadble */
> res = dry_run ? 0 : rmdir(path->buf);
> + saved_errno = errno;
> if (res) {
> quote_path_relative(path->buf, prefix, "ed);
I think this part should be more like
res = ... : rmdir(...);
if (res) {
int saved_errno = errno;
... do other things that can touch errno ...
errno = saved_errno;
... now we know what the original error was ...
The reason to store the errno in saved_errno here is not because we
want to help code after "if (res) {...}", but the patch sent as-is
gives that impression and is confusing to the readers.
Perhaps all hunks of this patch share the same issue? I could
locally amend, of course, but I'd like to double check before doing
so myself---perhaps you did it this way for a good reason that I am
missing?
^ permalink raw reply
* git-p4.py caching
From: Martin-Louis Bright @ 2017-02-14 18:16 UTC (permalink / raw)
To: git
hi!
I am using git-p4.py to migrate a lot of medium and large Perforce
depots into git. I almost exclusively go one way: from Perforce to
git. I also frequently re-clone/re-migrate as the Perforce migration
client spec is refined.
For this, I have added rudimentary caching in git-p4.py so that
Perforce requests are not repeated over the network.
I have it working and tested on a ~150MB repo and the migration time was halved.
Is this something that would be of interest to the larger community?
--Martin
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox