git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Sunshine <sunshine@sunshineco.com>
To: git@vger.kernel.org
Cc: Eric Sunshine <sunshine@sunshineco.com>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Brian Gernhardt <brian@gernhardtsoftware.com>,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: [PATCH v2 3/4] name-hash: stop storing trailing '/' on paths in index_state.dir_hash
Date: Tue, 17 Sep 2013 03:06:16 -0400	[thread overview]
Message-ID: <1379401577-36799-4-git-send-email-sunshine@sunshineco.com> (raw)
In-Reply-To: <1379401577-36799-1-git-send-email-sunshine@sunshineco.com>

When 5102c617 (Add case insensitivity support for directories when using
git status, 2010-10-03) added directories to the name-hash there was
only a single hash table in which both real cache entries and leading
directory prefixes were registered. To distinguish between the two types
of entries, directories were stored with a trailing '/'.

2092678c (name-hash.c: fix endless loop with core.ignorecase=true,
2013-02-28), however, moved directories to a separate hash table
(index_state.dir_hash) but retained the (now) redundant trailing '/',
thus callers continue to bear the burden of ensuring the slash's
presence before searching the index for a directory. Eliminate this
redundancy by storing paths in the dir-hash without the trailing '/'.

An important benefit of this change is that it eliminates undocumented
and dangerous behavior of dir.c:directory_exists_in_index_icase() in
which it assumes not only that it can validly access one character
beyond the end of its incoming directory argument, but also that that
character will unconditionally be a '/'. This perilous behavior was
"tolerated" because the string passed in by its lone caller always had a
'/' in that position, however, things broke [1] when 2eac2a4c (ls-files
-k: a directory only can be killed if the index has a non-directory,
2013-08-15) added a new caller which failed to respect the undocumented
assumption.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232727

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 dir.c        |  2 +-
 name-hash.c  | 11 ++++++-----
 read-cache.c |  2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index a8401b9..fccd479 100644
--- a/dir.c
+++ b/dir.c
@@ -889,7 +889,7 @@ enum exist_status {
  */
 static enum exist_status directory_exists_in_index_icase(const char *dirname, int len)
 {
-	const struct cache_entry *ce = cache_dir_exists(dirname, len + 1);
+	const struct cache_entry *ce = cache_dir_exists(dirname, len);
 	unsigned char endchar;
 
 	if (!ce)
diff --git a/name-hash.c b/name-hash.c
index f06b049..e5b6e1a 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -58,9 +58,9 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 {
 	/*
 	 * Throw each directory component in the hash for quick lookup
-	 * during a git status. Directory components are stored with their
+	 * during a git status. Directory components are stored without their
 	 * closing slash.  Despite submodules being a directory, they never
-	 * reach this point, because they are stored without a closing slash
+	 * reach this point, because they are stored
 	 * in index_state.name_hash (as ordinary cache_entries).
 	 *
 	 * Note that the cache_entry stored with the dir_entry merely
@@ -78,6 +78,7 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 		namelen--;
 	if (namelen <= 0)
 		return NULL;
+	namelen--;
 
 	/* lookup existing entry for that directory */
 	dir = find_dir_entry(istate, ce->name, namelen);
@@ -97,7 +98,7 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 		}
 
 		/* recursively add missing parent directories */
-		dir->parent = hash_dir_entry(istate, ce, namelen - 1);
+		dir->parent = hash_dir_entry(istate, ce, namelen);
 	}
 	return dir;
 }
@@ -237,7 +238,7 @@ struct cache_entry *index_dir_exists(struct index_state *istate, const char *nam
 	 * in the dir-hash, submodules are stored in the name-hash, so check
 	 * there, as well.
 	 */
-	ce = index_file_exists(istate, name, namelen - 1, 1);
+	ce = index_file_exists(istate, name, namelen, 1);
 	if (ce && S_ISGITLINK(ce->ce_mode))
 		return ce;
 
@@ -265,7 +266,7 @@ struct cache_entry *index_file_exists(struct index_state *istate, const char *na
 struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int icase)
 {
 	if (namelen > 0 && name[namelen - 1] == '/')
-		return index_dir_exists(istate, name, namelen);
+		return index_dir_exists(istate, name, namelen - 1);
 	return index_file_exists(istate, name, namelen, icase);
 }
 
diff --git a/read-cache.c b/read-cache.c
index b8d3759..e25de32 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -643,7 +643,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 			if (*ptr == '/') {
 				struct cache_entry *foundce;
 				++ptr;
-				foundce = index_dir_exists(istate, ce->name, ptr - ce->name);
+				foundce = index_dir_exists(istate, ce->name, ptr - ce->name - 1);
 				if (foundce) {
 					memcpy((void *)startPtr, foundce->name + (startPtr - ce->name), ptr - startPtr);
 					startPtr = ptr;
-- 
1.8.4.535.g7b94f8e

  parent reply	other threads:[~2013-09-17  7:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-17  7:06 [PATCH v2 0/4] stop storing trailing slash in dir-hash Eric Sunshine
2013-09-17  7:06 ` [PATCH v2 1/4] name-hash: refactor polymorphic index_name_exists() Eric Sunshine
2013-09-17  7:06 ` [PATCH v2 2/4] employ new explicit "exists in index?" API Eric Sunshine
2013-09-17  7:06 ` Eric Sunshine [this message]
2013-09-17  7:06 ` [PATCH v2 4/4] dir: revert work-around for retired dangerous behavior Eric Sunshine
2013-09-17 17:11 ` [PATCH v2 0/4] stop storing trailing slash in dir-hash Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1379401577-36799-4-git-send-email-sunshine@sunshineco.com \
    --to=sunshine@sunshineco.com \
    --cc=brian@gernhardtsoftware.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).