* [PATCH 1/7] Add string comparison functions that respect the ignore_case variable.
2010-05-21 4:50 [PATCH 0/7] Various updates to make core.ignorecase=true work better Joshua Jensen
@ 2010-05-21 4:50 ` Joshua Jensen
2010-05-21 4:50 ` [PATCH 2/7] Case insensitivity support for .gitignore via core.ignorecase Joshua Jensen
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
Several patches in this series replace case-sensitive string comparison
calls such as strcmp() with wrappers that choose the comparison mode
from the global ignore_case variable. When core.ignorecase=false, the
*_icase() versions are functionally equivalent to their C runtime
counterparts. When core.ignorecase=true, the *_icase() versions perform
a case-insensitive comparison.
Like Linus' earlier ignorecase patch, these may ignore filename
conventions on certain file systems. By isolating filename comparisons
to certain functions, support for those filename conventions may be more
easily met.
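As a rough illustration of the behavior described above, here is a
standalone sketch of two of the wrappers. The file-local ignore_case
variable is a stand-in for Git's global, and icase_selfcheck() is a
hypothetical helper that is not part of the patch.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strcasecmp(), strncasecmp() (POSIX) */

/* Stand-in for Git's global flag, normally set from core.ignorecase. */
static int ignore_case;

static int strcmp_icase(const char *a, const char *b)
{
	return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

static int strncmp_icase(const char *a, const char *b, size_t count)
{
	return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
}

/* Hypothetical self-check, not part of the patch. */
static void icase_selfcheck(void)
{
	/* core.ignorecase=false: identical to strcmp()/strncmp(). */
	ignore_case = 0;
	assert(strcmp_icase("Build", "build") != 0);
	assert(strncmp_icase("MyDir/a", "mydir/b", 6) != 0);

	/* core.ignorecase=true: case differences are ignored. */
	ignore_case = 1;
	assert(strcmp_icase("Build", "build") == 0);
	assert(strncmp_icase("MyDir/a", "mydir/b", 6) == 0);
}
```

Because the mode is read on every call, callers do not need to branch on
ignore_case themselves.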
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
dir.c | 35 +++++++++++++++++++++++++++++++++++
dir.h | 5 +++++
2 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/dir.c b/dir.c
index cb83332..21d2104 100644
--- a/dir.c
+++ b/dir.c
@@ -18,6 +18,41 @@ static int read_directory_recursive(struct dir_struct *dir, const char *path, in
int check_only, const struct path_simplify *simplify);
static int get_dtype(struct dirent *de, const char *path, int len);
+/* helper string functions with support for the ignore_case flag */
+int strcmp_icase(const char *a, const char *b)
+{
+ return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
+}
+
+int strncmp_icase(const char *a, const char *b, size_t count)
+{
+ return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
+}
+
+int fnmatch_icase(const char *pattern, const char *string, int flags)
+{
+ return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0));
+}
+
+int memcmp_icase(const char *a, const char *b, size_t count)
+{
+ if (ignore_case) {
+ int lowera = 0;
+ int lowerb = 0;
+ while (count--) {
+ lowera = tolower(*a++);
+ lowerb = tolower(*b++);
+ if (lowera != lowerb)
+ break;
+ }
+ return lowera - lowerb;
+
+ return 0;
+ } else {
+ return memcmp(a, b, count);
+ }
+}
+
static int common_prefix(const char **pathspec)
{
const char *path, *slash, *next;
diff --git a/dir.h b/dir.h
index 3bead5f..aced818 100644
--- a/dir.h
+++ b/dir.h
@@ -100,4 +100,9 @@ extern int remove_dir_recursively(struct strbuf *path, int flag);
/* tries to remove the path with empty directories along it, ignores ENOENT */
extern int remove_path(const char *path);
+extern int strcmp_icase(const char *a, const char *b);
+extern int strncmp_icase(const char *a, const char *b, size_t count);
+extern int fnmatch_icase(const char *pattern, const char *string, int flags);
+extern int memcmp_icase(const char *a, const char *b, size_t count);
+
#endif
--
1.7.1.1930.gca7dd4
* [PATCH 2/7] Case insensitivity support for .gitignore via core.ignorecase
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
This is especially beneficial when using Windows and Perforce and the
git-p4 bridge. Internally, Perforce preserves a given file's full path
including its case at the time it was added to the Perforce repository.
When syncing a file down via Perforce, missing directories are created,
if necessary, using the case as stored with the filename. Unfortunately,
two files in the same directory can have differing cases for their
respective paths, such as /diRa/file1.c and /DirA/file2.c. Depending on
sync order, DirA/ may get created instead of diRa/.
It is possible to handle directory names in a case insensitive manner
without this patch, but it is highly inconvenient, requiring each
character to be specified like so: [Bb][Uu][Ii][Ll][Dd]. With this patch, the
gitignore exclusions honor the core.ignorecase=true configuration
setting, which makes the process less error prone. The same exclusion
is then written simply as: Build
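The difference between the two spellings can be sketched with fnmatch()
directly. This is only an illustration under assumptions: matches() and
gitignore_selfcheck() are hypothetical names, FNM_CASEFOLD is a GNU
extension rather than POSIX, and the fallback definition below uses the
value found in glibc's headers in case the feature macro arrives too late.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes FNM_CASEFOLD on glibc */
#endif
#include <assert.h>
#include <fnmatch.h>

#ifndef FNM_CASEFOLD
#define FNM_CASEFOLD (1 << 4)	/* assumption: glibc's value */
#endif

/* Returns 1 if name matches pattern, folding case when icase is set. */
static int matches(const char *pattern, const char *name, int icase)
{
	return fnmatch(pattern, name, icase ? FNM_CASEFOLD : 0) == 0;
}

/* Hypothetical self-check, not part of the patch. */
static void gitignore_selfcheck(void)
{
	/* Without case folding, every letter needs its own bracket set. */
	assert(matches("[Bb][Uu][Ii][Ll][Dd]", "BUILD", 0));
	assert(!matches("Build", "BUILD", 0));

	/* With case folding, the plain spelling is enough. */
	assert(matches("Build", "BUILD", 1));
	assert(matches("*.OBJ", "main.obj", 1));
}
```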
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
dir.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/dir.c b/dir.c
index 21d2104..a19b7ab 100644
--- a/dir.c
+++ b/dir.c
@@ -409,14 +409,14 @@ int excluded_from_list(const char *pathname,
if (x->flags & EXC_FLAG_NODIR) {
/* match basename */
if (x->flags & EXC_FLAG_NOWILDCARD) {
- if (!strcmp(exclude, basename))
+ if (!strcmp_icase(exclude, basename))
return to_exclude;
} else if (x->flags & EXC_FLAG_ENDSWITH) {
if (x->patternlen - 1 <= pathlen &&
- !strcmp(exclude + 1, pathname + pathlen - x->patternlen + 1))
+ !strcmp_icase(exclude + 1, pathname + pathlen - x->patternlen + 1))
return to_exclude;
} else {
- if (fnmatch(exclude, basename, 0) == 0)
+ if (fnmatch_icase(exclude, basename, 0) == 0)
return to_exclude;
}
}
@@ -431,14 +431,14 @@ int excluded_from_list(const char *pathname,
if (pathlen < baselen ||
(baselen && pathname[baselen-1] != '/') ||
- strncmp(pathname, x->base, baselen))
+ strncmp_icase(pathname, x->base, baselen))
continue;
if (x->flags & EXC_FLAG_NOWILDCARD) {
- if (!strcmp(exclude, pathname + baselen))
+ if (!strcmp_icase(exclude, pathname + baselen))
return to_exclude;
} else {
- if (fnmatch(exclude, pathname+baselen,
+ if (fnmatch_icase(exclude, pathname+baselen,
FNM_PATHNAME) == 0)
return to_exclude;
}
--
1.7.1.1930.gca7dd4
* [PATCH 3/7] Add case insensitivity support for directories when using git status
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
When using a case preserving but case insensitive file system, directory
case can differ but still refer to the same physical directory. git
status reports the directory with the alternate case as an Untracked
file. (That is, when mydir/filea.txt is added to the repository and
then the directory on disk is renamed from mydir/ to MyDir/, git status
shows MyDir/ as being untracked.)
Support has been added in name-hash.c for hashing directories with a
terminating slash into the name hash. When index_name_exists() is called
with a directory (a name with a terminating slash), the name is not
found via the normal cache_name_compare() call, but it is found in the
slow_same_name() function.
Additionally, in dir.c, directory_exists_in_index_icase() allows newly
added directories deeper in the directory chain to be identified.
Ultimately, it would be better if the file list were read from disk in
case insensitive alphabetical order, but this change seems to suffice
for now.
The end result is the directory is looked up in a case insensitive
manner and does not show in the Untracked files list.
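The idea of registering every directory prefix of an entry (with its
trailing slash) and then comparing only the directory portion can be
sketched on its own. next_dir_prefix() and entry_is_under_dir_icase()
are hypothetical names for this illustration; the real patch works on
the index name hash rather than on plain strings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Length of the next directory prefix of path found at or after offset
 * `from`, including the trailing slash, or 0 if there is none.
 */
static size_t next_dir_prefix(const char *path, size_t from)
{
	const char *p = path + from;

	while (*p && *p != '/')
		p++;
	return *p == '/' ? (size_t)(p - path) + 1 : 0;
}

/* Case-insensitive check that entry_name starts with dir (dir ends in '/'). */
static int entry_is_under_dir_icase(const char *dir, size_t dirlen,
				    const char *entry_name)
{
	size_t i;

	for (i = 0; i < dirlen; i++) {
		if (!entry_name[i])
			return 0;
		if (tolower((unsigned char)dir[i]) !=
		    tolower((unsigned char)entry_name[i]))
			return 0;
	}
	return 1;
}

/* Hypothetical self-check, not part of the patch. */
static void dir_prefix_selfcheck(void)
{
	const char *name = "mydir/sub/filea.txt";

	assert(next_dir_prefix(name, 0) == 6);	/* "mydir/" */
	assert(next_dir_prefix(name, 6) == 10);	/* "mydir/sub/" */
	assert(next_dir_prefix(name, 10) == 0);	/* only the file name is left */

	/* MyDir/ on disk still maps to the tracked mydir/filea.txt. */
	assert(entry_is_under_dir_icase("MyDir/", 6, "mydir/filea.txt"));
}
```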
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
dir.c | 25 ++++++++++++++++++++++++-
name-hash.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 67 insertions(+), 2 deletions(-)
diff --git a/dir.c b/dir.c
index a19b7ab..c861f3e 100644
--- a/dir.c
+++ b/dir.c
@@ -503,6 +503,24 @@ enum exist_status {
index_gitdir,
};
+static enum exist_status directory_exists_in_index_icase(const char *dirname, int len)
+{
+ struct cache_entry *ce = index_name_exists(&the_index, dirname, len + 1, ignore_case);
+ if (!ce)
+ return index_nonexistent;
+
+ if (!strncmp_icase(ce->name, dirname, len)) {
+ unsigned char endchar = ce->name[len];
+ if (endchar <= '/') {
+ if (endchar == '/')
+ return index_directory;
+ if (!endchar && S_ISGITLINK(ce->ce_mode))
+ return index_gitdir;
+ }
+ }
+ return index_nonexistent;
+}
+
/*
* The index sorts alphabetically by entry name, which
* means that a gitlink sorts as '\0' at the end, while
@@ -512,7 +530,12 @@ enum exist_status {
*/
static enum exist_status directory_exists_in_index(const char *dirname, int len)
{
- int pos = cache_name_pos(dirname, len);
+ int pos;
+
+ if (ignore_case)
+ return directory_exists_in_index_icase(dirname, len);
+
+ pos = cache_name_pos(dirname, len);
if (pos < 0)
pos = -pos-1;
while (pos < active_nr) {
diff --git a/name-hash.c b/name-hash.c
index 0031d78..b10b5b1 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -32,6 +32,30 @@ static unsigned int hash_name(const char *name, int namelen)
return hash;
}
+static void hash_index_entry_directories(struct index_state *istate, struct cache_entry *ce)
+{
+ /* throw each directory component in the hash for quick lookup during a git status */
+ unsigned int hash;
+ void **pos;
+
+ const char *ptr = ce->name;
+ while (*ptr) {
+ while (*ptr && *ptr != '/')
+ ++ptr;
+ if (*ptr == '/') {
+ ++ptr;
+ hash = hash_name(ce->name, ptr - ce->name);
+ if (!lookup_hash(hash, &istate->name_hash)) {
+ pos = insert_hash(hash, ce, &istate->name_hash);
+ if (pos) {
+ ce->next = *pos;
+ *pos = ce;
+ }
+ }
+ }
+ }
+}
+
static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
{
void **pos;
@@ -47,6 +71,9 @@ static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
ce->next = *pos;
*pos = ce;
}
+
+ if (ignore_case)
+ hash_index_entry_directories(istate, ce);
}
static void lazy_init_name_hash(struct index_state *istate)
@@ -97,7 +124,22 @@ static int same_name(const struct cache_entry *ce, const char *name, int namelen
if (len == namelen && !cache_name_compare(name, namelen, ce->name, len))
return 1;
- return icase && slow_same_name(name, namelen, ce->name, len);
+ if (!icase)
+ return 0;
+
+ /*
+ * If the entry we're comparing is a filename (no trailing slash), then compare
+ * the lengths exactly.
+ */
+ if (name[namelen - 1] != '/') {
+ return slow_same_name(name, namelen, ce->name, len);
+ }
+
+ /*
+ * For a directory, we point to an arbitrary cache_entry filename. Just
+ * make sure the directory portion matches.
+ */
+ return slow_same_name(name, namelen, ce->name, namelen < len ? namelen : len);
}
struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int icase)
--
1.7.1.1930.gca7dd4
* [PATCH 4/7] Add case insensitivity support when using git ls-files
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and
MyDir/fileb.txt is added, running git ls-files mydir only shows
mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt.
Running git ls-files mYdIR shows nothing.
With this patch, running git ls-files for mydir, MyDir, and mYdIR shows
both mydir/filea.txt and MyDir/fileb.txt.
Wildcards are not handled case insensitively in this patch. Example:
MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git
ls-files mydir/a* does not.
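The literal-prefix comparison that this patch makes case aware can be
sketched as a single loop. This is a simplified stand-in, not the code
in the diff: literal_prefix_matches() and is_glob_char() are
hypothetical names, and the set of glob characters is an assumption
about what Git's is_glob_special() covers.

```c
#include <assert.h>
#include <ctype.h>

/* Assumed set of characters that start a glob construct. */
static int is_glob_char(unsigned char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

/*
 * Returns 1 if the literal part of match (everything before the first
 * glob character) agrees with the start of name, folding case when
 * icase is set.
 */
static int literal_prefix_matches(const char *match, const char *name, int icase)
{
	for (;; match++, name++) {
		unsigned char c1 = (unsigned char)*match;
		unsigned char c2 = (unsigned char)*name;

		if (c1 == '\0' || is_glob_char(c1))
			return 1;
		if (icase) {
			c1 = (unsigned char)tolower(c1);
			c2 = (unsigned char)tolower(c2);
		}
		if (c1 != c2)
			return 0;
	}
}

/* Hypothetical self-check, not part of the patch. */
static void literal_prefix_selfcheck(void)
{
	assert(literal_prefix_matches("mYdIR", "mydir/filea.txt", 1));
	assert(!literal_prefix_matches("mYdIR", "mydir/filea.txt", 0));
	assert(literal_prefix_matches("MyDir/a*", "MyDir/aBc/file.txt", 0));
}
```

Folding the case inside one loop avoids duplicating the loop body for the
two modes, at the cost of one extra branch per character.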
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
dir.c | 38 ++++++++++++++++++++++++++------------
1 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/dir.c b/dir.c
index c861f3e..d67ec68 100644
--- a/dir.c
+++ b/dir.c
@@ -126,16 +126,30 @@ static int match_one(const char *match, const char *name, int namelen)
if (!*match)
return MATCHED_RECURSIVELY;
- for (;;) {
- unsigned char c1 = *match;
- unsigned char c2 = *name;
- if (c1 == '\0' || is_glob_special(c1))
- break;
- if (c1 != c2)
- return 0;
- match++;
- name++;
- namelen--;
+ if (ignore_case) {
+ for (;;) {
+ unsigned char c1 = tolower(*match);
+ unsigned char c2 = tolower(*name);
+ if (c1 == '\0' || is_glob_special(c1))
+ break;
+ if (c1 != c2)
+ return 0;
+ match++;
+ name++;
+ namelen--;
+ }
+ } else {
+ for (;;) {
+ unsigned char c1 = *match;
+ unsigned char c2 = *name;
+ if (c1 == '\0' || is_glob_special(c1))
+ break;
+ if (c1 != c2)
+ return 0;
+ match++;
+ name++;
+ namelen--;
+ }
}
@@ -144,8 +158,8 @@ static int match_one(const char *match, const char *name, int namelen)
* we need to match by fnmatch
*/
matchlen = strlen(match);
- if (strncmp(match, name, matchlen))
- return !fnmatch(match, name, 0) ? MATCHED_FNMATCH : 0;
+ if (strncmp_icase(match, name, matchlen))
+ return !fnmatch_icase(match, name, 0) ? MATCHED_FNMATCH : 0;
if (namelen == matchlen)
return MATCHED_EXACTLY;
--
1.7.1.1930.gca7dd4
* [PATCH 5/7] Add support for case insensitive directory and file lookups to git log
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
This patch also affects any other commands that use tree-diff.c.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
tree-diff.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/tree-diff.c b/tree-diff.c
index fe9f52c..5110980 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -5,6 +5,7 @@
#include "diff.h"
#include "diffcore.h"
#include "tree.h"
+#include "dir.h"
static char *malloc_base(const char *base, int baselen, const char *path, int pathlen)
{
@@ -114,7 +115,7 @@ static int tree_entry_interesting(struct tree_desc *desc, const char *base, int
if (baselen >= matchlen) {
/* If it doesn't match, move along... */
- if (strncmp(base, match, matchlen))
+ if (strncmp_icase(base, match, matchlen))
continue;
/*
@@ -131,7 +132,7 @@ static int tree_entry_interesting(struct tree_desc *desc, const char *base, int
}
/* Does the base match? */
- if (strncmp(base, match, baselen))
+ if (strncmp_icase(base, match, baselen))
continue;
match += baselen;
@@ -147,7 +148,7 @@ static int tree_entry_interesting(struct tree_desc *desc, const char *base, int
* Does match sort strictly earlier than path
* with their common parts?
*/
- m = strncmp(match, path,
+ m = strncmp_icase(match, path,
(matchlen < pathlen) ? matchlen : pathlen);
if (m < 0)
continue;
@@ -183,7 +184,7 @@ static int tree_entry_interesting(struct tree_desc *desc, const char *base, int
* we cheated and did not do strncmp(), so we do
* that here.
*/
- m = strncmp(match, path, pathlen);
+ m = strncmp_icase(match, path, pathlen);
/*
* If common part matched earlier then it is a hit,
--
1.7.1.1930.gca7dd4
* [PATCH 6/7] Support case folding for git add when core.ignorecase=true
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
When MyDir/ABC/filea.txt is added to Git, the disk directory MyDir/ABC/
is renamed to mydir/aBc/, and then mydir/aBc/fileb.txt is added, the
index will contain MyDir/ABC/filea.txt and mydir/aBc/fileb.txt. Although
the earlier portions of this patch series account for those differences
in case, this patch makes the pathing consistent by folding the case of
newly added files against the first file added with that path.
In read-cache.c's add_to_index(), the index_name_exists() support used
for git status's case insensitive directory lookups is used to find the
proper directory case according to what the user already checked in.
That is, MyDir/ABC/'s case is used to alter the stored path for
fileb.txt to MyDir/ABC/fileb.txt (instead of mydir/aBc/fileb.txt).
This is especially important when cloning a repository to a case
sensitive file system. MyDir/ABC/ and mydir/aBc/ exist in the same
directory on a Windows machine, but on Linux, the files exist in two
separate directories. The update to add_to_index(), in effect, treats a
Windows file system as case sensitive by making path case consistent.
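The folding step can be sketched in isolation. fold_dir_case() is a
hypothetical helper for this illustration: it rewrites the whole
directory part in one step from a single known entry, whereas the patch
walks the path one component at a time and looks each prefix up in the
index.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * If path and existing share their directory part ignoring case,
 * overwrite path's directory part with the spelling used by existing.
 * Returns 1 if the directory part matched, 0 otherwise.
 */
static int fold_dir_case(char *path, const char *existing)
{
	const char *slash = strrchr(path, '/');
	size_t dirlen, i;

	if (!slash)
		return 0;	/* no directory part to fold */
	dirlen = (size_t)(slash - path) + 1;
	if (strlen(existing) < dirlen)
		return 0;
	for (i = 0; i < dirlen; i++)
		if (tolower((unsigned char)path[i]) !=
		    tolower((unsigned char)existing[i]))
			return 0;
	memcpy(path, existing, dirlen);
	return 1;
}

/* Hypothetical self-check, not part of the patch. */
static void fold_selfcheck(void)
{
	char added[] = "mydir/aBc/fileb.txt";

	/* The first file added decides the stored spelling of the directory. */
	assert(fold_dir_case(added, "MyDir/ABC/filea.txt") == 1);
	assert(strcmp(added, "MyDir/ABC/fileb.txt") == 0);
}
```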
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
read-cache.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index f1f789b..b3954d5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -608,6 +608,22 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
ce->ce_mode = ce_mode_from_stat(ent, st_mode);
}
+ if (ignore_case) {
+ const char *startPtr = ce->name;
+ const char *ptr = startPtr;
+ while (*ptr) {
+ while (*ptr && *ptr != '/')
+ ++ptr;
+ if (*ptr == '/') {
+ struct cache_entry *foundce;
+ ++ptr;
+ foundce = index_name_exists(&the_index, ce->name, ptr - ce->name, ignore_case);
+ if (foundce)
+ memcpy((void*)startPtr, foundce->name + (startPtr - ce->name), ptr - startPtr);
+ }
+ }
+ }
+
alias = index_name_exists(istate, ce->name, ce_namelen(ce), ignore_case);
if (alias && !ce_stage(alias) && !ie_match_stat(istate, alias, st, ce_option)) {
/* Nothing changed, really */
--
1.7.1.1930.gca7dd4
* [PATCH 7/7] Support case folding in git fast-import when core.ignorecase=true
From: Joshua Jensen @ 2010-05-21 4:50 UTC (permalink / raw)
To: git; +Cc: Joshua Jensen
When core.ignorecase=true, imported file paths will be folded to match
existing directory case.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
---
fast-import.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/fast-import.c b/fast-import.c
index 352e2e3..2279276 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -156,6 +156,7 @@ Format of STDIN stream:
#include "csum-file.h"
#include "quote.h"
#include "exec_cmd.h"
+#include "dir.h"
#define PACK_ID_BITS 16
#define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
@@ -1459,7 +1460,7 @@ static int tree_content_set(
for (i = 0; i < t->entry_count; i++) {
e = t->entries[i];
- if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {
+ if (e->name->str_len == n && !strncmp_icase(p, e->name->str_dat, n)) {
if (!slash1) {
if (!S_ISDIR(mode)
&& e->versions[1].mode == mode
--
1.7.1.1930.gca7dd4
* Re: [PATCH 0/7] Various updates to make core.ignorecase=true work better
From: Joshua Jensen @ 2010-05-26 18:26 UTC (permalink / raw)
To: git@vger.kernel.org
----- Original Message -----
From: Joshua Jensen
Date: 5/20/2010 10:50 PM
> Joshua Jensen (7):
> Add string comparison functions that respect the ignore_case
> variable.
> Case insensitivity support for .gitignore via core.ignorecase
> Add case insensitivity support for directories when using git status
> Add case insensitivity support when using git ls-files
> Add support for case insensitive directory and file lookups to git
> log
> Support case folding for git add when core.ignorecase=true
> Support case folding in git fast-import when core.ignorecase=true
Would this patch series be better sent to the msysGit mailing list,
given that it addresses Windows (and Mac OS X, I suppose) case
preserving but case insensitive file system issues? I posted here
first, because it builds on some core.ignorecase functionality Linus wrote.
Thanks.
Josh