* [PATCH RFC v2 0/4] Patches to avoid reporting conversion changes.
@ 2010-04-25 16:29 Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 1/4] sha1_file: Added index_blob() Henrik Grubbström (Grubba)
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-25 16:29 UTC
To: git; +Cc: Henrik Grubbström
This is the second attempt at having the git index keep track of the
conversion mode and the corresponding normalized blob sha1 for files.
Since v1, the diff behaviour patch has been dropped, and with it the
need to store the normalized blob, so the gc patch is gone as well.
Some basic test cases have been added.
Thanks to Junio C Hamano for the suggestion and some of the tests.
Henrik Grubbström (Grubba) (4):
sha1_file: Added index_blob().
cache: Added ce_norm_sha1() and related cache_entry fields.
cache: Added index extension "NORM".
t/t0021: Test that conversion changes are detected.
cache.h | 22 ++++++++++++++++
convert.c | 31 +++++++++++++++++++++++
read-cache.c | 66 +++++++++++++++++++++++++++++++++++++++++++------
sha1_file.c | 19 ++++++++++++++
t/t0021-conversion.sh | 50 +++++++++++++++++++++++++++++++++++++
5 files changed, 180 insertions(+), 8 deletions(-)
* [PATCH RFC v2 1/4] sha1_file: Added index_blob().
2010-04-25 16:29 [PATCH RFC v2 0/4] Patches to avoid reporting conversion changes Henrik Grubbström (Grubba)
@ 2010-04-25 16:29 ` Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 2/4] cache: Added ce_norm_sha1() and related cache_entry fields Henrik Grubbström (Grubba)
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-25 16:29 UTC
To: git; +Cc: Henrik Grubbström
When conversion attributes have changed, it
is useful to be able to easily reconvert an
existing blob.
Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
No changes since v1.
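For readers unfamiliar with what "reconverting" a blob amounts to in the
CRLF case, here is a minimal sketch. It is not the code path index_blob()
takes (that goes through index_mem() and the real conversion machinery);
sketch_crlf_to_lf() is a hypothetical helper written only for illustration,
and it handles nothing but the "drop CR before LF" rule.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the CRLF part of a to-git conversion
 * (hypothetical helper, not part of this series): copy src to dst,
 * dropping every CR that is immediately followed by LF.
 * dst must have room for len bytes. Returns the number of bytes written.
 */
static size_t sketch_crlf_to_lf(const char *src, size_t len, char *dst)
{
	size_t i, out = 0;

	assert(src && dst);
	for (i = 0; i < len; i++) {
		if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
			continue;	/* drop the CR of a CRLF pair */
		dst[out++] = src[i];
	}
	return out;
}
```

Hashing the output of such a conversion instead of the stored blob is what
yields a different sha1 once the conversion attributes have changed.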
cache.h | 1 +
sha1_file.c | 19 +++++++++++++++++++
2 files changed, 20 insertions(+), 0 deletions(-)
diff --git a/cache.h b/cache.h
index 5eb0573..1fe2d7d 100644
--- a/cache.h
+++ b/cache.h
@@ -494,6 +494,7 @@ extern int ie_match_stat(const struct index_state *, struct cache_entry *, struc
extern int ie_modified(const struct index_state *, struct cache_entry *, struct stat *, unsigned int);
extern int ce_path_match(const struct cache_entry *ce, const char **pathspec);
+extern int index_blob(unsigned char *dst_sha1, const unsigned char *src_sha1, int write_object, const char *path);
extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path);
extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object);
extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
diff --git a/sha1_file.c b/sha1_file.c
index ff65328..c162321 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2434,6 +2434,25 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
#define SMALL_FILE_SIZE (32*1024)
+int index_blob(unsigned char *dst_sha1, const unsigned char *src_sha1,
+ int write_object, const char *path)
+{
+ void *buf;
+ unsigned long buflen = 0;
+ int ret;
+
+ memcpy(dst_sha1, src_sha1, 20);
+ buf = read_object_with_reference(src_sha1, typename(OBJ_BLOB),
+ &buflen, dst_sha1);
+ if (!buf)
+ return 0;
+
+ ret = index_mem(dst_sha1, buf, buflen, write_object, OBJ_BLOB, path);
+ free(buf);
+
+ return ret;
+}
+
int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
enum object_type type, const char *path)
{
--
1.7.0.4.369.g81e89
* [PATCH RFC v2 2/4] cache: Added ce_norm_sha1() and related cache_entry fields.
2010-04-25 16:29 [PATCH RFC v2 0/4] Patches to avoid reporting conversion changes Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 1/4] sha1_file: Added index_blob() Henrik Grubbström (Grubba)
@ 2010-04-25 16:29 ` Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 3/4] cache: Added index extension "NORM" Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 4/4] t/t0021: Test that conversion changes are detected Henrik Grubbström (Grubba)
3 siblings, 0 replies; 5+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-25 16:29 UTC
To: git; +Cc: Henrik Grubbström
The index now keeps track of the conversion mode that was active
when the entry was created. This can be used to detect the most
common cases where the conversion mode has changed.
Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
git_norm_flags has been extended with one flag (NORM_CONV_CRLF_WT)
to be able to keep track of the working tree state as well as the
repository state.
git_norm_flags() now takes account of the auto_crlf state.
ce_match_stat_basic() now knows that a normalization change may
affect the working tree file size.
Updating of the normalization state is now done in ce_compare_data().
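To make the flag combinations easier to review, here is a standalone
sketch of the decision logic in git_norm_flags(). The attribute lookup is
replaced by plain parameters, the sketch_* names are hypothetical, and the
enum values are local stand-ins. It also assumes auto_crlf is encoded as
0 (off), 1 (true) or -1 (input).

```c
#include <assert.h>

/* Flag values as added to cache.h by this patch. */
#define NORM_CONV_CRLF_GIT	0x0001
#define NORM_CONV_CRLF_WT	0x0002
#define NORM_CONV_CRLF_GUESS	0x0004
#define NORM_CONV_IDENT		0x0008
#define NORM_CONV_FILT		0x0010

/* Local stand-ins for the crlf attribute states (illustrative values). */
enum sketch_crlf {
	SKETCH_CRLF_GUESS = -1,
	SKETCH_CRLF_BINARY,
	SKETCH_CRLF_TEXT,
	SKETCH_CRLF_INPUT
};

/*
 * Same branching as git_norm_flags(), with the attribute checks
 * already resolved into crlf, ident and has_filter by the caller.
 */
static unsigned int sketch_norm_flags(int auto_crlf, enum sketch_crlf crlf,
				      int ident, int has_filter)
{
	unsigned int ret = 0;

	/* assumed encoding: 0 = off, 1 = true, -1 = input */
	assert(auto_crlf >= -1 && auto_crlf <= 1);

	if (auto_crlf && crlf != SKETCH_CRLF_BINARY) {
		ret |= NORM_CONV_CRLF_GIT;	/* CRLF -> LF on the way in */
		if (crlf != SKETCH_CRLF_INPUT && auto_crlf > 0)
			ret |= NORM_CONV_CRLF_WT;	/* LF -> CRLF on checkout */
		if (crlf == SKETCH_CRLF_GUESS)
			ret |= NORM_CONV_CRLF_GUESS;
	}
	if (ident)
		ret |= NORM_CONV_IDENT;
	if (has_filter)
		ret |= NORM_CONV_FILT;
	return ret;
}
```

Under these assumptions, autocrlf=true with no crlf attribute sets all
three CRLF flags, while autocrlf=input leaves NORM_CONV_CRLF_WT clear,
which is what lets the working tree state be told apart from the
repository state.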
cache.h | 14 ++++++++++++++
convert.c | 31 +++++++++++++++++++++++++++++++
read-cache.c | 17 +++++++++++++++--
3 files changed, 60 insertions(+), 2 deletions(-)
diff --git a/cache.h b/cache.h
index 1fe2d7d..3e70bef 100644
--- a/cache.h
+++ b/cache.h
@@ -151,10 +151,18 @@ struct cache_entry {
unsigned int ce_size;
unsigned int ce_flags;
unsigned char sha1[20];
+ unsigned int norm_flags;
+ unsigned char norm_sha1[20];
struct cache_entry *next;
char name[FLEX_ARRAY]; /* more */
};
+#define NORM_CONV_CRLF_GIT 0x0001
+#define NORM_CONV_CRLF_WT 0x0002
+#define NORM_CONV_CRLF_GUESS 0x0004
+#define NORM_CONV_IDENT 0x0008
+#define NORM_CONV_FILT 0x0010
+
#define CE_NAMEMASK (0x0fff)
#define CE_STAGEMASK (0x3000)
#define CE_EXTENDED (0x4000)
@@ -278,6 +286,11 @@ static inline int ce_to_dtype(const struct cache_entry *ce)
else
return DT_UNKNOWN;
}
+static inline unsigned char *ce_norm_sha1(struct cache_entry *ce)
+{
+ return ce->norm_flags?ce->norm_sha1:ce->sha1;
+}
+
#define canon_mode(mode) \
(S_ISREG(mode) ? (S_IFREG | ce_permissions(mode)) : \
S_ISLNK(mode) ? S_IFLNK : S_ISDIR(mode) ? S_IFDIR : S_IFGITLINK)
@@ -1014,6 +1027,7 @@ extern void trace_argv_printf(const char **argv, const char *format, ...);
/* convert.c */
/* returns 1 if *dst was used */
+extern unsigned int git_norm_flags(const char *path);
extern int convert_to_git(const char *path, const char *src, size_t len,
struct strbuf *dst, enum safe_crlf checksafe);
extern int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst);
diff --git a/convert.c b/convert.c
index 4f8fcb7..5f36669 100644
--- a/convert.c
+++ b/convert.c
@@ -568,6 +568,37 @@ static int git_path_check_ident(const char *path, struct git_attr_check *check)
return !!ATTR_TRUE(value);
}
+unsigned int git_norm_flags(const char *path)
+{
+ struct git_attr_check check[3];
+ int crlf = CRLF_GUESS;
+ int ident = 0;
+ unsigned ret = 0;
+ struct convert_driver *drv = NULL;
+
+ setup_convert_check(check);
+ if (!git_checkattr(path, ARRAY_SIZE(check), check)) {
+ crlf = git_path_check_crlf(path, check + 0);
+ ident = git_path_check_ident(path, check + 1);
+ drv = git_path_check_convert(path, check + 2);
+ }
+
+ if (auto_crlf && (crlf != CRLF_BINARY)) {
+ ret |= NORM_CONV_CRLF_GIT;
+ if (crlf != CRLF_INPUT && auto_crlf > 0)
+ ret |= NORM_CONV_CRLF_WT;
+ if (crlf == CRLF_GUESS)
+ ret |= NORM_CONV_CRLF_GUESS;
+ }
+ if (ident) {
+ ret |= NORM_CONV_IDENT;
+ }
+ if (drv) {
+ ret |= NORM_CONV_FILT;
+ }
+ return ret;
+}
+
int convert_to_git(const char *path, const char *src, size_t len,
struct strbuf *dst, enum safe_crlf checksafe)
{
diff --git a/read-cache.c b/read-cache.c
index f1f789b..1a698bf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -88,12 +88,20 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
static int ce_compare_data(struct cache_entry *ce, struct stat *st)
{
int match = -1;
- int fd = open(ce->name, O_RDONLY);
+ int fd;
+ unsigned int norm_flags = git_norm_flags(ce->name);
+ if (norm_flags != ce->norm_flags) {
+ ce->norm_flags = norm_flags;
+ if (norm_flags)
+ index_blob(ce->norm_sha1, ce->sha1, 0, ce->name);
+ }
+
+ fd = open(ce->name, O_RDONLY);
if (fd >= 0) {
unsigned char sha1[20];
if (!index_fd(sha1, fd, st, 0, OBJ_BLOB, ce->name))
- match = hashcmp(sha1, ce->sha1);
+ match = hashcmp(sha1, ce_norm_sha1(ce));
/* index_fd() closed the file descriptor already */
}
return match;
@@ -227,6 +235,11 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
changed |= INODE_CHANGED;
#endif
+ /* ce_size can not be trusted if the conversion mode has changed. */
+ if ((ce->ce_mode & S_IFMT) == S_IFREG &&
+ ce->norm_flags != git_norm_flags(ce->name))
+ return changed;
+
if (ce->ce_size != (unsigned int) st->st_size)
changed |= DATA_CHANGED;
--
1.7.0.4.369.g81e89
* [PATCH RFC v2 3/4] cache: Added index extension "NORM".
2010-04-25 16:29 [PATCH RFC v2 0/4] Patches to avoid reporting conversion changes Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 1/4] sha1_file: Added index_blob() Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 2/4] cache: Added ce_norm_sha1() and related cache_entry fields Henrik Grubbström (Grubba)
@ 2010-04-25 16:29 ` Henrik Grubbström (Grubba)
2010-04-25 16:29 ` [PATCH RFC v2 4/4] t/t0021: Test that conversion changes are detected Henrik Grubbström (Grubba)
3 siblings, 0 replies; 5+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-25 16:29 UTC
To: git; +Cc: Henrik Grubbström
The index can now store and retrieve the ce_norm_sha1 data.
Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
Unchanged since v1.
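As an aid to reviewing the on-disk layout, here is a sketch of one "NORM"
record being encoded and decoded field by field in network byte order.
The patch itself writes struct ondisk_norm_sha1 directly; the sketch_*
names and the fixed 32-byte record size are assumptions of this
illustration (three 32-bit fields plus a 20-byte SHA-1, no padding).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NORM_RECORD_SIZE 32	/* 3 * 4 bytes + 20-byte SHA-1 */

struct sketch_norm_record {
	uint32_t entry_no;
	uint32_t norm_flags;
	uint32_t norm_size;
	unsigned char norm_sha1[20];
};

static void sketch_put_be32(unsigned char *p, uint32_t v)
{
	p[0] = v >> 24; p[1] = v >> 16; p[2] = v >> 8; p[3] = v;
}

static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Same packing as the CACHE_EXT() macro in read-cache.c. */
static uint32_t sketch_ext_signature(const char *s)
{
	return sketch_get_be32((const unsigned char *)s);
}

/* Write one record into out, which must hold SKETCH_NORM_RECORD_SIZE bytes. */
static void sketch_norm_encode(unsigned char *out,
			       const struct sketch_norm_record *r)
{
	assert(out && r);
	sketch_put_be32(out, r->entry_no);
	sketch_put_be32(out + 4, r->norm_flags);
	sketch_put_be32(out + 8, r->norm_size);
	memcpy(out + 12, r->norm_sha1, 20);
}

/* Returns 0 on success, -1 if fewer than one full record remains. */
static int sketch_norm_decode(struct sketch_norm_record *r,
			      const unsigned char *in, size_t sz)
{
	if (sz < SKETCH_NORM_RECORD_SIZE)
		return -1;
	r->entry_no = sketch_get_be32(in);
	r->norm_flags = sketch_get_be32(in + 4);
	r->norm_size = sketch_get_be32(in + 8);
	memcpy(r->norm_sha1, in + 12, 20);
	return 0;
}
```

A reader would call sketch_norm_decode() in a loop while at least one
full record remains, much like the while (sz >= sizeof(*data)) loop in
norm_sha1_read(), and skip records whose entry_no is out of range.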
cache.h | 7 +++++++
read-cache.c | 49 +++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/cache.h b/cache.h
index 3e70bef..9aa031b 100644
--- a/cache.h
+++ b/cache.h
@@ -157,6 +157,13 @@ struct cache_entry {
char name[FLEX_ARRAY]; /* more */
};
+struct ondisk_norm_sha1 {
+ unsigned int entry_no;
+ unsigned int norm_flags;
+ unsigned int norm_size;
+ unsigned char norm_sha1[20];
+};
+
#define NORM_CONV_CRLF_GIT 0x0001
#define NORM_CONV_CRLF_WT 0x0002
#define NORM_CONV_CRLF_GUESS 0x0004
diff --git a/read-cache.c b/read-cache.c
index 1a698bf..5abb59d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -27,6 +27,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int reall
#define CACHE_EXT(s) ( (s[0]<<24)|(s[1]<<16)|(s[2]<<8)|(s[3]) )
#define CACHE_EXT_TREE 0x54524545 /* "TREE" */
#define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */
+#define CACHE_EXT_NORM_SHA1 0x4e4f524d /* "NORM" */
struct index_state the_index;
@@ -1191,6 +1192,21 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
return 0;
}
+static int norm_sha1_read(struct cache_entry **cache, unsigned int entries,
+ const struct ondisk_norm_sha1 *data, unsigned long sz)
+{
+ while (sz >= sizeof(*data)) {
+ unsigned int entry_no = ntohl(data->entry_no);
+ if (entry_no < entries) {
+ cache[entry_no]->norm_flags = ntohl(data->norm_flags);
+ memcpy(cache[entry_no]->norm_sha1, data->norm_sha1, 20);
+ }
+ sz -= sizeof(*data);
+ data++;
+ }
+ return 0;
+}
+
static int read_index_extension(struct index_state *istate,
const char *ext, void *data, unsigned long sz)
{
@@ -1201,6 +1217,9 @@ static int read_index_extension(struct index_state *istate,
case CACHE_EXT_RESOLVE_UNDO:
istate->resolve_undo = resolve_undo_read(data, sz);
break;
+ case CACHE_EXT_NORM_SHA1:
+ return norm_sha1_read(istate->cache, istate->cache_nr, data, sz);
+ break;
default:
if (*ext < 'A' || 'Z' < *ext)
return error("index uses %.4s extension, which we do not understand",
@@ -1524,6 +1543,16 @@ static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
}
}
+static void norm_sha1_write(struct strbuf *sb, const struct cache_entry *ce,
+ int entry_no)
+{
+ struct ondisk_norm_sha1 entry;
+ entry.entry_no = htonl(entry_no);
+ entry.norm_flags = htonl(ce->norm_flags);
+ memcpy(entry.norm_sha1, ce->norm_sha1, 20);
+ strbuf_add(sb, &entry, sizeof(entry));
+}
+
static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce)
{
int size = ondisk_ce_size(ce);
@@ -1559,10 +1588,11 @@ int write_index(struct index_state *istate, int newfd)
{
git_SHA_CTX c;
struct cache_header hdr;
- int i, err, removed, extended;
+ int i, j, err, removed, extended;
struct cache_entry **cache = istate->cache;
int entries = istate->cache_nr;
struct stat st;
+ struct strbuf sb = STRBUF_INIT;
for (i = removed = extended = 0; i < entries; i++) {
if (cache[i]->ce_flags & CE_REMOVE)
@@ -1585,7 +1615,7 @@ int write_index(struct index_state *istate, int newfd)
if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)
return -1;
- for (i = 0; i < entries; i++) {
+ for (i = j = 0; i < entries; i++) {
struct cache_entry *ce = cache[i];
if (ce->ce_flags & CE_REMOVE)
continue;
@@ -1593,12 +1623,21 @@ int write_index(struct index_state *istate, int newfd)
ce_smudge_racily_clean_entry(ce);
if (ce_write_entry(&c, newfd, ce) < 0)
return -1;
+ if (ce->norm_flags)
+ norm_sha1_write(&sb, ce, j);
+ j++;
}
/* Write extension data here */
+ if (sb.len) {
+ err = write_index_ext_header(&c, newfd, CACHE_EXT_NORM_SHA1,
+ sb.len) < 0
+ || ce_write(&c, newfd, sb.buf, sb.len) < 0;
+ strbuf_release(&sb);
+ if (err)
+ return -1;
+ }
if (istate->cache_tree) {
- struct strbuf sb = STRBUF_INIT;
-
cache_tree_write(&sb, istate->cache_tree);
err = write_index_ext_header(&c, newfd, CACHE_EXT_TREE, sb.len) < 0
|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
@@ -1607,8 +1646,6 @@ int write_index(struct index_state *istate, int newfd)
return -1;
}
if (istate->resolve_undo) {
- struct strbuf sb = STRBUF_INIT;
-
resolve_undo_write(&sb, istate->resolve_undo);
err = write_index_ext_header(&c, newfd, CACHE_EXT_RESOLVE_UNDO,
sb.len) < 0
--
1.7.0.4.369.g81e89
* [PATCH RFC v2 4/4] t/t0021: Test that conversion changes are detected.
2010-04-25 16:29 [PATCH RFC v2 0/4] Patches to avoid reporting conversion changes Henrik Grubbström (Grubba)
` (2 preceding siblings ...)
2010-04-25 16:29 ` [PATCH RFC v2 3/4] cache: Added index extension "NORM" Henrik Grubbström (Grubba)
@ 2010-04-25 16:29 ` Henrik Grubbström (Grubba)
3 siblings, 0 replies; 5+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-25 16:29 UTC
To: git; +Cc: Henrik Grubbström
Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
Thanks to Junio C Hamano for some of the tests.
t/t0021-conversion.sh | 50 +++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 50 insertions(+), 0 deletions(-)
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 6cb8d60..b6de203 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -89,4 +89,54 @@ test_expect_success expanded_in_repo '
cmp expanded-keywords expected-output
'
+# Check that files that have had their canonical representation
+# changed since being checked in aren't reported as modified
+# directly after being checked out.
+test_expect_success keywords_not_modified '
+ {
+ echo "File with foreign keywords"
+ echo "\$Id\$"
+ echo "\$Id: NoTerminatingSymbol"
+ echo "\$Id: Foreign Commit With Spaces \$"
+ echo "\$Id: GitCommitId \$"
+ echo "\$Id: NoTerminatingSymbolAtEOF"
+ } > expanded-keywords2 &&
+
+ git add expanded-keywords2 &&
+ git commit -m "File with keywords expanded" &&
+
+ echo "expanded-keywords2 ident" >> .gitattributes &&
+
+ rm -f expanded-keywords2 &&
+ git checkout -- expanded-keywords2 &&
+
+ test "x`git status --porcelain -- expanded-keywords2`" = x
+'
+
+# Test detection of CRLF conversion changes CRLF ==> LF.
+test_expect_success crlf_conversion_change_crlf_to_lf '
+ # step 0. a blob with CRLF
+ git init one && cd one &&
+ echo -e "a quick brown fox\015" >kuzu &&
+ git add kuzu && git commit -m kuzu &&
+ # step 1. you want CRLF in work area, LF in repository
+ git config core.autocrlf true &&
+ # step 2. user edit and revert.
+ touch kuzu &&
+ git update-index --refresh
+'
+
+# Test detection of CRLF conversion changes LF ==> CRLF.
+test_expect_success crlf_conversion_change_lf_to_crlf '
+ # step 0 & 1. a project with LF ending
+ git init two && cd two &&
+ echo a quick brown fox >kuzu &&
+ git add kuzu && git commit -m kuzu &&
+ # step 2. you want CRLF in your work area
+ echo -e "a quick brown fox\015" >kuzu &&
+ git config core.autocrlf true &&
+ # step 3. oops, refresh
+ git update-index --refresh
+'
+
test_done
--
1.7.0.4.369.g81e89