* 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree @ 2012-10-09 9:03 Johannes Sixt 2012-10-09 9:38 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-09 9:03 UTC (permalink / raw) To: Git Mailing List Running 'git grep needle origin/master' on Windows gives numerous warnings of the kind warning: unable to access 'origin/master:Documentation/.gitattributes': Invalid argument It is worrysome that it is attempted to access a file whose name is prefixed by a revision. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 9:03 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Johannes Sixt @ 2012-10-09 9:38 ` Nguyen Thai Ngoc Duy 2012-10-09 12:01 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-09 9:38 UTC (permalink / raw) To: Johannes Sixt; +Cc: Git Mailing List On Tue, Oct 9, 2012 at 4:03 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Running 'git grep needle origin/master' on Windows gives numerous warnings > of the kind > > warning: unable to access 'origin/master:Documentation/.gitattributes': > Invalid argument strace confirms it. Stack trace #0 read_attr_from_file (path=0x820e818 "HEAD:Documentation/.gitattributes", macro_ok=0) at attr.c:351 #1 0x080d378d in read_attr (path=0x820e818 "HEAD:Documentation/.gitattributes", macro_ok=0) at attr.c:436 #2 0x080d3bf1 in prepare_attr_stack (path=0x820e7f0 "HEAD:Documentation/.gitattributes") at attr.c:630 #3 0x080d3f68 in collect_all_attrs (path=0x820e7f0 "HEAD:Documentation/.gitattributes") at attr.c:747 #4 0x080d3ffd in git_check_attr (path=0x820e7f0 "HEAD:Documentation/.gitattributes", num=1, check=0xbfffd848) at attr.c:761 #5 0x0815e736 in userdiff_find_by_path (path=0x820e7f0 "HEAD:Documentation/.gitattributes") at userdiff.c:278 #6 0x081058ca in grep_source_load_driver (gs=0xbfffd978) at grep.c:1504 #7 0x08105907 in grep_source_is_binary (gs=0xbfffd978) at grep.c:1512 #8 0x08104eaa in grep_source_1 (opt=0xbfffe304, gs=0xbfffd978, collect_hits=0) at grep.c:1180 Never touched userdiff.c before. Anybody? -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 9:38 ` Nguyen Thai Ngoc Duy @ 2012-10-09 12:01 ` Nguyen Thai Ngoc Duy 2012-10-09 12:41 ` Jeff King 0 siblings, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-09 12:01 UTC (permalink / raw) To: Johannes Sixt; +Cc: Git Mailing List On Tue, Oct 09, 2012 at 04:38:32PM +0700, Nguyen Thai Ngoc Duy wrote: > #5 0x0815e736 in userdiff_find_by_path (path=0x820e7f0 > "HEAD:Documentation/.gitattributes") at userdiff.c:278 > #6 0x081058ca in grep_source_load_driver (gs=0xbfffd978) at grep.c:1504 A bandage patch may look like this. But it does not solve the real problem: - we should be able to look up in-tree .gitattributes, not just ignore like this patch does - gs->name seems to be prepared for human consumption, not for accessing files. grep_file() with opt->relative is true can put quotes in gs->name, for example. I feel like we should make this function a callback and let git-grep deal with driver loading itself. -- 8< -- diff --git a/grep.c b/grep.c index edc7776..c5ed067 100644 --- a/grep.c +++ b/grep.c @@ -1501,7 +1501,8 @@ void grep_source_load_driver(struct grep_source *gs) return; grep_attr_lock(); - gs->driver = userdiff_find_by_path(gs->name); + if (gs->type == GREP_SOURCE_FILE) + gs->driver = userdiff_find_by_path(gs->name); if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); -- 8< -- -- Duy ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 12:01 ` Nguyen Thai Ngoc Duy @ 2012-10-09 12:41 ` Jeff King 2012-10-09 18:59 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Jeff King @ 2012-10-09 12:41 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Johannes Sixt, Git Mailing List On Tue, Oct 09, 2012 at 07:01:44PM +0700, Nguyen Thai Ngoc Duy wrote: > On Tue, Oct 09, 2012 at 04:38:32PM +0700, Nguyen Thai Ngoc Duy wrote: > > #5 0x0815e736 in userdiff_find_by_path (path=0x820e7f0 > > "HEAD:Documentation/.gitattributes") at userdiff.c:278 > > #6 0x081058ca in grep_source_load_driver (gs=0xbfffd978) at grep.c:1504 > > A bandage patch may look like this. But it does not solve the real > problem: > > - we should be able to look up in-tree .gitattributes, not just > ignore like this patch does > > - gs->name seems to be prepared for human consumption, not for > accessing files. grep_file() with opt->relative is true can put > quotes in gs->name, for example. Right. For the second, you would probably want gs->identifier in the case of GREP_SOURCE_FILE. But that identifier information is not available at all for GREP_SOURCE_SHA1, which is what is breaking the first point. > I feel like we should make this function a callback and let git-grep > deal with driver loading itself. I think we just need to have callers of grep_source_init provide us with the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is where the information is lost. Like this incomplete and untested patch, which should fix the quoting problem for GREP_SOURCE_FILE, but leave the sha1 bits broken (see the in-code comment). I'm traveling this week, so I doubt I'll have time to look for a few more days. If you want to work on it, please do. diff --git a/builtin/grep.c b/builtin/grep.c index 82530a6..be602dd 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -86,7 +86,7 @@ static void add_work(struct grep_opt *opt, enum grep_source_type type, static int skip_first_line; static void add_work(struct grep_opt *opt, enum grep_source_type type, - const char *name, const void *id) + const char *name, const char *path, const void *id) { grep_lock(); @@ -94,7 +94,7 @@ static void add_work(struct grep_opt *opt, enum grep_source_type type, pthread_cond_wait(&cond_write, &grep_mutex); } - grep_source_init(&todo[todo_end].source, type, name, id); + grep_source_init(&todo[todo_end].source, type, name, path, id); if (opt->binary != GREP_BINARY_TEXT) grep_source_load_driver(&todo[todo_end].source); todo[todo_end].done = 0; @@ -378,14 +378,21 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, if (opt->relative && opt->prefix_length) { quote_path_relative(filename + tree_name_len, -1, &pathbuf, opt->prefix); + /* XXX Why do we insert here instead of just putting it in + * first? */ strbuf_insert(&pathbuf, 0, filename, tree_name_len); } else { strbuf_addstr(&pathbuf, filename); } + /* XXX We seem to get all kinds of junk via the filename field here, + * including partial filenames, sha1:path, etc. We could parse it + * ourselves, but that is probably insanity. We should ask the + * caller to break it down more for us. For now, just pass NULL. */ + #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, NULL, sha1); strbuf_release(&pathbuf); return 0; } else @@ -394,7 +401,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, NULL, sha1); strbuf_release(&pathbuf); hit = grep_source(opt, &gs); @@ -414,7 +421,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_FILE, buf.buf, filename); + add_work(opt, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); return 0; } else @@ -423,7 +430,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename); + grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); hit = grep_source(opt, &gs); diff --git a/grep.c b/grep.c index edc7776..06bc1c6 100644 --- a/grep.c +++ b/grep.c @@ -1373,7 +1373,7 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) struct grep_source gs; int r; - grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL); + grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL); gs.buf = buf; gs.size = size; @@ -1384,10 +1384,12 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type, } void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier) + const char *name, const char *path, + const void *identifier) { gs->type = type; gs->name = name ? xstrdup(name) : NULL; + gs->path = path ? xstrdup(path) : NULL; gs->buf = NULL; gs->size = 0; gs->driver = NULL; @@ -1409,6 +1411,8 @@ void grep_source_clear(struct grep_source *gs) { free(gs->name); gs->name = NULL; + free(gs->path); + gs->path = NULL; free(gs->identifier); gs->identifier = NULL; grep_source_clear_data(gs); diff --git a/grep.h b/grep.h index c256ac6..c2cf57b 100644 --- a/grep.h +++ b/grep.h @@ -158,11 +158,13 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type, char *buf; unsigned long size; + char *path; /* for attribute lookups */ struct userdiff_driver *driver; }; void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier); + const char *name, const char *path, + const void *identifier); void grep_source_clear_data(struct grep_source *gs); void grep_source_clear(struct grep_source *gs); void grep_source_load_driver(struct grep_source *gs); ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 12:41 ` Jeff King @ 2012-10-09 18:59 ` Junio C Hamano 2012-10-10 5:17 ` Nguyen Thai Ngoc Duy 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 0 siblings, 2 replies; 43+ messages in thread From: Junio C Hamano @ 2012-10-09 18:59 UTC (permalink / raw) To: Jeff King; +Cc: Nguyen Thai Ngoc Duy, Johannes Sixt, Git Mailing List Jeff King <peff@peff.net> writes: > I think we just need to have callers of grep_source_init provide us with > the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is > where the information is lost. Yes. I agree that is the right approach. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 18:59 ` Junio C Hamano @ 2012-10-10 5:17 ` Nguyen Thai Ngoc Duy 2012-10-10 5:33 ` Junio C Hamano 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 1 sibling, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-10 5:17 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Johannes Sixt, Git Mailing List On Wed, Oct 10, 2012 at 1:59 AM, Junio C Hamano <gitster@pobox.com> wrote: > Jeff King <peff@peff.net> writes: > >> I think we just need to have callers of grep_source_init provide us with >> the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is >> where the information is lost. > > Yes. I agree that is the right approach. Passing full path this way means prepare_attr_stack has to walk back to top directory for every files (even in the same directory). If .gitattributes are loaded while git-grep traverses the tree, then it can preload attr once per directory. But Jeff's approach looks simpler. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-10 5:17 ` Nguyen Thai Ngoc Duy @ 2012-10-10 5:33 ` Junio C Hamano 2012-10-10 5:45 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-10 5:33 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Jeff King, Johannes Sixt, Git Mailing List Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > On Wed, Oct 10, 2012 at 1:59 AM, Junio C Hamano <gitster@pobox.com> wrote: >> Jeff King <peff@peff.net> writes: >> >>> I think we just need to have callers of grep_source_init provide us with >>> the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is >>> where the information is lost. >> >> Yes. I agree that is the right approach. > > Passing full path this way means prepare_attr_stack has to walk back > to top directory for every files (even in the same directory). If > .gitattributes are loaded while git-grep traverses the tree, then it > can preload attr once per directory. But Jeff's approach looks > simpler. Why can't you do both? That is, to build a full path as you descend, and read per-directory .gitattributes as you go? Confused... ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-10 5:33 ` Junio C Hamano @ 2012-10-10 5:45 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-10 5:45 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Johannes Sixt, Git Mailing List On Wed, Oct 10, 2012 at 12:33 PM, Junio C Hamano <gitster@pobox.com> wrote: > Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > >> On Wed, Oct 10, 2012 at 1:59 AM, Junio C Hamano <gitster@pobox.com> wrote: >>> Jeff King <peff@peff.net> writes: >>> >>>> I think we just need to have callers of grep_source_init provide us with >>>> the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is >>>> where the information is lost. >>> >>> Yes. I agree that is the right approach. >> >> Passing full path this way means prepare_attr_stack has to walk back >> to top directory for every files (even in the same directory). If >> .gitattributes are loaded while git-grep traverses the tree, then it >> can preload attr once per directory. But Jeff's approach looks >> simpler. > > Why can't you do both? That is, to build a full path as you > descend, and read per-directory .gitattributes as you go? Hm... I need to check attr.c code but I think it means read .gitattributes and prepare attr stack in builtin/grep.c (where we traverse trees) and actually check the attribute in grep_source_load_driver(), much further down the call stack. I'm not sure how we can pass the prepared attr stack down to grep_source_load_driver(). -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-09 18:59 ` Junio C Hamano 2012-10-10 5:17 ` Nguyen Thai Ngoc Duy @ 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy ` (3 more replies) 1 sibling, 4 replies; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy On Wed, Oct 10, 2012 at 1:59 AM, Junio C Hamano <gitster@pobox.com> wrote: > Jeff King <peff@peff.net> writes: > >> I think we just need to have callers of grep_source_init provide us with >> the actual pathname (or NULL, in the case of GREP_SOURCE_BUF). That is >> where the information is lost. > > Yes. I agree that is the right approach. Here we go. No additional tests as I don't know how to write them. But strace shows it's ok. Currently we don't have many options to optimize attr access. We cannot prepare a attr stack in advance for a specific directory, then check attributes many times using the stack. We cannot prepare attr stack from a tree object. As a result a normal "grep pattern --" results in a lot of duplicate failed open(".../.gitattributes") calls. Jeff King (1): grep: pass true path name to grep machinery Nguyễn Thái Ngọc Duy (2): quote: let caller reset buffer for quote_path_relative() grep: stop looking at random places for .gitattributes Documentation/git-grep.txt | 7 +++++-- builtin/clean.c | 2 ++ builtin/grep.c | 19 ++++++++++++------- builtin/ls-files.c | 1 + grep.c | 11 ++++++++--- grep.h | 4 +++- quote.c | 1 - wt-status.c | 1 + 8 files changed, 32 insertions(+), 14 deletions(-) -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 21:13 ` Junio C Hamano 2012-10-10 11:34 ` [PATCH 2/3] grep: pass true path name to grep machinery Nguyễn Thái Ngọc Duy ` (2 subsequent siblings) 3 siblings, 1 reply; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy quote_path_relative() resetting output buffer is sometimes unnecessary as the buffer has never been used, and some other times makes it harder for the caller to use (see builtin/grep.c, the caller has to insert a string after quote_path_relative) Move the buffer reset back to call sites when necessary. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- The answer for Jeff's XXX in his patch, why strbuf_insert() instead of just adding in advance. builtin/clean.c | 2 ++ builtin/grep.c | 2 +- builtin/ls-files.c | 1 + quote.c | 1 - wt-status.c | 1 + 5 files changed, 5 insertions(+), 2 deletions(-) diff --git a/builtin/clean.c b/builtin/clean.c index 69c1cda..ff633cc 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -149,6 +149,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) if (S_ISDIR(st.st_mode)) { strbuf_addstr(&directory, ent->name); + strbuf_reset(&buf); qname = quote_path_relative(directory.buf, directory.len, &buf, prefix); if (show_only && (remove_directories || (matches == MATCHED_EXACTLY))) { @@ -171,6 +172,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) } else { if (pathspec && !matches) continue; + strbuf_reset(&buf); qname = quote_path_relative(ent->name, -1, &buf, prefix); if (show_only) { printf(_("Would remove %s\n"), qname); diff --git a/builtin/grep.c b/builtin/grep.c index 82530a6..377c904 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -376,9 +376,9 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct strbuf pathbuf = STRBUF_INIT; if (opt->relative && opt->prefix_length) { + strbuf_add(&pathbuf, filename, tree_name_len); quote_path_relative(filename + tree_name_len, -1, &pathbuf, opt->prefix); - strbuf_insert(&pathbuf, 0, filename, tree_name_len); } else { strbuf_addstr(&pathbuf, filename); } diff --git a/builtin/ls-files.c b/builtin/ls-files.c index b5434af..1e0cae9 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -395,6 +395,7 @@ int report_path_error(const char *ps_matched, const char **pathspec, const char if (found_dup) continue; + strbuf_reset(&sb); name = quote_path_relative(pathspec[num], -1, &sb, prefix); error("pathspec '%s' did not match any file(s) known to git.", name); diff --git a/quote.c b/quote.c index 911229f..7e23ba9 100644 --- a/quote.c +++ b/quote.c @@ -381,7 +381,6 @@ char *quote_path_relative(const char *in, int len, { struct strbuf sb = STRBUF_INIT; const char *rel = path_relative(in, len, &sb, prefix, -1); - strbuf_reset(out); quote_c_style_counted(rel, strlen(rel), out, NULL, 0); strbuf_release(&sb); diff --git a/wt-status.c b/wt-status.c index 2a9658b..be8b600 100644 --- a/wt-status.c +++ b/wt-status.c @@ -690,6 +690,7 @@ static void wt_status_print_other(struct wt_status *s, struct string_list_item *it; const char *path; it = &(l->items[i]); + strbuf_reset(&buf); path = quote_path(it->string, strlen(it->string), &buf, s->prefix); if (column_active(s->colopts)) { -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() 2012-10-10 11:34 ` [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy @ 2012-10-10 21:13 ` Junio C Hamano 2012-10-11 13:04 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-10 21:13 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King, Johannes Sixt Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > quote_path_relative() resetting output buffer is sometimes unnecessary > as the buffer has never been used, and some other times makes it > harder for the caller to use (see builtin/grep.c, the caller has to > insert a string after quote_path_relative) > > Move the buffer reset back to call sites when necessary. > > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> > --- > The answer for Jeff's XXX in his patch, why strbuf_insert() instead > of just adding in advance. This sounds a lot stronger than "let" to me. All existing callers that assumed that buf to be emptied by the function now has to clear it. "quote: stop resetting output buffer of quote_path_relative()" may better describe what this really does. How should this interact with the logic in the called function that used to say "if we ended up returning an empty string because the path is the same as the base, we should give ./ back", and what should the return value of this function be? To answer these questions, you must define the meaning of the string in the output buffer that already exists when the function is called. If the caller did this: strbuf_addstr(&out, "The path relative to your HOME is: "); quote_path_relative(path, pathlen, &out, "/home/pclouds/"); then the answers are "We still need to add ./ but !out->len is no longer a good test to decide" and "It should point at the first byte of what we added, not out->buf". But if the caller did this instead: srcdir = "/src/"; strbuf_addstr(&dst, "/dst/"); quote_path_relative(path, pathlen, &dst, srcdir); printf("cp '%s' '%s'\n", path, dst->buf); then it is nonsensical to add "./" when out->len is not zero when the function returns. So what does it mean to have an existing string in the output buffer when calling the function? Is it supposed to be a path to a directory, or just a general uninterpreted string (e.g. a message)? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() 2012-10-10 21:13 ` Junio C Hamano @ 2012-10-11 13:04 ` Nguyen Thai Ngoc Duy 2012-10-11 16:42 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-11 13:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King, Johannes Sixt On Thu, Oct 11, 2012 at 4:13 AM, Junio C Hamano <gitster@pobox.com> wrote: > This sounds a lot stronger than "let" to me. All existing callers > that assumed that buf to be emptied by the function now has to clear > it. "quote: stop resetting output buffer of quote_path_relative()" > may better describe what this really does. > > How should this interact with the logic in the called function that > used to say "if we ended up returning an empty string because the > path is the same as the base, we should give ./ back", and what > should the return value of this function be? > ... > So what does it mean to have an existing string in the output buffer > when calling the function? Is it supposed to be a path to a > directory, or just a general uninterpreted string (e.g. a message)? I prefer the KISS principle in this case: do not interpret existing string. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() 2012-10-11 13:04 ` Nguyen Thai Ngoc Duy @ 2012-10-11 16:42 ` Junio C Hamano 0 siblings, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2012-10-11 16:42 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git, Jeff King, Johannes Sixt Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > On Thu, Oct 11, 2012 at 4:13 AM, Junio C Hamano <gitster@pobox.com> wrote: >> This sounds a lot stronger than "let" to me. All existing callers >> that assumed that buf to be emptied by the function now has to clear >> it. "quote: stop resetting output buffer of quote_path_relative()" >> may better describe what this really does. >> >> How should this interact with the logic in the called function that >> used to say "if we ended up returning an empty string because the >> path is the same as the base, we should give ./ back", and what >> should the return value of this function be? >> ... >> So what does it mean to have an existing string in the output buffer >> when calling the function? Is it supposed to be a path to a >> directory, or just a general uninterpreted string (e.g. a message)? > > I prefer the KISS principle in this case: do not interpret existing string. Sorry, I do not quite understand what you mean. The original function is about turning one path into a path that is relative to the other path (e.g. make the result usable for chdir() when your cwd() is the other path to name that path), and because you do not want to end up with an empty path, it adds "./" to the result when the two paths are equal. That is simple and stupid, in the sense that you can explain in simple terms what it does even to a stupid person and make him understand. If a patch changes one aspect of the implementation of the function, but keeps other parts intact without thinking the ramifications of doing so through, and ends up giving the end result unnecessarily complex semantics, the _patch_ might be simple and stupid, but the end result is no longer simple. That is why I asked what it means to "append" and asked you to write it down for people who may need to decide if this function is an appropriate thing to call for their new code in the future. You didn't answer that question, so I have to ask again. What is the existing string this function appends its result to? - If it is a leading path (in other words, you are transplanting a hierarchy to somewhere else --- see and (re)read "But if the caller did this instead" part from the message you are responding to), "because you do not want to end up with an empty path, it adds "./" to the result when the two paths are equal" does not make sense at all. - If it is a message that describes the resulting relative path (the use case above that "transplant" example in the same message), it needs to add "./" to the result, but the logic to determine it now needs to take the length of what was in the buffer when the function was called into account. - Is there a third possibility? Whatever your answer is, please make sure the next person who wants to call this function from his code does not have to ask "Why does it append "./" when out->len is zero? Shouldn't the condition be when *rel is an empty string?" Thanks. ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 2/3] grep: pass true path name to grep machinery 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 3/3] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Nguyễn Thái Ngọc Duy 3 siblings, 0 replies; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy From: Jeff King <peff@peff.net> Having path name (if available) may be useful. For example, if grep machinery needs to look up git attributes, it'll need a good path name. grep_source->name is not good because it's for display purpose only. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- builtin/grep.c | 17 +++++++++++------ grep.c | 8 ++++++-- grep.h | 4 +++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 377c904..0211e35 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -86,7 +86,7 @@ static pthread_cond_t cond_result; static int skip_first_line; static void add_work(struct grep_opt *opt, enum grep_source_type type, - const char *name, const void *id) + const char *name, const char *path, const void *id) { grep_lock(); @@ -94,7 +94,7 @@ static void add_work(struct grep_opt *opt, enum grep_source_type type, pthread_cond_wait(&cond_write, &grep_mutex); } - grep_source_init(&todo[todo_end].source, type, name, id); + grep_source_init(&todo[todo_end].source, type, name, path, id); if (opt->binary != GREP_BINARY_TEXT) grep_source_load_driver(&todo[todo_end].source); todo[todo_end].done = 0; @@ -383,9 +383,14 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, strbuf_addstr(&pathbuf, filename); } + /* XXX We seem to get all kinds of junk via the filename field here, + * including partial filenames, sha1:path, etc. We could parse it + * ourselves, but that is probably insanity. We should ask the + * caller to break it down more for us. For now, just pass NULL. */ + #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, NULL, sha1); strbuf_release(&pathbuf); return 0; } else @@ -394,7 +399,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, NULL, sha1); strbuf_release(&pathbuf); hit = grep_source(opt, &gs); @@ -414,7 +419,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_FILE, buf.buf, filename); + add_work(opt, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); return 0; } else @@ -423,7 +428,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename); + grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); hit = grep_source(opt, &gs); diff --git a/grep.c b/grep.c index edc7776..06bc1c6 100644 --- a/grep.c +++ b/grep.c @@ -1373,7 +1373,7 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) struct grep_source gs; int r; - grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL); + grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL); gs.buf = buf; gs.size = size; @@ -1384,10 +1384,12 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) } void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier) + const char *name, const char *path, + const void *identifier) { gs->type = type; gs->name = name ? xstrdup(name) : NULL; + gs->path = path ? xstrdup(path) : NULL; gs->buf = NULL; gs->size = 0; gs->driver = NULL; @@ -1409,6 +1411,8 @@ void grep_source_clear(struct grep_source *gs) { free(gs->name); gs->name = NULL; + free(gs->path); + gs->path = NULL; free(gs->identifier); gs->identifier = NULL; grep_source_clear_data(gs); diff --git a/grep.h b/grep.h index c256ac6..c2cf57b 100644 --- a/grep.h +++ b/grep.h @@ -158,11 +158,13 @@ struct grep_source { char *buf; unsigned long size; + char *path; /* for attribute lookups */ struct userdiff_driver *driver; }; void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier); + const char *name, const char *path, + const void *identifier); void grep_source_clear_data(struct grep_source *gs); void grep_source_clear(struct grep_source *gs); void grep_source_load_driver(struct grep_source *gs); -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 2/3] grep: pass true path name to grep machinery Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:51 ` Johannes Sixt 2012-10-10 13:59 ` [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Nguyễn Thái Ngọc Duy 3 siblings, 1 reply; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 11:34 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy grep searches for .gitattributes using "name" field in struct grep_source but that field is not real on-disk path name. For example, "grep pattern rev" fills the field with "rev:path", which is non-existent usually until somebody exploits it to drive git away. attr does not support looking up .gitattributes from a tree object. Making "git grep pattern <rev>" support .gitattributes could be a big work. Just note in document what we support for now. The document changes in this patch are to be reverted once support is in place. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- Documentation/git-grep.txt | 7 +++++-- grep.c | 3 ++- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index cfecf84..a4c66ee 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -86,7 +86,9 @@ OPTIONS files. -I:: - Don't match the pattern in binary files. + Don't match the pattern in binary files. Note that binary + detection via .gitattributes only works with searching files + in working directory. --max-depth <depth>:: For each <pathspec> given on command line, descend at most <depth> @@ -189,7 +191,8 @@ OPTIONS the match, unless the matching line is a function name itself. The name is determined in the same way as 'git diff' works out patch hunk headers (see 'Defining a custom hunk-header' in - linkgit:gitattributes[5]). + linkgit:gitattributes[5]). Note that .gitattributes are only + support for searching files in working directory. -<num>:: -C <num>:: diff --git a/grep.c b/grep.c index 06bc1c6..e36c01b 100644 --- a/grep.c +++ b/grep.c @@ -1505,7 +1505,8 @@ void grep_source_load_driver(struct grep_source *gs) return; grep_attr_lock(); - gs->driver = userdiff_find_by_path(gs->name); + if (gs->path) + gs->driver = userdiff_find_by_path(gs->path); if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 11:34 ` [PATCH 3/3] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy @ 2012-10-10 11:51 ` Johannes Sixt 2012-10-10 12:03 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-10 11:51 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Jeff King Thanks for working on this issue! Am 10/10/2012 13:34, schrieb Nguyễn Thái Ngọc Duy: > + linkgit:gitattributes[5]). Note that .gitattributes are only > + support for searching files in working directory. Does this mean it does not work for 'git grep --cached'? That would be a real loss. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 11:51 ` Johannes Sixt @ 2012-10-10 12:03 ` Nguyen Thai Ngoc Duy 2012-10-10 12:12 ` Johannes Sixt 0 siblings, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-10 12:03 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Junio C Hamano, Jeff King On Wed, Oct 10, 2012 at 6:51 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Thanks for working on this issue! > > Am 10/10/2012 13:34, schrieb Nguyễn Thái Ngọc Duy: >> + linkgit:gitattributes[5]). Note that .gitattributes are only >> + support for searching files in working directory. > > Does this mean it does not work for 'git grep --cached'? That would be a > real loss. I missed this case. I don't think it works correctly before as it reads worktree's .gitattributes instead of the index version. But at least we support reading .gitattributes from index. Patches coming soon. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 12:03 ` Nguyen Thai Ngoc Duy @ 2012-10-10 12:12 ` Johannes Sixt 2012-10-10 12:32 ` Nguyen Thai Ngoc Duy 2012-10-10 19:44 ` Junio C Hamano 0 siblings, 2 replies; 43+ messages in thread From: Johannes Sixt @ 2012-10-10 12:12 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git, Junio C Hamano, Jeff King Am 10/10/2012 14:03, schrieb Nguyen Thai Ngoc Duy: > On Wed, Oct 10, 2012 at 6:51 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: >> Thanks for working on this issue! >> >> Am 10/10/2012 13:34, schrieb Nguyễn Thái Ngọc Duy: >>> + linkgit:gitattributes[5]). Note that .gitattributes are only >>> + support for searching files in working directory. >> >> Does this mean it does not work for 'git grep --cached'? That would be a >> real loss. > > I missed this case. I don't think it works correctly before as it > reads worktree's .gitattributes instead of the index version. But at > least we support reading .gitattributes from index. Patches coming > soon. Is there already an established definition which the "correct" .gitattributes are? IIRC, everywhere else we are looking at the .gitattributes in the worktree, regardless of whether the object at the path in question is in the worktree, the index, or in an old commit. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 12:12 ` Johannes Sixt @ 2012-10-10 12:32 ` Nguyen Thai Ngoc Duy 2012-10-10 12:43 ` Johannes Sixt 2012-10-10 19:44 ` Junio C Hamano 1 sibling, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-10 12:32 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Junio C Hamano, Jeff King On Wed, Oct 10, 2012 at 7:12 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Is there already an established definition which the "correct" > .gitattributes are? If I ask to grep the index then to me it should read only the index. Although other people can counter that they may want different attributes than the one stored in index, which either comes from worktree or $GIT_DIR. > IIRC, everywhere else we are looking at the > .gitattributes in the worktree, regardless of whether the object at the > path in question is in the worktree, the index, or in an old commit. git-archive has --worktree-attributes to specify where attributes come from. Sparse checkout can choose to read index version first then worktree's or the other way around. All normal operations read worktree version, if not found index version. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 12:32 ` Nguyen Thai Ngoc Duy @ 2012-10-10 12:43 ` Johannes Sixt 2012-10-10 12:51 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-10 12:43 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git, Junio C Hamano, Jeff King Am 10/10/2012 14:32, schrieb Nguyen Thai Ngoc Duy: > git-archive has --worktree-attributes to specify where attributes come > from. Sparse checkout can choose to read index version first then > worktree's or the other way around. All normal operations read > worktree version, if not found index version. So, even if I run 'git diff master~2000 master~1999', it uses the attributes from the worktree, and if not found from the index? (I would not mind, BTW.) -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 12:43 ` Johannes Sixt @ 2012-10-10 12:51 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-10 12:51 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Junio C Hamano, Jeff King On Wed, Oct 10, 2012 at 7:43 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Am 10/10/2012 14:32, schrieb Nguyen Thai Ngoc Duy: >> git-archive has --worktree-attributes to specify where attributes come >> from. Sparse checkout can choose to read index version first then >> worktree's or the other way around. All normal operations read >> worktree version, if not found index version. > > So, even if I run 'git diff master~2000 master~1999', it uses the > attributes from the worktree, and if not found from the index? (I would > not mind, BTW.) Yes. Tested and verified. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 12:12 ` Johannes Sixt 2012-10-10 12:32 ` Nguyen Thai Ngoc Duy @ 2012-10-10 19:44 ` Junio C Hamano 2012-10-11 5:55 ` Johannes Sixt 1 sibling, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-10 19:44 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyen Thai Ngoc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > Is there already an established definition which the "correct" > .gitattributes are? IIRC, everywhere else we are looking at the > .gitattributes in the worktree, regardless of whether the object at the > path in question is in the worktree, the index, or in an old commit. No, and it is deliberately kept vague while waiting for us to come up with a clear definition of what is "correct". We could declare, from a purist's point of view, that the attribute should be taken from the same place as the path in question is taken from. When running "git add foo.c", we grab the contents of "foo.c" from the working tree, so ".gitignore" from the working tree should be applied when dealing with "foo.c". Similarly, the contents of blob "foo.c" that "git checkout foo.c" reads from the index would get attributes from ".gitignore" in the index (to find what its smudging semantics is) before it gets written out to the working tree. "git diff A B" may give the attributes from tree A to the preimage side while using the attributes from tree B to the postimage side. But the last example has some practical issues. Very often, people retroactively define attributes to correct earlier mistakes. If an older tree A forgot to declare that a path mybank.gnucash is a GnuCash ledger file, while a newer tree B (and the current checkout that is even newer) does [*1*], it is more useful to apply the newer definition from .gitattributes to both trees in practice (and in practice, you are much less likely to have a check-out of ancient tree while running "git diff A B" to compare two trees that are newer than the current check-out). Using the file from the working tree is the best approximation of "we want to use the newer one", both from the semantics (i.e. you are likely to have fresher tree checked out) and the performance (i.e. reading from files in the working tree is far more trivial than reading from historical trees) point of view. So it is not so cut-and-dried that "take the attributes from the same place" is a good and "correct" definition [*2*]. [Footnote] *1* GnuCash writes, by default, a gzip compressed xml file, so I have in my .gitattributes file *.gnucash filter=gnucash and then in my .git/config [filter "gnucash"] clean = gzip -dc smudge = gzip -c This allows "git diff" to work reasonably well (if you do not mind reading diff between two versions of xml files, that is) and also helps delta compression when packing the repository. *2* Besides, the attributes are primarily used to define the semantics about the contents in question. If one file is of "gnucash" kind (i.e. has "filter=gnucash" attribute in the previous example) in one tree, and the path is of a different kind (e.g. "filter=ooo" that says "this is an Ooo file"), it is very likely that it does not even make sense, with or without content filtering, to compare them with "git diff", so "take the attributes from the same place" would have to imply "if the attributes do not match, say something similar to 'Binary files differ'", which is just as useless as applying one attribute taken from a convenient but random place (i.e. the working tree). ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-10 19:44 ` Junio C Hamano @ 2012-10-11 5:55 ` Johannes Sixt 2012-10-11 7:04 ` Michael Haggerty 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-11 5:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git, Jeff King Am 10/10/2012 21:44, schrieb Junio C Hamano: > Johannes Sixt <j.sixt@viscovery.net> writes: > >> Is there already an established definition which the "correct" >> .gitattributes are? > > No, and it is deliberately kept vague while waiting for us to come > up with a clear definition of what is "correct". ... > Very often, people > retroactively define attributes to correct earlier mistakes. Absolutely. I have Windows resource files that are Shift-JIS encoded checked in long ago, and I want to retoactively declare them with "encoding=Shift-JIS" because I prefer to see Japanese script in gitk rather than gibberish. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-11 5:55 ` Johannes Sixt @ 2012-10-11 7:04 ` Michael Haggerty 2012-10-11 8:17 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 43+ messages in thread From: Michael Haggerty @ 2012-10-11 7:04 UTC (permalink / raw) To: Johannes Sixt; +Cc: Junio C Hamano, Nguyen Thai Ngoc Duy, git, Jeff King On 10/11/2012 07:55 AM, Johannes Sixt wrote: > Am 10/10/2012 21:44, schrieb Junio C Hamano: >> Johannes Sixt <j.sixt@viscovery.net> writes: >> >>> Is there already an established definition which the "correct" >>> .gitattributes are? >> >> No, and it is deliberately kept vague while waiting for us to come >> up with a clear definition of what is "correct". > ... >> Very often, people >> retroactively define attributes to correct earlier mistakes. > > Absolutely. I have Windows resource files that are Shift-JIS encoded > checked in long ago, and I want to retoactively declare them with > "encoding=Shift-JIS" because I prefer to see Japanese script in gitk > rather than gibberish. Maybe I'm being too much of a purist, but I don't think that git should retroactively reinterpret history on its own initiative in a way that might not be correct (e.g., maybe your encoding changed from ASCII to Shift-JIS sometime in the past). It would be more appropriate for this to happen only if explicitly requested by the user. For example, why don't you override the incorrect historical attributes via .git/info/attributes? Michael -- Michael Haggerty mhagger@alum.mit.edu http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] grep: stop looking at random places for .gitattributes 2012-10-11 7:04 ` Michael Haggerty @ 2012-10-11 8:17 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-11 8:17 UTC (permalink / raw) To: Michael Haggerty; +Cc: Johannes Sixt, Junio C Hamano, git, Jeff King On Thu, Oct 11, 2012 at 2:04 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > Maybe I'm being too much of a purist, but I don't think that git should > retroactively reinterpret history on its own initiative in a way that > might not be correct (e.g., maybe your encoding changed from ASCII to > Shift-JIS sometime in the past). It would be more appropriate for this > to happen only if explicitly requested by the user. For example, why > don't you override the incorrect historical attributes via > .git/info/attributes? I think git-notes is a more appropriate place to correct these things. If the incorrect commits are pruned, their notes can also be pruned. No poluttion in $GIT_DIR/info/attr.. And i'm your side, no checking worktree without user's permission. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy ` (2 preceding siblings ...) 2012-10-10 11:34 ` [PATCH 3/3] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 ` Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 1/2] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 2/2] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 3 siblings, 2 replies; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy Round too. .gitattributes should be respected in all cases except blob grepping. As Johannes pointed elsewhere in this thread, if "git diff <rev1> <rev2>" looks up worktree's .gitattributes (and none in rev1/rev2), there's no reason "git grep" should behave differently. Nguyễn Thái Ngọc Duy (2): quote: let caller reset buffer for quote_path_relative() grep: stop looking at random places for .gitattributes builtin/clean.c | 2 ++ builtin/grep.c | 24 +++++++++++++----------- builtin/ls-files.c | 1 + grep.c | 11 ++++++++--- grep.h | 4 +++- quote.c | 1 - t/t7008-grep-binary.sh | 22 ++++++++++++++++++++++ wt-status.c | 1 + 8 files changed, 50 insertions(+), 16 deletions(-) -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v2 1/2] quote: let caller reset buffer for quote_path_relative() 2012-10-10 13:59 ` [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 ` Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 2/2] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 1 sibling, 0 replies; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy quote_path_relative() resetting output buffer is sometimes unnecessary as the buffer has never been used, and some other times makes it harder for the caller to use (see builtin/grep.c, the caller has to insert a string after quote_path_relative) Move the buffer reset back to call sites when necessary. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- builtin/clean.c | 2 ++ builtin/grep.c | 2 +- builtin/ls-files.c | 1 + quote.c | 1 - wt-status.c | 1 + 5 files changed, 5 insertions(+), 2 deletions(-) diff --git a/builtin/clean.c b/builtin/clean.c index 69c1cda..ff633cc 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -149,6 +149,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) if (S_ISDIR(st.st_mode)) { strbuf_addstr(&directory, ent->name); + strbuf_reset(&buf); qname = quote_path_relative(directory.buf, directory.len, &buf, prefix); if (show_only && (remove_directories || (matches == MATCHED_EXACTLY))) { @@ -171,6 +172,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) } else { if (pathspec && !matches) continue; + strbuf_reset(&buf); qname = quote_path_relative(ent->name, -1, &buf, prefix); if (show_only) { printf(_("Would remove %s\n"), qname); diff --git a/builtin/grep.c b/builtin/grep.c index 82530a6..377c904 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -376,9 +376,9 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct strbuf pathbuf = STRBUF_INIT; if (opt->relative && opt->prefix_length) { + strbuf_add(&pathbuf, filename, tree_name_len); quote_path_relative(filename + tree_name_len, -1, &pathbuf, opt->prefix); - strbuf_insert(&pathbuf, 0, filename, tree_name_len); } else { strbuf_addstr(&pathbuf, filename); } diff --git a/builtin/ls-files.c b/builtin/ls-files.c index b5434af..1e0cae9 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -395,6 +395,7 @@ int report_path_error(const char *ps_matched, const char **pathspec, const char if (found_dup) continue; + strbuf_reset(&sb); name = quote_path_relative(pathspec[num], -1, &sb, prefix); error("pathspec '%s' did not match any file(s) known to git.", name); diff --git a/quote.c b/quote.c index 911229f..7e23ba9 100644 --- a/quote.c +++ b/quote.c @@ -381,7 +381,6 @@ char *quote_path_relative(const char *in, int len, { struct strbuf sb = STRBUF_INIT; const char *rel = path_relative(in, len, &sb, prefix, -1); - strbuf_reset(out); quote_c_style_counted(rel, strlen(rel), out, NULL, 0); strbuf_release(&sb); diff --git a/wt-status.c b/wt-status.c index 2a9658b..be8b600 100644 --- a/wt-status.c +++ b/wt-status.c @@ -690,6 +690,7 @@ static void wt_status_print_other(struct wt_status *s, struct string_list_item *it; const char *path; it = &(l->items[i]); + strbuf_reset(&buf); path = quote_path(it->string, strlen(it->string), &buf, s->prefix); if (column_active(s->colopts)) { -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-10 13:59 ` [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 1/2] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 ` Nguyễn Thái Ngọc Duy 2012-10-10 14:21 ` Johannes Sixt 2012-10-12 10:49 ` [PATCH v3] " Nguyễn Thái Ngọc Duy 1 sibling, 2 replies; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-10 13:59 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy grep searches for .gitattributes using "name" field in struct grep_source but that field is not real on-disk path name. For example, "grep pattern rev" fills the field with "rev:path", which is non-existent usually until somebody exploits it to drive git away. This patch passes real paths down to grep_source_load_driver(). Except grepping a blob, all other cases should have right paths down to grep_source_load_driver(). In other words, .gitattributes are still respected. Initial-work-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- builtin/grep.c | 22 ++++++++++++---------- grep.c | 11 ++++++++--- grep.h | 4 +++- t/t7008-grep-binary.sh | 22 ++++++++++++++++++++++ 4 files changed, 45 insertions(+), 14 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 377c904..f6c5ba2 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -86,7 +86,7 @@ static pthread_cond_t cond_result; static int skip_first_line; static void add_work(struct grep_opt *opt, enum grep_source_type type, - const char *name, const void *id) + const char *name, const char *path, const void *id) { grep_lock(); @@ -94,7 +94,7 @@ static void add_work(struct grep_opt *opt, enum grep_source_type type, pthread_cond_wait(&cond_write, &grep_mutex); } - grep_source_init(&todo[todo_end].source, type, name, id); + grep_source_init(&todo[todo_end].source, type, name, path, id); if (opt->binary != GREP_BINARY_TEXT) grep_source_load_driver(&todo[todo_end].source); todo[todo_end].done = 0; @@ -371,7 +371,8 @@ static void *lock_and_read_sha1_file(const unsigned char *sha1, enum object_type } static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, - const char *filename, int tree_name_len) + const char *filename, int tree_name_len, + const char *path) { struct strbuf pathbuf = STRBUF_INIT; @@ -385,7 +386,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, path, sha1); strbuf_release(&pathbuf); return 0; } else @@ -394,7 +395,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, path, sha1); strbuf_release(&pathbuf); hit = grep_source(opt, &gs); @@ -414,7 +415,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_FILE, buf.buf, filename); + add_work(opt, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); return 0; } else @@ -423,7 +424,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename); + grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); hit = grep_source(opt, &gs); @@ -479,7 +480,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int if (cached || (ce->ce_flags & CE_VALID) || ce_skip_worktree(ce)) { if (ce_stage(ce)) continue; - hit |= grep_sha1(opt, ce->sha1, ce->name, 0); + hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name); } else hit |= grep_file(opt, ce->name); @@ -518,7 +519,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, strbuf_add(base, entry.path, te_len); if (S_ISREG(entry.mode)) { - hit |= grep_sha1(opt, entry.sha1, base->buf, tn_len); + hit |= grep_sha1(opt, entry.sha1, base->buf, tn_len, + base->buf + tn_len); } else if (S_ISDIR(entry.mode)) { enum object_type type; @@ -548,7 +550,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, struct object *obj, const char *name) { if (obj->type == OBJ_BLOB) - return grep_sha1(opt, obj->sha1, name, 0); + return grep_sha1(opt, obj->sha1, name, 0, NULL); if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) { struct tree_desc tree; void *data; diff --git a/grep.c b/grep.c index edc7776..e36c01b 100644 --- a/grep.c +++ b/grep.c @@ -1373,7 +1373,7 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) struct grep_source gs; int r; - grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL); + grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL); gs.buf = buf; gs.size = size; @@ -1384,10 +1384,12 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) } void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier) + const char *name, const char *path, + const void *identifier) { gs->type = type; gs->name = name ? xstrdup(name) : NULL; + gs->path = path ? xstrdup(path) : NULL; gs->buf = NULL; gs->size = 0; gs->driver = NULL; @@ -1409,6 +1411,8 @@ void grep_source_clear(struct grep_source *gs) { free(gs->name); gs->name = NULL; + free(gs->path); + gs->path = NULL; free(gs->identifier); gs->identifier = NULL; grep_source_clear_data(gs); @@ -1501,7 +1505,8 @@ void grep_source_load_driver(struct grep_source *gs) return; grep_attr_lock(); - gs->driver = userdiff_find_by_path(gs->name); + if (gs->path) + gs->driver = userdiff_find_by_path(gs->path); if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); diff --git a/grep.h b/grep.h index c256ac6..c2cf57b 100644 --- a/grep.h +++ b/grep.h @@ -158,11 +158,13 @@ struct grep_source { char *buf; unsigned long size; + char *path; /* for attribute lookups */ struct userdiff_driver *driver; }; void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier); + const char *name, const char *path, + const void *identifier); void grep_source_clear_data(struct grep_source *gs); void grep_source_clear(struct grep_source *gs); void grep_source_load_driver(struct grep_source *gs); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index fd6410f..2c85a30 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -111,6 +111,28 @@ test_expect_success 'grep respects binary diff attribute' ' test_cmp expect actual ' +test_expect_success 'grep --cached respects binary diff attribute' ' + git grep --cached text t >actual && + test_cmp expect actual +' + +test_expect_success 'grep --cached respects binary diff attribute (2)' ' + git add .gitattributes && + rm .gitattributes && + git grep --cached text t >actual && + test_cmp expect actual && + git checkout .gitattributes && + git rm --cached .gitattributes +' + +test_expect_success 'grep tree respects binary diff attribute' ' + git commit -m new && + echo "Binary file HEAD:t matches" >expect && + git grep text HEAD -- t >actual && + test_cmp expect actual && + git reset HEAD^ +' + test_expect_success 'grep respects not-binary diff attribute' ' echo binQary | q_to_nul >b && git add b && -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-10 13:59 ` [PATCH v2 2/2] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy @ 2012-10-10 14:21 ` Johannes Sixt 2012-10-10 19:56 ` Junio C Hamano 2012-10-11 1:49 ` Nguyen Thai Ngoc Duy 2012-10-12 10:49 ` [PATCH v3] " Nguyễn Thái Ngọc Duy 1 sibling, 2 replies; 43+ messages in thread From: Johannes Sixt @ 2012-10-10 14:21 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Jeff King Am 10/10/2012 15:59, schrieb Nguyễn Thái Ngọc Duy: > This patch passes real paths down to grep_source_load_driver(). Except > grepping a blob, all other cases should have right paths down to ... grepping a blob or tree object... > grep_source_load_driver(). In other words, .gitattributes are still > respected. ... > +test_expect_success 'grep --cached respects binary diff attribute (2)' ' > + git add .gitattributes && > + rm .gitattributes && > + git grep --cached text t >actual && > + test_cmp expect actual && > + git checkout .gitattributes && > + git rm --cached .gitattributes > +' This should perhaps be test_when_finished "git rm --cached .gitattributes". > + > +test_expect_success 'grep tree respects binary diff attribute' ' I was confused by the word "tree" here. Isn't "pathspec" more correct? > + git commit -m new && > + echo "Binary file HEAD:t matches" >expect && > + git grep text HEAD -- t >actual && > + test_cmp expect actual && > + git reset HEAD^ > +' And in yet another test, should git grep text HEAD:t /not/ respect the binary attribute? At any rate, I don't observe the warnings anymore with this series. Thanks, -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-10 14:21 ` Johannes Sixt @ 2012-10-10 19:56 ` Junio C Hamano 2012-10-11 5:45 ` Johannes Sixt 2012-10-11 1:49 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-10 19:56 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > At any rate, I don't observe the warnings anymore with this series. What kind of warnings have you been getting? Earlier we had a bug in the jk/config-warn-on-inaccessible-paths series that made it warn when we tried to open a .gitattribute file and open() returned an error other than ENOENT. The bug was that we saw unnecessary errors when a directory that used to exist no longer exists in the working tree; we would instead get ENOTDIR in such a case that needs to be ignored. The problem was supposed to be "fixed" by 8e950da (attr: failure to open a ".gitattributes" file is OK with ENOTDIR, 2012-09-13); if you are still seeing the error, that error still may need to be addressed, regardless of Nguyễn's patch. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-10 19:56 ` Junio C Hamano @ 2012-10-11 5:45 ` Johannes Sixt 2012-10-11 15:51 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-11 5:45 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Am 10/10/2012 21:56, schrieb Junio C Hamano: > Johannes Sixt <j.sixt@viscovery.net> writes: > >> At any rate, I don't observe the warnings anymore with this series. > > What kind of warnings have you been getting? Earlier we had a bug > in the jk/config-warn-on-inaccessible-paths series that made it warn > when we tried to open a .gitattribute file and open() returned an > error other than ENOENT. The bug was that we saw unnecessary errors > when a directory that used to exist no longer exists in the working > tree; we would instead get ENOTDIR in such a case that needs to be > ignored. I saw EINVAL errors when 'git grep pattern rev' was run on Windows. The reason is that the code attempted to access "rev:dir/.gitattributes" in the worktree, which is an invalid path on Windows due to the colon. The lack of this warning indicates that the attempts to access these files are eliminated. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-11 5:45 ` Johannes Sixt @ 2012-10-11 15:51 ` Junio C Hamano 2012-10-12 7:33 ` Johannes Sixt 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-11 15:51 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > I saw EINVAL errors when 'git grep pattern rev' was run on Windows. The > reason is that the code attempted to access "rev:dir/.gitattributes" in > the worktree, which is an invalid path on Windows due to the colon. The > lack of this warning indicates that the attempts to access these files are > eliminated. It means that whenever we ask for attributes for a tracked file that is inside a directory whose name contains a colon, we would get the same error on Windows, no? Perhaps Windows may not let you create such a directory in the first place, but you may still get a repository of a cross platform project that contains such a path. What I am wondering is if we should do something similar to 8e950da (attr: failure to open a .gitattributes file is OK with ENOTDIR, 2012-09-13), at least on Windows, for EINVAL. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-11 15:51 ` Junio C Hamano @ 2012-10-12 7:33 ` Johannes Sixt 2012-10-14 4:29 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-12 7:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Am 10/11/2012 17:51, schrieb Junio C Hamano: > Johannes Sixt <j.sixt@viscovery.net> writes: > >> I saw EINVAL errors when 'git grep pattern rev' was run on Windows. The >> reason is that the code attempted to access "rev:dir/.gitattributes" in >> the worktree, which is an invalid path on Windows due to the colon. The >> lack of this warning indicates that the attempts to access these files are >> eliminated. > > It means that whenever we ask for attributes for a tracked file that > is inside a directory whose name contains a colon, we would get the > same error on Windows, no? Perhaps Windows may not let you create > such a directory in the first place, but you may still get a > repository of a cross platform project that contains such a path. Your assessment is correct. > What I am wondering is if we should do something similar to 8e950da > (attr: failure to open a .gitattributes file is OK with ENOTDIR, > 2012-09-13), at least on Windows, for EINVAL. It might be worth it. We already have a similar special case in write_or_die.c:maybe_flush_or_die() for Windows, although it is not about a colon in a path name. Perhaps like this. --- 8< --- From: Johannes Sixt <j6t@kdbg.org> Subject: [PATCH] attr: do not warn on path with colon on Windows In the same spirit as commit 8e950dab (attr: failure to open a .gitattributes file is OK with ENOTDIR), ignore EINVAL errors. On Windows, a path that contains a colon that is not the one after the drive letter is not allowed and is reported with errno set to EINVAL. Signed-off-by: Johannes Sixt <j6t@kdbg.org> --- attr.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 8010429..ac945ad 100644 --- a/attr.c +++ b/attr.c @@ -354,7 +354,15 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok) int lineno = 0; if (!fp) { - if (errno != ENOENT && errno != ENOTDIR) + /* + * If path does not exist, it is not worth mentioning, + * but I/O errors and permission errors are. + * + * On Windows, EINVAL is reported if path contains a colon + * that is not the driver letter separator. Such a path + * cannot exist in the file system and is also uninteresting. + */ + if (errno != ENOENT && errno != ENOTDIR && errno != EINVAL) warn_on_inaccessible(path); return NULL; } -- 1.8.0.rc2.1237.g5522246 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-12 7:33 ` Johannes Sixt @ 2012-10-14 4:29 ` Junio C Hamano 2012-10-15 6:02 ` Johannes Sixt 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-14 4:29 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > It might be worth it. We already have a similar special case in > write_or_die.c:maybe_flush_or_die() for Windows, although it is not about > a colon in a path name. > > Perhaps like this. Hrm, the "we already have" one b2f5e26 (Windows: Work around an oddity when a pipe with no reader is written to., 2007-08-17) was what you added while I was looking the other way ;-) as a part of Windows specific pull. That change, and this patch, seem to cover the cases to be ignored with a bit too wide a net to my taste. On other systems, and even on Windows if the path does not have any colon, EINVAL is something we would want to notice and report, as a potential problem, no? > --- 8< --- > From: Johannes Sixt <j6t@kdbg.org> > Subject: [PATCH] attr: do not warn on path with colon on Windows > > In the same spirit as commit 8e950dab (attr: failure to open a > .gitattributes file is OK with ENOTDIR), ignore EINVAL errors. On > Windows, a path that contains a colon that is not the one after the > drive letter is not allowed and is reported with errno set to > EINVAL. > > Signed-off-by: Johannes Sixt <j6t@kdbg.org> > --- > attr.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/attr.c b/attr.c > index 8010429..ac945ad 100644 > --- a/attr.c > +++ b/attr.c > @@ -354,7 +354,15 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok) > int lineno = 0; > > if (!fp) { > - if (errno != ENOENT && errno != ENOTDIR) > + /* > + * If path does not exist, it is not worth mentioning, > + * but I/O errors and permission errors are. > + * > + * On Windows, EINVAL is reported if path contains a colon > + * that is not the driver letter separator. Such a path > + * cannot exist in the file system and is also uninteresting. > + */ > + if (errno != ENOENT && errno != ENOTDIR && errno != EINVAL) > warn_on_inaccessible(path); > return NULL; > } ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-14 4:29 ` Junio C Hamano @ 2012-10-15 6:02 ` Johannes Sixt 2012-10-15 16:54 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-15 6:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Am 10/14/2012 6:29, schrieb Junio C Hamano: > Johannes Sixt <j.sixt@viscovery.net> writes: > >> It might be worth it. We already have a similar special case in >> write_or_die.c:maybe_flush_or_die() for Windows, although it is not about >> a colon in a path name. >> >> Perhaps like this. > > Hrm, the "we already have" one b2f5e26 (Windows: Work around an > oddity when a pipe with no reader is written to., 2007-08-17) was > what you added while I was looking the other way ;-) as a part of > Windows specific pull. > > That change, and this patch, seem to cover the cases to be ignored > with a bit too wide a net to my taste. On other systems, and even > on Windows if the path does not have any colon, EINVAL is something > we would want to noticbbe and report, as a potential problem, no? For fopen(), EINVAL should occur only if the mode argument is wrong, which it isn't. For fflush() (as in write_or_die.c), EINVAL is not even listed as possible error code. Therefore, catching EINVAL wholesale should not be a problem, IMO, at least not "on other systems". On Windows, it is more problematic because there is a table of "customary" Windows API error codes, which are mapped to errno values, and EINVAL is used for all other Windows error codes (and for a few listed ones), and ignoring EINVAL might indeed miss something worth to be reported. Sooo... I don't mind if you do not pick up this patch because it handles a rather theoretic case, i.e., where a project with strange paths somehow ended up on a Windows drive. But reverting the EINVAL check from write_or_die.c is out of question, because that handles a real case. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-15 6:02 ` Johannes Sixt @ 2012-10-15 16:54 ` Junio C Hamano 2012-10-16 6:39 ` Johannes Sixt 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2012-10-15 16:54 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > For fflush() (as in write_or_die.c), EINVAL is not even listed > as possible error code. Therefore, catching EINVAL wholesale should not be > a problem, IMO, at least not "on other systems". I see a disturbing gap between "EINVAL is not supposed to be even possible" and "therefore it is safe to ignore". Our usual attitude towards catching errors is to ignore only specific things that are expected to happen and known to be safe, e.g. the original code before that patch which special cased to ignore EPIPE with if (fflush(f)) { if (errno == EPIPE) exit(0); die_errno("write failure on '%s'", desc); } where we know we are often downstream of something else and diagnosing EPIPE as an error is actively wrong. Unless you assume that *no* other platform has the same glitch as Windows has with respect to fflush(f) returning EINVAL, or any other platform that may return EINVAL from fflush(f) has the exactly same glitch as Windows, ignoring EINVAL unconditionally everywhere is wrong. Perhaps the next broken platform may return EINVAL when it should return EIO or something. Ideally, that earlier workaround should have done a logica equivalent of: if (fflush(f)) { #ifdef WINDOWS /* * On Windows, EPIPE is returned only by the first write() * after the reading end has closed its handle; subsequent * write()s return EINVAL. */ if (errno == EINVAL && this is after a second write) errno = EPIPE; #endif if (errno == EPIPE) exit(0); die_errno("write failure on '%s'", desc); } and did so not in-line at the calling site but in a compat/ wrapper for fflush() to eliminate the need for the ifdef. > But reverting the EINVAL check from write_or_die.c is out of question, > because that handles a real case. I am not saying we should not deal with EINVAL on Windows at all. I am just saying ignoring EINVAL on other platforms is wrong, and these two shouldn't have been mutually incompatible goals. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-15 16:54 ` Junio C Hamano @ 2012-10-16 6:39 ` Johannes Sixt 2012-10-17 7:05 ` Johannes Sixt 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-16 6:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Am 10/15/2012 18:54, schrieb Junio C Hamano: > Ideally, that earlier workaround > should have done a logica equivalent of: > ... > and did so not in-line at the calling site but in a compat/ wrapper > for fflush() to eliminate the need for the ifdef. Fair enough. >> But reverting the EINVAL check from write_or_die.c is out of question, >> because that handles a real case. Correction: I can't reproduce the error messages that this was working around anymore in a brief test. I'll revert the check locally and see what happens. -- Hannes ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-16 6:39 ` Johannes Sixt @ 2012-10-17 7:05 ` Johannes Sixt 2012-10-17 7:33 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Johannes Sixt @ 2012-10-17 7:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Am 10/16/2012 8:39, schrieb Johannes Sixt: > Am 10/15/2012 18:54, schrieb Junio C Hamano: >> Ideally, that earlier workaround >> should have done a logica equivalent of: >> ... >> and did so not in-line at the calling site but in a compat/ wrapper >> for fflush() to eliminate the need for the ifdef. > > Fair enough. > >>> But reverting the EINVAL check from write_or_die.c is out of question, >>> because that handles a real case. > > Correction: I can't reproduce the error messages that this was working > around anymore in a brief test. I'll revert the check locally and see what > happens. The error is reproducible with this command on Windows: $ git rev-list HEAD | sed 1q 42b333d8bb8709dfc5783dd4c44bdb6012c2c17d fatal: write failure on 'stdout': Invalid argument Let's do as you suggested. --- 8< --- From: Johannes Sixt <j6t@kdbg.org> Subject: [PATCH] maybe_flush_or_die: move a too-loose Windows specific error check to compat Commit b2f5e268 (Windows: Work around an oddity when a pipe with no reader is written to) introduced a check for EINVAL after fflush() to fight spurious "Invalid argument" errors on Windows when a pipe was broken. But this check may hide real errors on systems that do not have the this odd behavior. Introduce an fflush wrapper in compat/mingw.* so that the treatment is only applied on Windows. Signed-off-by: Johannes Sixt <j6t@kdbg.org> --- compat/mingw.c | 22 ++++++++++++++++++++++ compat/mingw.h | 3 +++ write_or_die.c | 7 +------ 3 files changed, 26 insertions(+), 6 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index afc892d..4e63838 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -335,6 +335,28 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream) return freopen(filename, otype, stream); } +#undef fflush +int mingw_fflush(FILE *stream) +{ + int ret = fflush(stream); + + /* + * write() is used behind the scenes of stdio output functions. + * Since git code does not check for errors after each stdio write + * operation, it can happen that write() is called by a later + * stdio function even if an earlier write() call failed. In the + * case of a pipe whose readable end was closed, only the first + * call to write() reports EPIPE on Windows. Subsequent write() + * calls report EINVAL. It is impossible to notice whether this + * fflush invocation triggered such a case, therefore, we have to + * catch all EINVAL errors whole-sale. + */ + if (ret && errno == EINVAL) + errno = EPIPE; + + return ret; +} + /* * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC. * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch. diff --git a/compat/mingw.h b/compat/mingw.h index 61a6521..eeb08d1 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -185,6 +185,9 @@ FILE *mingw_fopen (const char *filename, const char *otype); FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream); #define freopen mingw_freopen +int mingw_fflush(FILE *stream); +#define fflush mingw_fflush + char *mingw_getcwd(char *pointer, int len); #define getcwd mingw_getcwd diff --git a/write_or_die.c b/write_or_die.c index d45b536..960f448 100644 --- a/write_or_die.c +++ b/write_or_die.c @@ -34,12 +34,7 @@ void maybe_flush_or_die(FILE *f, const char *desc) return; } if (fflush(f)) { - /* - * On Windows, EPIPE is returned only by the first write() - * after the reading end has closed its handle; subsequent - * write()s return EINVAL. - */ - if (errno == EPIPE || errno == EINVAL) + if (errno == EPIPE) exit(0); die_errno("write failure on '%s'", desc); } -- 1.8.0.rc2.1248.g3f5b13f ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-17 7:05 ` Johannes Sixt @ 2012-10-17 7:33 ` Junio C Hamano 0 siblings, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2012-10-17 7:33 UTC (permalink / raw) To: Johannes Sixt; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King Johannes Sixt <j.sixt@viscovery.net> writes: > diff --git a/compat/mingw.c b/compat/mingw.c > index afc892d..4e63838 100644 > --- a/compat/mingw.c > +++ b/compat/mingw.c > @@ -335,6 +335,28 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream) > return freopen(filename, otype, stream); > } > > +#undef fflush > +int mingw_fflush(FILE *stream) > +{ > + int ret = fflush(stream); The "#undef" above is a bit unfortunate. Whenever I see this construct I start to wonder "I know this is to disable our own #define we have elsewhere that renames fflush() to mingw_fflush(), but what happens if the system include implements fflush() as a macro?" A better organization might be - make "int mingw_fflush(FILE *);" declaration available to all the callers and to this part of the file; and - make "#define fflush(x) mingw_fflush(x)" macro visible when compiling the rest of the system, but make it invisible to the implementation of the emulation function. The latter implies that a function in the emulation layer, if it needs to fflush(), would explicitly call mingw_fflush(). I know you did this knowing that it is not an issue on your platform, and this file is only used on your platform anyway, so I do not think we should address such a reorganization right now, but it is something we may want to keep an eye on, as other people may later try to stub away a real macro imitating this part of the code. Thanks for following through. Sometimes discussions on our list result in participant feeling satisified with the conclusion without completing the last mile of producing and applying the patch, which I find only after a few month when I'm trawling the list archive for anything we missed. Now I'll have to do my part and queue this to my tree ;-) ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-10 14:21 ` Johannes Sixt 2012-10-10 19:56 ` Junio C Hamano @ 2012-10-11 1:49 ` Nguyen Thai Ngoc Duy 2012-10-11 3:15 ` Junio C Hamano 1 sibling, 1 reply; 43+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-11 1:49 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Junio C Hamano, Jeff King On Wed, Oct 10, 2012 at 9:21 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: >> + git commit -m new && >> + echo "Binary file HEAD:t matches" >expect && >> + git grep text HEAD -- t >actual && >> + test_cmp expect actual && >> + git reset HEAD^ >> +' > > And in yet another test, should > > git grep text HEAD:t > > /not/ respect the binary attribute? Gray area. Is it ok to do that without documenting it (i.e. common sense)? I have something in mind that could do that, but it also makes "git grep text HEAD^{tree}" not respect attributes. -- Duy ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2 2/2] grep: stop looking at random places for .gitattributes 2012-10-11 1:49 ` Nguyen Thai Ngoc Duy @ 2012-10-11 3:15 ` Junio C Hamano 0 siblings, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2012-10-11 3:15 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Johannes Sixt, git, Jeff King Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > On Wed, Oct 10, 2012 at 9:21 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: >>> + git commit -m new && >>> + echo "Binary file HEAD:t matches" >expect && >>> + git grep text HEAD -- t >actual && >>> + test_cmp expect actual && >>> + git reset HEAD^ >>> +' >> >> And in yet another test, should >> >> git grep text HEAD:t >> >> /not/ respect the binary attribute? > > Gray area. Is it ok to do that without documenting it (i.e. common > sense)? I have something in mind that could do that, but it also makes > "git grep text HEAD^{tree}" not respect attributes. Personally, I do not think HEAD:t is worth worrying about. We could use the get_sha1_with_context() to get "t" out of "HEAD:t", and we could even enhance get_sha1_with_context() to also preserve the value of what came before the colon, but that would mean that these three git grep text HEAD:t/t0200 git grep text $(git rev-parse HEAD:t/t0200) git grep text $(git rev-parse HEAD:t):t0200 would behave differently; only the first one has any chance of applying the correct set of ".gitattributes". All of them would be able to use the ".gitattributes" file contained in the tree object that corresponds to t/t0200 (if we updated attr.c to read from tree objects, that is), but the latter two would skip ".gitattributes" files from the top-level and "t" directories, still using the final fallback definition from $GIT_DIR/info/attributes file. If we have to draw a line somewhere, the saner place to draw it is to stop at git grep text HEAD -- t/t0200 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v3] grep: stop looking at random places for .gitattributes 2012-10-10 13:59 ` [PATCH v2 2/2] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 2012-10-10 14:21 ` Johannes Sixt @ 2012-10-12 10:49 ` Nguyễn Thái Ngọc Duy 2012-10-12 16:47 ` Junio C Hamano 1 sibling, 1 reply; 43+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-12 10:49 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Sixt, Nguyễn Thái Ngọc Duy grep searches for .gitattributes using "name" field in struct grep_source but that field is not real on-disk path name. For example, "grep pattern rev" fills the field with "rev:path", and Git looks for .gitattributes in the (non-existent but exploitable) path "rev:path" instead of "path". This patch passes real paths down to grep_source_load_driver() when: - grep on work tree - grep on the index - grep a commit (or a tag if it points to a commit) so that these cases look up .gitattributes at proper paths. .gitattributes lookup is disabled in all other cases. Initial-work-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- This fixes t7008 as Johannes commented and makes "git grep foo HEAD:t" not look up worktree's .gitattributes (in fact it does not look up anywhere). The quote_path_relative() patch is not required, it's an independent design issue. builtin/grep.c | 31 ++++++++++++++++++------------- grep.c | 11 ++++++++--- grep.h | 4 +++- t/t7008-grep-binary.sh | 22 ++++++++++++++++++++++ 4 files changed, 51 insertions(+), 17 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 82530a6..38a17eb 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -86,7 +86,7 @@ static pthread_cond_t cond_result; static int skip_first_line; static void add_work(struct grep_opt *opt, enum grep_source_type type, - const char *name, const void *id) + const char *name, const char *path, const void *id) { grep_lock(); @@ -94,7 +94,7 @@ static void add_work(struct grep_opt *opt, enum grep_source_type type, pthread_cond_wait(&cond_write, &grep_mutex); } - grep_source_init(&todo[todo_end].source, type, name, id); + grep_source_init(&todo[todo_end].source, type, name, path, id); if (opt->binary != GREP_BINARY_TEXT) grep_source_load_driver(&todo[todo_end].source); todo[todo_end].done = 0; @@ -371,7 +371,8 @@ static void *lock_and_read_sha1_file(const unsigned char *sha1, enum object_type } static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, - const char *filename, int tree_name_len) + const char *filename, int tree_name_len, + const char *path) { struct strbuf pathbuf = STRBUF_INIT; @@ -385,7 +386,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + add_work(opt, GREP_SOURCE_SHA1, pathbuf.buf, path, sha1); strbuf_release(&pathbuf); return 0; } else @@ -394,7 +395,7 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, sha1); + grep_source_init(&gs, GREP_SOURCE_SHA1, pathbuf.buf, path, sha1); strbuf_release(&pathbuf); hit = grep_source(opt, &gs); @@ -414,7 +415,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) #ifndef NO_PTHREADS if (use_threads) { - add_work(opt, GREP_SOURCE_FILE, buf.buf, filename); + add_work(opt, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); return 0; } else @@ -423,7 +424,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) struct grep_source gs; int hit; - grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename); + grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); hit = grep_source(opt, &gs); @@ -479,7 +480,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int if (cached || (ce->ce_flags & CE_VALID) || ce_skip_worktree(ce)) { if (ce_stage(ce)) continue; - hit |= grep_sha1(opt, ce->sha1, ce->name, 0); + hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name); } else hit |= grep_file(opt, ce->name); @@ -497,7 +498,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int } static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, - struct tree_desc *tree, struct strbuf *base, int tn_len) + struct tree_desc *tree, struct strbuf *base, int tn_len, + int check_attr) { int hit = 0; enum interesting match = entry_not_interesting; @@ -518,7 +520,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, strbuf_add(base, entry.path, te_len); if (S_ISREG(entry.mode)) { - hit |= grep_sha1(opt, entry.sha1, base->buf, tn_len); + hit |= grep_sha1(opt, entry.sha1, base->buf, tn_len, + check_attr ? base->buf + tn_len : NULL); } else if (S_ISDIR(entry.mode)) { enum object_type type; @@ -533,7 +536,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, strbuf_addch(base, '/'); init_tree_desc(&sub, data, size); - hit |= grep_tree(opt, pathspec, &sub, base, tn_len); + hit |= grep_tree(opt, pathspec, &sub, base, tn_len, + check_attr); free(data); } strbuf_setlen(base, old_baselen); @@ -548,7 +552,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, struct object *obj, const char *name) { if (obj->type == OBJ_BLOB) - return grep_sha1(opt, obj->sha1, name, 0); + return grep_sha1(opt, obj->sha1, name, 0, NULL); if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) { struct tree_desc tree; void *data; @@ -571,7 +575,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, strbuf_addch(&base, ':'); } init_tree_desc(&tree, data, size); - hit = grep_tree(opt, pathspec, &tree, &base, base.len); + hit = grep_tree(opt, pathspec, &tree, &base, base.len, + obj->type == OBJ_COMMIT); strbuf_release(&base); free(data); return hit; diff --git a/grep.c b/grep.c index edc7776..e36c01b 100644 --- a/grep.c +++ b/grep.c @@ -1373,7 +1373,7 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) struct grep_source gs; int r; - grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL); + grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL); gs.buf = buf; gs.size = size; @@ -1384,10 +1384,12 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) } void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier) + const char *name, const char *path, + const void *identifier) { gs->type = type; gs->name = name ? xstrdup(name) : NULL; + gs->path = path ? xstrdup(path) : NULL; gs->buf = NULL; gs->size = 0; gs->driver = NULL; @@ -1409,6 +1411,8 @@ void grep_source_clear(struct grep_source *gs) { free(gs->name); gs->name = NULL; + free(gs->path); + gs->path = NULL; free(gs->identifier); gs->identifier = NULL; grep_source_clear_data(gs); @@ -1501,7 +1505,8 @@ void grep_source_load_driver(struct grep_source *gs) return; grep_attr_lock(); - gs->driver = userdiff_find_by_path(gs->name); + if (gs->path) + gs->driver = userdiff_find_by_path(gs->path); if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); diff --git a/grep.h b/grep.h index c256ac6..c2cf57b 100644 --- a/grep.h +++ b/grep.h @@ -158,11 +158,13 @@ struct grep_source { char *buf; unsigned long size; + char *path; /* for attribute lookups */ struct userdiff_driver *driver; }; void grep_source_init(struct grep_source *gs, enum grep_source_type type, - const char *name, const void *identifier); + const char *name, const char *path, + const void *identifier); void grep_source_clear_data(struct grep_source *gs); void grep_source_clear(struct grep_source *gs); void grep_source_load_driver(struct grep_source *gs); diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh index fd6410f..26f8319 100755 --- a/t/t7008-grep-binary.sh +++ b/t/t7008-grep-binary.sh @@ -111,6 +111,28 @@ test_expect_success 'grep respects binary diff attribute' ' test_cmp expect actual ' +test_expect_success 'grep --cached respects binary diff attribute' ' + git grep --cached text t >actual && + test_cmp expect actual +' + +test_expect_success 'grep --cached respects binary diff attribute (2)' ' + git add .gitattributes && + rm .gitattributes && + git grep --cached text t >actual && + test_when_finished "git rm --cached .gitattributes" && + test_when_finished "git checkout .gitattributes" && + test_cmp expect actual +' + +test_expect_success 'grep revision respects binary diff attribute' ' + git commit -m new && + echo "Binary file HEAD:t matches" >expect && + git grep text HEAD -- t >actual && + test_when_finished "git reset HEAD^" && + test_cmp expect actual +' + test_expect_success 'grep respects not-binary diff attribute' ' echo binQary | q_to_nul >b && git add b && -- 1.7.12.1.406.g6ab07c4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v3] grep: stop looking at random places for .gitattributes 2012-10-12 10:49 ` [PATCH v3] " Nguyễn Thái Ngọc Duy @ 2012-10-12 16:47 ` Junio C Hamano 0 siblings, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2012-10-12 16:47 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King, Johannes Sixt Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > grep searches for .gitattributes using "name" field in struct > grep_source but that field is not real on-disk path name. For example, > "grep pattern rev" fills the field with "rev:path", and Git looks for > .gitattributes in the (non-existent but exploitable) path "rev:path" > instead of "path". > > This patch passes real paths down to grep_source_load_driver() when: > > - grep on work tree > - grep on the index > - grep a commit (or a tag if it points to a commit) > > so that these cases look up .gitattributes at proper paths. > .gitattributes lookup is disabled in all other cases. > > Initial-work-by: Jeff King <peff@peff.net> > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> > --- Looks sensible and straightforward. Thanks. ^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2012-10-17 7:33 UTC | newest] Thread overview: 43+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-09 9:03 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Johannes Sixt 2012-10-09 9:38 ` Nguyen Thai Ngoc Duy 2012-10-09 12:01 ` Nguyen Thai Ngoc Duy 2012-10-09 12:41 ` Jeff King 2012-10-09 18:59 ` Junio C Hamano 2012-10-10 5:17 ` Nguyen Thai Ngoc Duy 2012-10-10 5:33 ` Junio C Hamano 2012-10-10 5:45 ` Nguyen Thai Ngoc Duy 2012-10-10 11:34 ` Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 1/3] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy 2012-10-10 21:13 ` Junio C Hamano 2012-10-11 13:04 ` Nguyen Thai Ngoc Duy 2012-10-11 16:42 ` Junio C Hamano 2012-10-10 11:34 ` [PATCH 2/3] grep: pass true path name to grep machinery Nguyễn Thái Ngọc Duy 2012-10-10 11:34 ` [PATCH 3/3] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 2012-10-10 11:51 ` Johannes Sixt 2012-10-10 12:03 ` Nguyen Thai Ngoc Duy 2012-10-10 12:12 ` Johannes Sixt 2012-10-10 12:32 ` Nguyen Thai Ngoc Duy 2012-10-10 12:43 ` Johannes Sixt 2012-10-10 12:51 ` Nguyen Thai Ngoc Duy 2012-10-10 19:44 ` Junio C Hamano 2012-10-11 5:55 ` Johannes Sixt 2012-10-11 7:04 ` Michael Haggerty 2012-10-11 8:17 ` Nguyen Thai Ngoc Duy 2012-10-10 13:59 ` [PATCH v2 0/2] Re: 'git grep needle rev' attempts to access 'rev:.../.gitattributes' in the worktree Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 1/2] quote: let caller reset buffer for quote_path_relative() Nguyễn Thái Ngọc Duy 2012-10-10 13:59 ` [PATCH v2 2/2] grep: stop looking at random places for .gitattributes Nguyễn Thái Ngọc Duy 2012-10-10 14:21 ` Johannes Sixt 2012-10-10 19:56 ` Junio C Hamano 2012-10-11 5:45 ` Johannes Sixt 2012-10-11 15:51 ` Junio C Hamano 2012-10-12 7:33 ` Johannes Sixt 2012-10-14 4:29 ` Junio C Hamano 2012-10-15 6:02 ` Johannes Sixt 2012-10-15 16:54 ` Junio C Hamano 2012-10-16 6:39 ` Johannes Sixt 2012-10-17 7:05 ` Johannes Sixt 2012-10-17 7:33 ` Junio C Hamano 2012-10-11 1:49 ` Nguyen Thai Ngoc Duy 2012-10-11 3:15 ` Junio C Hamano 2012-10-12 10:49 ` [PATCH v3] " Nguyễn Thái Ngọc Duy 2012-10-12 16:47 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).