* [RFC 0/1] diff --check for indent-with-spaces style @ 2010-03-27 14:08 Chris Webb 2010-03-27 14:08 ` [RFC 1/1] Add new indent-with-tab whitespace check Chris Webb 0 siblings, 1 reply; 6+ messages in thread From: Chris Webb @ 2010-03-27 14:08 UTC (permalink / raw) To: git I've been preparing some patches against a project (qemu) which has an indent style of only spaces, i.e. tabs forbidden. I usually use diff --check to make sure I catch stray spaces when I'm working on tab-indented code, and although core.whitespace can be set to indent-with-non-tab to enforce the use of tabs, there's no corresponding 'indent-with-tab' rule to forbid them. The asymmetry bothered me, so I've added support for this to ws.c in the following small patch. Would this rule be more clearly described as indent-with-non-space instead? For completeness, I'd also like to support core.whitespace = indent-with-tab in git apply --whitespace=fix (although I don't use this myself). It's trivial to expand the tabs in ws_fix_copy(), but this would violate the assumption in the calling code that whitespace cleanup doesn't result in a dst string longer than src. I think the only sane way I can deal with this correctly is to properly replumb ws_fix_copy() and its callers to use a strbuf instead of a char *dst? cache.h | 1 + ws.c | 26 ++++++++++++++++++++------ 2 files changed, 21 insertions(+), 6 deletions(-) ^ permalink raw reply [flat|nested] 6+ messages in thread
* [RFC 1/1] Add new indent-with-tab whitespace check 2010-03-27 14:08 [RFC 0/1] diff --check for indent-with-spaces style Chris Webb @ 2010-03-27 14:08 ` Chris Webb 2010-03-31 16:23 ` Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Chris Webb @ 2010-03-27 14:08 UTC (permalink / raw) To: git The indent-with-tab rule warns about any tab characters used in initial indent, and highlights them in git diff --check. Signed-off-by: Chris Webb <chris@arachsys.com> --- cache.h | 1 + ws.c | 26 ++++++++++++++++++++------ 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/cache.h b/cache.h index 2928107..d87bd85 100644 --- a/cache.h +++ b/cache.h @@ -1040,6 +1040,7 @@ void shift_tree_by(const unsigned char *, const unsigned char *, unsigned char * #define WS_INDENT_WITH_NON_TAB 04 #define WS_CR_AT_EOL 010 #define WS_BLANK_AT_EOF 020 +#define WS_INDENT_WITH_TAB 040 #define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF) #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB) extern unsigned whitespace_rule_cfg; diff --git a/ws.c b/ws.c index c089338..e44a711 100644 --- a/ws.c +++ b/ws.c @@ -18,6 +18,7 @@ static struct whitespace_rule { { "cr-at-eol", WS_CR_AT_EOL, 1 }, { "blank-at-eol", WS_BLANK_AT_EOL, 0 }, { "blank-at-eof", WS_BLANK_AT_EOF, 0 }, + { "indent-with-tab", WS_INDENT_WITH_TAB, 0 }, }; unsigned parse_whitespace_rule(const char *string) @@ -125,6 +126,11 @@ char *whitespace_error_string(unsigned ws) strbuf_addstr(&err, ", "); strbuf_addstr(&err, "indent with spaces"); } + if (ws & WS_INDENT_WITH_TAB) { + if (err.len) + strbuf_addstr(&err, ", "); + strbuf_addstr(&err, "indent with tab"); + } return strbuf_detach(&err, NULL); } @@ -163,23 +169,31 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule, } } - /* Check for space before tab in initial indent. */ + /* Check for indent using tab or space before tab in initial indent. */ for (i = 0; i < len; i++) { if (line[i] == ' ') continue; if (line[i] != '\t') break; - if ((ws_rule & WS_SPACE_BEFORE_TAB) && written < i) { + if (ws_rule & WS_INDENT_WITH_TAB) { + result |= WS_INDENT_WITH_TAB; + if (stream) { + fwrite(line + written, i - written, 1, stream); + fputs(ws, stream); + fwrite(line + i, 1, 1, stream); + fputs(reset, stream); + } + } else if ((ws_rule & WS_SPACE_BEFORE_TAB) && written < i) { result |= WS_SPACE_BEFORE_TAB; if (stream) { fputs(ws, stream); fwrite(line + written, i - written, 1, stream); fputs(reset, stream); + fwrite(line + i, 1, 1, stream); } - } else if (stream) - fwrite(line + written, i - written, 1, stream); - if (stream) - fwrite(line + i, 1, 1, stream); + } else if (stream) { + fwrite(line + written, i - written + 1, 1, stream); + } written = i + 1; } -- 1.7.0.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC 1/1] Add new indent-with-tab whitespace check 2010-03-27 14:08 ` [RFC 1/1] Add new indent-with-tab whitespace check Chris Webb @ 2010-03-31 16:23 ` Junio C Hamano 2010-04-01 18:51 ` Re* " Junio C Hamano 2010-04-02 10:55 ` Chris Webb 0 siblings, 2 replies; 6+ messages in thread From: Junio C Hamano @ 2010-03-31 16:23 UTC (permalink / raw) To: Chris Webb; +Cc: git Chris Webb <chris@arachsys.com> writes: > The indent-with-tab rule warns about any tab characters used in initial > indent, and highlights them in git diff --check. > > Signed-off-by: Chris Webb <chris@arachsys.com> > --- > cache.h | 1 + > ws.c | 26 ++++++++++++++++++++------ > 2 files changed, 21 insertions(+), 6 deletions(-) > > diff --git a/cache.h b/cache.h > index 2928107..d87bd85 100644 > --- a/cache.h > +++ b/cache.h > @@ -1040,6 +1040,7 @@ void shift_tree_by(const unsigned char *, const unsigned char *, unsigned char * > #define WS_INDENT_WITH_NON_TAB 04 > #define WS_CR_AT_EOL 010 > #define WS_BLANK_AT_EOF 020 > +#define WS_INDENT_WITH_TAB 040 > #define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF) > #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB) > extern unsigned whitespace_rule_cfg; > diff --git a/ws.c b/ws.c > index c089338..e44a711 100644 > --- a/ws.c > +++ b/ws.c > @@ -18,6 +18,7 @@ static struct whitespace_rule { > { "cr-at-eol", WS_CR_AT_EOL, 1 }, > { "blank-at-eol", WS_BLANK_AT_EOL, 0 }, > { "blank-at-eof", WS_BLANK_AT_EOF, 0 }, > + { "indent-with-tab", WS_INDENT_WITH_TAB, 0 }, User's existing attribute setting that uses "indent" as an abbreviation for "indent-with-non-tab" will probably be broken by this naming. This sounds more like "tab-in-indent". Naming the option that way would give it an added benefit of allowing it to be abbreviated to "tab" by the parser. I think WS_TAB_IN_INDENT are incompatible with either WS_SPACE_BEFORE_TAB or WS_INDENT_WITH_TAB. Don't we want to notice such a combination as an error? > @@ -163,23 +169,31 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule, > } > } > > - /* Check for space before tab in initial indent. */ > + /* Check for indent using tab or space before tab in initial indent. */ > for (i = 0; i < len; i++) { > if (line[i] == ' ') > continue; > if (line[i] != '\t') > break; > - if ((ws_rule & WS_SPACE_BEFORE_TAB) && written < i) { > + if (ws_rule & WS_INDENT_WITH_TAB) { > + result |= WS_INDENT_WITH_TAB; > + if (stream) { > + fwrite(line + written, i - written, 1, stream); > + fputs(ws, stream); > + fwrite(line + i, 1, 1, stream); > + fputs(reset, stream); > + } > + } else if ((ws_rule & WS_SPACE_BEFORE_TAB) && written < i) { Adding this new code as a new second branch for if-elif-else chain would match the order of options better. > result |= WS_SPACE_BEFORE_TAB; > if (stream) { > fputs(ws, stream); > fwrite(line + written, i - written, 1, stream); > fputs(reset, stream); > + fwrite(line + i, 1, 1, stream); > } > - } else if (stream) > - fwrite(line + written, i - written, 1, stream); > - if (stream) > - fwrite(line + i, 1, 1, stream); > + } else if (stream) { > + fwrite(line + written, i - written + 1, 1, stream); > + } I think it is a good change, and a necessary one for adding this new feature. The original code assumed that this loop always detected an after seeing a "good" letter (hence the writing of the current letter was unconditional), but the patch makes it more clear what is being checked and what is being highlighted by the logic between SPACE_BEFORE_TAB and TAB_IN_INDENT codepaths. I'll queue this with s/indent-with-tab/tab-in-indent/ tweak to 'pu' but the topic would need: - detecting incompatible settings; - tests; - docs; - matching change to apply --whitespace=fix; before moving to 'next', I think. Here is what I tentatively will queue, with an updated log message. Thanks. -- >8 -- From: Chris Webb <chris@arachsys.com> Date: Sat, 27 Mar 2010 14:08:01 +0000 Subject: [PATCH] Add tab-in-indent whitespace error class Some projects and languages use coding style where no tab character is used to indent the lines. This only adds support for "apply --whitespace=warn" and "diff --check"; follow-up patches need to add "apply --whitespace=fix", tests and docs. Signed-off-by: Chris Webb <chris@arachsys.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- cache.h | 1 + ws.c | 24 +++++++++++++++++++----- 2 files changed, 20 insertions(+), 5 deletions(-) diff --git a/cache.h b/cache.h index 6dcb100..030af32 100644 --- a/cache.h +++ b/cache.h @@ -1040,6 +1040,7 @@ void shift_tree_by(const unsigned char *, const unsigned char *, unsigned char * #define WS_INDENT_WITH_NON_TAB 04 #define WS_CR_AT_EOL 010 #define WS_BLANK_AT_EOF 020 +#define WS_TAB_IN_INDENT 040 #define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF) #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB) extern unsigned whitespace_rule_cfg; diff --git a/ws.c b/ws.c index c089338..d0b6c54 100644 --- a/ws.c +++ b/ws.c @@ -18,6 +18,7 @@ static struct whitespace_rule { { "cr-at-eol", WS_CR_AT_EOL, 1 }, { "blank-at-eol", WS_BLANK_AT_EOL, 0 }, { "blank-at-eof", WS_BLANK_AT_EOF, 0 }, + { "tab-in-indent", WS_TAB_IN_INDENT, 0 }, }; unsigned parse_whitespace_rule(const char *string) @@ -125,6 +126,11 @@ char *whitespace_error_string(unsigned ws) strbuf_addstr(&err, ", "); strbuf_addstr(&err, "indent with spaces"); } + if (ws & WS_TAB_IN_INDENT) { + if (err.len) + strbuf_addstr(&err, ", "); + strbuf_addstr(&err, "tab in indent"); + } return strbuf_detach(&err, NULL); } @@ -163,7 +169,7 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule, } } - /* Check for space before tab in initial indent. */ + /* Check indentation */ for (i = 0; i < len; i++) { if (line[i] == ' ') continue; @@ -175,11 +181,19 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule, fputs(ws, stream); fwrite(line + written, i - written, 1, stream); fputs(reset, stream); + fwrite(line + i, 1, 1, stream); } - } else if (stream) - fwrite(line + written, i - written, 1, stream); - if (stream) - fwrite(line + i, 1, 1, stream); + } else if (ws_rule & WS_TAB_IN_INDENT) { + result |= WS_TAB_IN_INDENT; + if (stream) { + fwrite(line + written, i - written, 1, stream); + fputs(ws, stream); + fwrite(line + i, 1, 1, stream); + fputs(reset, stream); + } + } else if (stream) { + fwrite(line + written, i - written + 1, 1, stream); + } written = i + 1; } -- 1.7.0.3.513.gc8ed0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re* [RFC 1/1] Add new indent-with-tab whitespace check 2010-03-31 16:23 ` Junio C Hamano @ 2010-04-01 18:51 ` Junio C Hamano 2010-04-01 19:02 ` Junio C Hamano 2010-04-02 10:55 ` Chris Webb 1 sibling, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2010-04-01 18:51 UTC (permalink / raw) To: Chris Webb; +Cc: git Junio C Hamano <gitster@pobox.com> writes: >> diff --git a/ws.c b/ws.c >> index c089338..e44a711 100644 >> --- a/ws.c >> +++ b/ws.c >> @@ -18,6 +18,7 @@ static struct whitespace_rule { >> { "cr-at-eol", WS_CR_AT_EOL, 1 }, >> { "blank-at-eol", WS_BLANK_AT_EOL, 0 }, >> { "blank-at-eof", WS_BLANK_AT_EOF, 0 }, >> + { "indent-with-tab", WS_INDENT_WITH_TAB, 0 }, > > User's existing attribute setting that uses "indent" as an abbreviation > for "indent-with-non-tab" will probably be broken by this naming. There is another issue with this change. Because "whitespace" without any string in .gitattributes are defined to cause all the whitespace breakages known to git to be caught, and tab-in-indent is inherently incompatible with indent-with-non-tab, this cannot be supported without changing the definition of "default set of whitespace breakage classes". The intention of allowing .gitattributes to say "*.txt whitespace" is to let the users and projects say: I trust the competence and good judgement made by git developers regarding whitespace issues. They may devise a new algorithm to catch common whitespace errors that the current tool may not catch, and when that happens, I'd like my project to take advantage of the new code and catch the newly defined classes of errors. and that is why we include all whitespace-rule except for the ones that loosens error conditions to the set of breakages we catch for such a specification. So on top of the replacement I sent earlier, I think we would need a patch like the attached. The topic still needs: > - detecting incompatible settings; > - tests; > - docs; > - matching change to apply --whitespace=fix; but otherwise with this patch I think it is usable by me ;-) Thanks. -- >8 -- whitespace: we cannot "catch all errors known to git" anymore Traditionally, "*.txt whitespace" in .gitattributes file has been an instruction to catch _all_ classes of whitespace errors known to git. This however has to change, as the recently introduced "tab-in-indent" is inherently incompatible with "indent-with-non-tab". As we do not want to break configuration of existing users, add a mechanism to allow marking selected rules to be excluded from "all rules known to git". Signed-off-by: Junio C Hamano <gitster@pobox.com> --- ws.c | 8 +++++--- 1 files changed, 5 insertions(+), 3 deletions(-) diff --git a/ws.c b/ws.c index d0b6c54..c7b4f91 100644 --- a/ws.c +++ b/ws.c @@ -10,7 +10,8 @@ static struct whitespace_rule { const char *rule_name; unsigned rule_bits; - unsigned loosens_error; + unsigned loosens_error:1, + not_in_all:1; } whitespace_rule_names[] = { { "trailing-space", WS_TRAILING_SPACE, 0 }, { "space-before-tab", WS_SPACE_BEFORE_TAB, 0 }, @@ -18,7 +19,7 @@ static struct whitespace_rule { { "cr-at-eol", WS_CR_AT_EOL, 1 }, { "blank-at-eol", WS_BLANK_AT_EOL, 0 }, { "blank-at-eof", WS_BLANK_AT_EOF, 0 }, - { "tab-in-indent", WS_TAB_IN_INDENT, 0 }, + { "tab-in-indent", WS_TAB_IN_INDENT, 0, 1 }, }; unsigned parse_whitespace_rule(const char *string) @@ -83,7 +84,8 @@ unsigned whitespace_rule(const char *pathname) unsigned all_rule = 0; int i; for (i = 0; i < ARRAY_SIZE(whitespace_rule_names); i++) - if (!whitespace_rule_names[i].loosens_error) + if (!whitespace_rule_names[i].loosens_error && + !whitespace_rule_names[i].not_in_all) all_rule |= whitespace_rule_names[i].rule_bits; return all_rule; } else if (ATTR_FALSE(value)) { ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: Re* [RFC 1/1] Add new indent-with-tab whitespace check 2010-04-01 18:51 ` Re* " Junio C Hamano @ 2010-04-01 19:02 ` Junio C Hamano 0 siblings, 0 replies; 6+ messages in thread From: Junio C Hamano @ 2010-04-01 19:02 UTC (permalink / raw) To: Chris Webb; +Cc: git Junio C Hamano <gitster@pobox.com> writes: I need to make some clarification, as the following is misleading. This clarification fortunately does not have to change the patch I sent earlier. > There is another issue with this change. Because "whitespace" without any > string in .gitattributes are defined to cause all the whitespace breakages > known to git to be caught, and tab-in-indent is inherently incompatible > with indent-with-non-tab, this cannot be supported without changing the > definition of "default set of whitespace breakage classes". I said "default" but that is totally incorrect. We catch "trailing-space" and "space-before-tab" by default (i.e. the user does not have any custom "whitespace" attribute), and that will not change. > The intention of allowing .gitattributes to say "*.txt whitespace" is to > let the users and projects say: > > I trust the competence and good judgement made by git developers > regarding whitespace issues. They may devise a new algorithm to catch > common whitespace errors that the current tool may not catch, and when > that happens, I'd like my project to take advantage of the new code > and catch the newly defined classes of errors. > > and that is why we include all whitespace-rule except for the ones that > loosens error conditions to the set of breakages we catch for such a > specification. A corollary to the above clarification on "default" is that people who do not trust us but want to rely on the traditional "Only trailing-space and space-before-tab, nothing else" need to do _nothing_. If they do not have "whitespace" attribute defined for paths, the default is honored. People who want to catch the set of maximal whitespace breakage classes that is internally consistent are the only ones that already have entries like "*.txt whitespace" in their attributes file, and we keep our promises to them. If we take the "all errors known to git" too literally to include incompatible rules in them, the end result would become useless to them, and that is what the earlier update fixes. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC 1/1] Add new indent-with-tab whitespace check 2010-03-31 16:23 ` Junio C Hamano 2010-04-01 18:51 ` Re* " Junio C Hamano @ 2010-04-02 10:55 ` Chris Webb 1 sibling, 0 replies; 6+ messages in thread From: Chris Webb @ 2010-04-02 10:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Hi. Thanks for the detailed feedback. Junio C Hamano <gitster@pobox.com> writes: > This sounds more like "tab-in-indent". Naming the option that way would > give it an added benefit of allowing it to be abbreviated to "tab" by the > parser. Yes that's nice: much harder to confuse with indent-with-non-tab that way. > I think WS_TAB_IN_INDENT are incompatible with either WS_SPACE_BEFORE_TAB > or WS_INDENT_WITH_TAB. Don't we want to notice such a combination as an > error? Oh yes, that has to be the right behaviour for the finished feature here. (I guess strictly they could coexist with the meaning 'no indented lines allowed at all' but I don't think that's useful or sensible.) > I'll queue this with s/indent-with-tab/tab-in-indent/ tweak to 'pu' but > the topic would need: > > - detecting incompatible settings; > - tests; > - docs; > - matching change to apply --whitespace=fix; > > before moving to 'next', I think. Are you happy with my suggestion in [0/1] to replumb ws_fix_copy() to take a strbuf as dst rather than a char *, to deal with the fact that whitespace can grow with --whitespace=fix and WS_TAB_IN_INDENT? If so, I'll polish up the existing patch as you suggest in your first three points above, and then make this change to the way ws_fix_copy() works so I can do the right thing in whitespace=fix. > There is another issue with this change. Because "whitespace" without any > string in .gitattributes are defined to cause all the whitespace breakages > known to git to be caught, and tab-in-indent is inherently incompatible > with indent-with-non-tab, this cannot be supported without changing the > definition of "default set of whitespace breakage classes". Ah, this was a behaviour I didn't know about. I should probably put your patch (without the WS_TAB_IN_INDENT line) as the first one in the series, before I introduce WS_TAB_IN_INDENT, so we don't break this wildcard whitespace behaviour at any point in the series. Cheers, Chris. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-04-02 10:55 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-03-27 14:08 [RFC 0/1] diff --check for indent-with-spaces style Chris Webb 2010-03-27 14:08 ` [RFC 1/1] Add new indent-with-tab whitespace check Chris Webb 2010-03-31 16:23 ` Junio C Hamano 2010-04-01 18:51 ` Re* " Junio C Hamano 2010-04-01 19:02 ` Junio C Hamano 2010-04-02 10:55 ` Chris Webb
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).