* [PATCH 1/3] ws.c: add a helper to format comma separated messages
2011-05-27 22:47 ` Junio C Hamano
@ 2011-05-27 22:48 ` Junio C Hamano
2011-05-27 22:49 ` [PATCH 2/3] War on nbsp: a bit of retreat Junio C Hamano
` (3 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2011-05-27 22:48 UTC
To: Linus Torvalds; +Cc: Andreas Schwab, Git Mailing List
We can find more than one class of whitespace errors on a single line,
and we concatenate a message per class in a strbuf, separated with a
comma. Use a small helper function to make the code easier to read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Just a trivial clean-up before waging a "War on nbsp"
ws.c | 43 +++++++++++++++++++------------------------
1 files changed, 19 insertions(+), 24 deletions(-)
diff --git a/ws.c b/ws.c
index 9fb9b14..3058be4 100644
--- a/ws.c
+++ b/ws.c
@@ -116,36 +116,31 @@ unsigned whitespace_rule(const char *pathname)
}
}
+static void add_err_item(struct strbuf *err, const char *message)
+{
+ if (err->len)
+ strbuf_addstr(err, ", ");
+ strbuf_addstr(err, message);
+}
+
/* The returned string should be freed by the caller. */
char *whitespace_error_string(unsigned ws)
{
struct strbuf err = STRBUF_INIT;
- if ((ws & WS_TRAILING_SPACE) == WS_TRAILING_SPACE)
- strbuf_addstr(&err, "trailing whitespace");
- else {
+ if ((ws & WS_TRAILING_SPACE) == WS_TRAILING_SPACE) {
+ add_err_item(&err, "trailing whitespace");
+ } else {
if (ws & WS_BLANK_AT_EOL)
- strbuf_addstr(&err, "trailing whitespace");
- if (ws & WS_BLANK_AT_EOF) {
- if (err.len)
- strbuf_addstr(&err, ", ");
- strbuf_addstr(&err, "new blank line at EOF");
- }
- }
- if (ws & WS_SPACE_BEFORE_TAB) {
- if (err.len)
- strbuf_addstr(&err, ", ");
- strbuf_addstr(&err, "space before tab in indent");
- }
- if (ws & WS_INDENT_WITH_NON_TAB) {
- if (err.len)
- strbuf_addstr(&err, ", ");
- strbuf_addstr(&err, "indent with spaces");
- }
- if (ws & WS_TAB_IN_INDENT) {
- if (err.len)
- strbuf_addstr(&err, ", ");
- strbuf_addstr(&err, "tab in indent");
+ add_err_item(&err, "trailing whitespace");
+ if (ws & WS_BLANK_AT_EOF)
+ add_err_item(&err, "new blank line at EOF");
}
+ if (ws & WS_SPACE_BEFORE_TAB)
+ add_err_item(&err, "space before tab in indent");
+ if (ws & WS_INDENT_WITH_NON_TAB)
+ add_err_item(&err, "indent with spaces");
+ if (ws & WS_TAB_IN_INDENT)
+ add_err_item(&err, "tab in indent");
return strbuf_detach(&err, NULL);
}
--
1.7.5.3.503.g893a4
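The joining rule that add_err_item() factors out in the patch above can be sketched outside of git as follows. This is only an illustration: struct msgbuf, msgbuf_addstr() and add_item() are hypothetical stand-ins for git's strbuf API, using a fixed-size buffer so that the example compiles on its own.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for git's strbuf: a fixed-size, NUL-terminated buffer. */
struct msgbuf {
	char buf[256];
	size_t len;
};

/* Append s, silently truncating if the buffer would overflow. */
static void msgbuf_addstr(struct msgbuf *mb, const char *s)
{
	size_t room = sizeof(mb->buf) - 1 - mb->len;
	size_t n = strlen(s);

	if (n > room)
		n = room;
	memcpy(mb->buf + mb->len, s, n);
	mb->len += n;
	mb->buf[mb->len] = '\0';
}

/* Same shape as add_err_item(): emit ", " only between items. */
static void add_item(struct msgbuf *mb, const char *message)
{
	assert(mb && message);
	if (mb->len)
		msgbuf_addstr(mb, ", ");
	msgbuf_addstr(mb, message);
}
```

Because the separator is emitted before every item except the first, callers can test each rule bit independently and in any order without tracking whether something was already written.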
* [PATCH 2/3] War on nbsp: a bit of retreat
2011-05-27 22:47 ` Junio C Hamano
2011-05-27 22:48 ` [PATCH 1/3] ws.c: add a helper to format comma separated messages Junio C Hamano
@ 2011-05-27 22:49 ` Junio C Hamano
2011-05-27 22:51 ` [PATCH 3/3] War on nbsp: Add "nbsp" whitespace breakage class Junio C Hamano
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2011-05-27 22:49 UTC
To: Linus Torvalds; +Cc: Andreas Schwab, Git Mailing List
Before introducing a new whitespace breakage class "nbsp" to catch
compiler-breaking use of nbsp (UTF-8 c2a0) in place of SP, update the
current code to treat nbsp just like an SP that takes one display column.
Indenting with 6 nbsp is not an error under the indent-with-non-tab rule,
as it only consumes 6 display columns even though it may occupy 12 bytes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
t/t4019-diff-wserror.sh | 93 ++++++++++++++++++++++++++++++++---------------
ws.c | 38 ++++++++++++++++----
2 files changed, 94 insertions(+), 37 deletions(-)
diff --git a/t/t4019-diff-wserror.sh b/t/t4019-diff-wserror.sh
index a501975..665f693 100755
--- a/t/t4019-diff-wserror.sh
+++ b/t/t4019-diff-wserror.sh
@@ -14,6 +14,9 @@ test_expect_success setup '
echo "With trailing SP " >>F &&
echo "Carriage ReturnQ" | tr Q "\015" >>F &&
echo "No problem" >>F &&
+ echo " Enough NBSP and Space" >>F &&
+ echo " Bit of NBSP and Space" >>F &&
+ echo "NBSP At End " >>F &&
echo >>F
'
@@ -47,8 +50,10 @@ test_expect_success default '
grep HT error >/dev/null &&
grep With error >/dev/null &&
grep Return error >/dev/null &&
- grep No normal >/dev/null
-
+ grep Enough normal >/dev/null &&
+ grep No normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'default (attribute)' '
@@ -61,8 +66,10 @@ test_expect_success 'default (attribute)' '
grep HT error >/dev/null &&
grep With error >/dev/null &&
grep Return error >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough error >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'default, tabwidth=10 (attribute)' '
@@ -75,8 +82,10 @@ test_expect_success 'default, tabwidth=10 (attribute)' '
grep HT error >/dev/null &&
grep With error >/dev/null &&
grep Return error >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'no check (attribute)' '
@@ -89,8 +98,10 @@ test_expect_success 'no check (attribute)' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'no check, tabwidth=10 (attribute), must be irrelevant' '
@@ -103,8 +114,10 @@ test_expect_success 'no check, tabwidth=10 (attribute), must be irrelevant' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'without -trail' '
@@ -117,8 +130,10 @@ test_expect_success 'without -trail' '
grep HT error >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'without -trail (attribute)' '
@@ -131,8 +146,10 @@ test_expect_success 'without -trail (attribute)' '
grep HT error >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'without -space' '
@@ -145,8 +162,10 @@ test_expect_success 'without -space' '
grep HT normal >/dev/null &&
grep With error >/dev/null &&
grep Return error >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'without -space (attribute)' '
@@ -159,8 +178,10 @@ test_expect_success 'without -space (attribute)' '
grep HT normal >/dev/null &&
grep With error >/dev/null &&
grep Return error >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'with indent-non-tab only' '
@@ -173,8 +194,10 @@ test_expect_success 'with indent-non-tab only' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough error >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'with indent-non-tab only (attribute)' '
@@ -187,8 +210,10 @@ test_expect_success 'with indent-non-tab only (attribute)' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough error >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'with indent-non-tab only, tabwidth=10' '
@@ -201,8 +226,10 @@ test_expect_success 'with indent-non-tab only, tabwidth=10' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'with indent-non-tab only, tabwidth=10 (attribute)' '
@@ -215,8 +242,10 @@ test_expect_success 'with indent-non-tab only, tabwidth=10 (attribute)' '
grep HT normal >/dev/null &&
grep With normal >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End normal >/dev/null
'
test_expect_success 'with cr-at-eol' '
@@ -229,8 +258,10 @@ test_expect_success 'with cr-at-eol' '
grep HT error >/dev/null &&
grep With error >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'with cr-at-eol (attribute)' '
@@ -243,8 +274,10 @@ test_expect_success 'with cr-at-eol (attribute)' '
grep HT error >/dev/null &&
grep With error >/dev/null &&
grep Return normal >/dev/null &&
- grep No normal >/dev/null
-
+ grep No normal >/dev/null &&
+ grep Enough normal >/dev/null &&
+ grep Bit normal >/dev/null &&
+ grep End error >/dev/null
'
test_expect_success 'trailing empty lines (1)' '
diff --git a/ws.c b/ws.c
index 3058be4..68c7599 100644
--- a/ws.c
+++ b/ws.c
@@ -144,6 +144,12 @@ char *whitespace_error_string(unsigned ws)
return strbuf_detach(&err, NULL);
}
+static int is_nbsp(const char *at_)
+{
+ unsigned const char *at = (unsigned const char *)at_;
+ return at[0] == 0xc2 && at[1] == 0xa0;
+}
+
/* If stream is non-NULL, emits the line after checking. */
static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
FILE *stream, const char *set,
@@ -154,7 +160,7 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
int trailing_whitespace = -1;
int trailing_newline = 0;
int trailing_carriage_return = 0;
- int i;
+ int i, col_offset;
/* Logic is simpler if we temporarily ignore the trailing newline. */
if (len > 0 && line[len - 1] == '\n') {
@@ -170,11 +176,16 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
/* Check for trailing whitespace. */
if (ws_rule & WS_BLANK_AT_EOL) {
for (i = len - 1; i >= 0; i--) {
- if (isspace(line[i])) {
+ int is_space = isspace(line[i]);
+
+ if (!is_space && i && is_nbsp(&line[i-1])) {
+ is_space = 1;
+ i--;
+ }
+ if (is_space) {
trailing_whitespace = i;
result |= WS_BLANK_AT_EOL;
- }
- else
+ } else
break;
}
}
@@ -183,9 +194,14 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
trailing_whitespace = len;
/* Check indentation */
- for (i = 0; i < trailing_whitespace; i++) {
+ for (i = col_offset = 0; i < trailing_whitespace; i++) {
if (line[i] == ' ')
continue;
+ if (i + 1 < trailing_whitespace && is_nbsp(&line[i])) {
+ i++;
+ col_offset++;
+ continue;
+ }
if (line[i] != '\t')
break;
if ((ws_rule & WS_SPACE_BEFORE_TAB) && written < i) {
@@ -208,10 +224,12 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
fwrite(line + written, i - written + 1, 1, stream);
}
written = i + 1;
+ col_offset = 0;
}
/* Check for indent using non-tab. */
- if ((ws_rule & WS_INDENT_WITH_NON_TAB) && i - written >= ws_tab_width(ws_rule)) {
+ if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
+ i - written - col_offset >= ws_tab_width(ws_rule)) {
result |= WS_INDENT_WITH_NON_TAB;
if (stream) {
fputs(ws, stream);
@@ -270,7 +288,13 @@ int ws_blank_line(const char *line, int len, unsigned ws_rule)
* for now we just use this stupid definition.
*/
while (len-- > 0) {
- if (!isspace(*line))
+ if (isspace(*line))
+ ;
+ else if (len && is_nbsp(line)) {
+ line++;
+ len--;
+ }
+ else
return 0;
line++;
}
--
1.7.5.3.503.g893a4
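The byte-versus-column distinction from the log message of this patch (6 nbsp occupy 12 bytes but only 6 display columns) can be illustrated with a small standalone sketch. The names is_nbsp_at() and leading_space_columns() are made up for this example and are not part of ws.c; the real code folds the same idea into ws_check_emit_1() through col_offset.

```c
#include <assert.h>
#include <stddef.h>

/* True if the two bytes at p are a UTF-8 nbsp (0xc2 0xa0).
 * The caller must guarantee that two bytes are readable. */
static int is_nbsp_at(const char *p)
{
	const unsigned char *u = (const unsigned char *)p;
	return u[0] == 0xc2 && u[1] == 0xa0;
}

/* Count the display columns taken by the leading run of SP and nbsp.
 * Stops at the first other byte (including TAB). */
static int leading_space_columns(const char *line, size_t len)
{
	size_t i = 0;
	int cols = 0;

	while (i < len) {
		if (line[i] == ' ') {
			i++;		/* one byte, one column */
		} else if (i + 1 < len && is_nbsp_at(line + i)) {
			i += 2;		/* two bytes, still one column */
		} else {
			break;
		}
		cols++;
	}
	return cols;
}
```

With a tab width of 8, an indent made of 6 nbsp yields 6 columns here and so would stay below the indent-with-non-tab threshold, even though a plain byte count would report 12.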
* [PATCH 3/3] War on nbsp: Add "nbsp" whitespace breakage class
2011-05-27 22:47 ` Junio C Hamano
2011-05-27 22:48 ` [PATCH 1/3] ws.c: add a helper to format comma separated messages Junio C Hamano
2011-05-27 22:49 ` [PATCH 2/3] War on nbsp: a bit of retreat Junio C Hamano
@ 2011-05-27 22:51 ` Junio C Hamano
2011-05-28 1:31 ` [PATCH 4/3] War on nbsp: teach "git apply" to check and fix nbsp Junio C Hamano
2011-05-30 12:52 ` Whitespace and ' ' Daniel Nyström
4 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2011-05-27 22:51 UTC
To: Linus Torvalds; +Cc: Andreas Schwab, Git Mailing List
Not even "less" shows nbsp as anything special or unusual, so eyeballing
with "git log -p" after applying a patch that accidentally had it where a
regular SP should be, breaking compilation, would not help.
This only handles "diff", not "apply" yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* This is the moral equivalent of the earlier patch, but based on the two
clean-ups. If we do not consider nbsp an error, at least we should
count it just like an ordinary SP that takes one display column, and if
we do not like indenting with non-TAB or trailing whitespaces, we
should complain loudly, which was the topic of 2/3.
cache.h | 18 ++++++-----
t/t4019-diff-wserror.sh | 8 ++--
ws.c | 72 ++++++++++++++++++++++++++++++++++++++++++----
3 files changed, 79 insertions(+), 19 deletions(-)
diff --git a/cache.h b/cache.h
index dd34fed..4f58587 100644
--- a/cache.h
+++ b/cache.h
@@ -1122,17 +1122,19 @@ void shift_tree_by(const unsigned char *, const unsigned char *, unsigned char *
/*
* whitespace rules.
* used by both diff and apply
- * last two digits are tab width
+ * last 6-bits are tab width
*/
-#define WS_BLANK_AT_EOL 0100
-#define WS_SPACE_BEFORE_TAB 0200
-#define WS_INDENT_WITH_NON_TAB 0400
-#define WS_CR_AT_EOL 01000
-#define WS_BLANK_AT_EOF 02000
-#define WS_TAB_IN_INDENT 04000
+#define WS_TAB_WIDTH_MASK 077
+#define WS_BLANK_AT_EOL (1<< 6)
+#define WS_SPACE_BEFORE_TAB (1<< 7)
+#define WS_INDENT_WITH_NON_TAB (1<< 8)
+#define WS_CR_AT_EOL (1<< 9)
+#define WS_BLANK_AT_EOF (1<<10)
+#define WS_TAB_IN_INDENT (1<<11)
+#define WS_NBSP (1<<12)
#define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF)
#define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB|8)
-#define WS_TAB_WIDTH_MASK 077
+
extern unsigned whitespace_rule_cfg;
extern unsigned whitespace_rule(const char *);
extern unsigned parse_whitespace_rule(const char *);
diff --git a/t/t4019-diff-wserror.sh b/t/t4019-diff-wserror.sh
index 665f693..8c7fea2 100755
--- a/t/t4019-diff-wserror.sh
+++ b/t/t4019-diff-wserror.sh
@@ -56,7 +56,7 @@ test_expect_success default '
grep End error >/dev/null
'
-test_expect_success 'default (attribute)' '
+test_expect_success 'default (attribute) -- must check all available rule' '
test_might_fail git config --unset core.whitespace &&
echo "F whitespace" >.gitattributes &&
@@ -68,7 +68,7 @@ test_expect_success 'default (attribute)' '
grep Return error >/dev/null &&
grep No normal >/dev/null &&
grep Enough error >/dev/null &&
- grep Bit normal >/dev/null &&
+ grep Bit error >/dev/null &&
grep End error >/dev/null
'
@@ -83,8 +83,8 @@ test_expect_success 'default, tabwidth=10 (attribute)' '
grep With error >/dev/null &&
grep Return error >/dev/null &&
grep No normal >/dev/null &&
- grep Enough normal >/dev/null &&
- grep Bit normal >/dev/null &&
+ grep Enough error >/dev/null &&
+ grep Bit error >/dev/null &&
grep End error >/dev/null
'
diff --git a/ws.c b/ws.c
index 68c7599..53e263d 100644
--- a/ws.c
+++ b/ws.c
@@ -20,6 +20,7 @@ static struct whitespace_rule {
{ "blank-at-eol", WS_BLANK_AT_EOL, 0 },
{ "blank-at-eof", WS_BLANK_AT_EOF, 0 },
{ "tab-in-indent", WS_TAB_IN_INDENT, 0, 1 },
+ { "nbsp", WS_NBSP, 0, 0 },
};
unsigned parse_whitespace_rule(const char *string)
@@ -141,6 +142,8 @@ char *whitespace_error_string(unsigned ws)
add_err_item(&err, "indent with spaces");
if (ws & WS_TAB_IN_INDENT)
add_err_item(&err, "tab in indent");
+ if (ws & WS_NBSP)
+ add_err_item(&err, " in source");
return strbuf_detach(&err, NULL);
}
@@ -150,6 +153,45 @@ static int is_nbsp(const char *at_)
return at[0] == 0xc2 && at[1] == 0xa0;
}
+/*
+ * Show line while highlighting nbsp " " (c2a0) if ws is set
+ */
+static void emit_with_nbsp_hilite(FILE *stream,
+ const char *set, const char *reset,
+ const char *ws,
+ const char *line, int len)
+{
+ if (!len)
+ return;
+ while (len) {
+ /* number of bytes in the leading segment w/o nbsp error */
+ int ok;
+ if (!ws) {
+ ok = len;
+ } else {
+ for (ok = 0; ok < len; ok++) {
+ if (ok < len - 1 && is_nbsp(line + ok))
+ break;
+ }
+ }
+ if (ok) {
+ fputs(set, stream);
+ fwrite(line, ok, 1, stream);
+ fputs(reset, stream);
+ }
+ line += ok;
+ len -= ok;
+ if (len) {
+ /* do not bother bundling consecutive ones */
+ fputs(ws, stream);
+ fwrite(line, 2, 1, stream);
+ fputs(reset, stream);
+ line += 2;
+ len -= 2;
+ }
+ }
+}
+
/* If stream is non-NULL, emits the line after checking. */
static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
FILE *stream, const char *set,
@@ -173,6 +215,24 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
len--;
}
+ /* Check for nbsp in UTF-8 (c2a0) */
+ if (ws_rule & WS_NBSP) {
+ for (i = 1; i < len; i++) {
+ switch (line[i] & 0xff) {
+ case 0xc2:
+ break;
+ case 0xa0:
+ if ((line[i-1] & 0xff) == 0xc2) {
+ result |= WS_NBSP;
+ continue;
+ }
+ /* fallthru */
+ default:
+ i++;
+ }
+ }
+ }
+
/* Check for trailing whitespace. */
if (ws_rule & WS_BLANK_AT_EOL) {
for (i = len - 1; i >= 0; i--) {
@@ -245,13 +305,11 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
* The non-highlighted part ends at "trailing_whitespace".
*/
- /* Emit non-highlighted (middle) segment. */
- if (trailing_whitespace - written > 0) {
- fputs(set, stream);
- fwrite(line + written,
- trailing_whitespace - written, 1, stream);
- fputs(reset, stream);
- }
+ /* Emit middle segment, highlighting nbsp as needed */
+ emit_with_nbsp_hilite(stream, set, reset,
+ (result & WS_NBSP) ? ws : NULL,
+ line + written,
+ trailing_whitespace - written);
/* Highlight errors in trailing whitespace. */
if (trailing_whitespace != len) {
--
1.7.5.3.503.g893a4
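The cache.h hunk in this patch rewrites the rule bits from octal literals to shifts and moves WS_TAB_WIDTH_MASK to the top. A minimal sketch of that layout follows: the low 6 bits hold the tab width and every rule flag sits at bit 6 or above. The macro values are copied from the hunk; rule_tab_width() and rule_with_tab_width() are hypothetical helpers added only for illustration.

```c
#include <assert.h>

/* Values copied from the cache.h hunk in this patch. */
#define WS_TAB_WIDTH_MASK      077
#define WS_BLANK_AT_EOL        (1 << 6)
#define WS_SPACE_BEFORE_TAB    (1 << 7)
#define WS_INDENT_WITH_NON_TAB (1 << 8)
#define WS_CR_AT_EOL           (1 << 9)
#define WS_BLANK_AT_EOF        (1 << 10)
#define WS_TAB_IN_INDENT       (1 << 11)
#define WS_NBSP                (1 << 12)
#define WS_TRAILING_SPACE      (WS_BLANK_AT_EOL | WS_BLANK_AT_EOF)
#define WS_DEFAULT_RULE        (WS_TRAILING_SPACE | WS_SPACE_BEFORE_TAB | 8)

/* Hypothetical helper: extract the tab width stored in the low 6 bits. */
static unsigned rule_tab_width(unsigned rule)
{
	return rule & WS_TAB_WIDTH_MASK;
}

/* Hypothetical helper: replace the tab width while keeping the flag bits. */
static unsigned rule_with_tab_width(unsigned rule, unsigned width)
{
	assert(width <= WS_TAB_WIDTH_MASK);
	return (rule & ~(unsigned)WS_TAB_WIDTH_MASK) | width;
}
```

The shifted values are numerically identical to the old octal ones (for instance 1 << 6 is 0100), so the rewrite changes no existing bit; it only makes room for WS_NBSP at bit 12 easier to see.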
* [PATCH 4/3] War on nbsp: teach "git apply" to check and fix nbsp
2011-05-27 22:47 ` Junio C Hamano
` (2 preceding siblings ...)
2011-05-27 22:51 ` [PATCH 3/3] War on nbsp: Add "nbsp" whitespace breakage class Junio C Hamano
@ 2011-05-28 1:31 ` Junio C Hamano
2011-05-30 12:52 ` Whitespace and ' ' Daniel Nyström
4 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2011-05-28 1:31 UTC
To: Linus Torvalds; +Cc: Andreas Schwab, Git Mailing List
This still does not work when applying in reverse with "git apply -R", but
I thought this was a good place to stop, as it is dubious whether this
series makes much sense to begin with.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
ws.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++----------------
1 files changed, 54 insertions(+), 18 deletions(-)
diff --git a/ws.c b/ws.c
index 53e263d..4413f95 100644
--- a/ws.c
+++ b/ws.c
@@ -374,6 +374,7 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
int last_tab_in_indent = -1;
int last_space_in_indent = -1;
int need_fix_leading_space = 0;
+ int col_offset = 0;
/*
* Strip trailing whitespace
@@ -387,10 +388,17 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
len--;
}
}
- if (0 < len && isspace(src[len - 1])) {
- while (0 < len && isspace(src[len-1]))
- len--;
- fixed = 1;
+ if (0 < len) {
+ int orig_len = len;
+ while (len) {
+ if (isspace(src[len - 1]))
+ len--;
+ else if (1 < len && is_nbsp(&src[len - 2]))
+ len -= 2;
+ else
+ break;
+ }
+ fixed = (orig_len != len);
}
}
@@ -404,13 +412,23 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
0 <= last_space_in_indent)
need_fix_leading_space = 1;
- } else if (ch == ' ') {
- last_space_in_indent = i;
- if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
- ws_tab_width(ws_rule) <= i - last_tab_in_indent)
- need_fix_leading_space = 1;
- } else
+ col_offset = 0;
+ continue;
+ }
+
+ if (ch == ' ') {
+ ;
+ } else if ((i < len - 1) && is_nbsp(src + i)) {
+ i++;
+ col_offset++;
+ } else {
break;
+ }
+ last_space_in_indent = i;
+
+ if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
+ ws_tab_width(ws_rule) <= (i - col_offset) - last_tab_in_indent)
+ need_fix_leading_space = 1;
}
if (need_fix_leading_space) {
@@ -432,15 +450,20 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
*/
for (i = 0; i < last; i++) {
char ch = src[i];
- if (ch != ' ') {
+
+ if (ch == ' ') {
+ ;
+ } else if ((i < last - 1) && is_nbsp(src + i)) {
+ i++;
+ } else {
consecutive_spaces = 0;
strbuf_addch(dst, ch);
- } else {
- consecutive_spaces++;
- if (consecutive_spaces == ws_tab_width(ws_rule)) {
- strbuf_addch(dst, '\t');
- consecutive_spaces = 0;
- }
+ continue;
+ }
+ consecutive_spaces++;
+ if (consecutive_spaces == ws_tab_width(ws_rule)) {
+ strbuf_addch(dst, '\t');
+ consecutive_spaces = 0;
}
}
while (0 < consecutive_spaces--)
@@ -465,7 +488,20 @@ void ws_fix_copy(struct strbuf *dst, const char *src, int len, unsigned ws_rule,
fixed = 1;
}
- strbuf_add(dst, src, len);
+ if (ws_rule & WS_NBSP) {
+ while (len--) {
+ if (len && is_nbsp(src)) {
+ src++;
+ len--;
+ strbuf_addch(dst, ' ');
+ } else {
+ strbuf_addch(dst, *src);
+ }
+ src++;
+ }
+ } else {
+ strbuf_add(dst, src, len);
+ }
if (add_cr_to_tail)
strbuf_addch(dst, '\r');
if (add_nl_to_tail)
--
1.7.5.3.503.g893a4
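The last hunk of this patch, which copies the line while turning each nbsp into a plain SP when the nbsp rule is active, can be sketched in isolation like this. copy_nbsp_as_space() is a hypothetical name, and it writes into a caller-supplied char buffer rather than a strbuf so that the example stands on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes from src to dst, replacing each UTF-8 nbsp (0xc2 0xa0)
 * with a single SP. dst must have room for len + 1 bytes.
 * Returns the number of bytes written, not counting the trailing NUL. */
static size_t copy_nbsp_as_space(char *dst, const char *src, size_t len)
{
	size_t i = 0, out = 0;

	assert(dst && src);
	while (i < len) {
		const unsigned char *u = (const unsigned char *)src + i;

		if (i + 1 < len && u[0] == 0xc2 && u[1] == 0xa0) {
			dst[out++] = ' ';
			i += 2;		/* consume both bytes of the pair */
		} else {
			dst[out++] = src[i++];
		}
	}
	dst[out] = '\0';
	return out;
}
```

A lone 0xc2 at the very end of the input is copied through unchanged, because the bounds check runs before the second byte is examined; the patch guards the same case with its "len &&" test.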
* Re: Whitespace and ' '
2011-05-27 22:47 ` Junio C Hamano
` (3 preceding siblings ...)
2011-05-28 1:31 ` [PATCH 4/3] War on nbsp: teach "git apply" to check and fix nbsp Junio C Hamano
@ 2011-05-30 12:52 ` Daniel Nyström
4 siblings, 0 replies; 15+ messages in thread
From: Daniel Nyström @ 2011-05-30 12:52 UTC
To: Junio C Hamano; +Cc: Linus Torvalds, Andreas Schwab, Git Mailing List
2011/5/28 Junio C Hamano <gitster@pobox.com>:
>> Again, I'm not convinced git should really care, but I'm also not
>> convinced that nbsp is necessarily all about the git whitespace
>> fixups.
>
> I am not convinced git should care, either, but if nobody else helps us,
> we need to help ourselves ;-).
Or write a standalone tool through which patches (and code for that
matter) could be piped, with the sole purpose of flagging "misleading"
Unicode characters and the like?
I can see even wider use of such a tool.
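As a rough idea of what the core of such a tool might look like, here is a minimal sketch that scans a buffer for UTF-8 nbsp and reports how many it found and on which line the first one sits. count_nbsp() is a hypothetical name, and only nbsp is handled; a real tool would presumably cover more characters and read from standard input.

```c
#include <assert.h>
#include <stddef.h>

/* Count UTF-8 nbsp (0xc2 0xa0) sequences in buf.
 * If first_line is non-NULL, it receives the 1-based line number of the
 * first occurrence, or 0 when there is none. */
static size_t count_nbsp(const char *buf, size_t len, size_t *first_line)
{
	const unsigned char *u = (const unsigned char *)buf;
	size_t i, line = 1, hits = 0;

	assert(buf);
	if (first_line)
		*first_line = 0;
	for (i = 0; i < len; i++) {
		if (u[i] == '\n') {
			line++;
		} else if (i + 1 < len && u[i] == 0xc2 && u[i + 1] == 0xa0) {
			if (!hits && first_line)
				*first_line = line;
			hits++;
			i++;	/* skip the second byte of the pair */
		}
	}
	return hits;
}
```

A filter built on this could read a patch into memory, call count_nbsp(), and exit non-zero when the count is not 0, which would let it sit in a pipeline in front of "git am" or a compiler.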