From: Junio C Hamano <gitster@pobox.com>
To: Adrian Ratiu <adrian.ratiu@collabora.com>
Cc: git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH] ws: add new tab-between-non-ws check
Date: Wed, 24 Dec 2025 09:31:16 +0900 [thread overview]
Message-ID: <xmqq5x9wpvor.fsf@gitster.g> (raw)
In-Reply-To: <20251223132756.604036-1-adrian.ratiu@collabora.com> (Adrian Ratiu's message of "Tue, 23 Dec 2025 15:27:56 +0200")
Adrian Ratiu <adrian.ratiu@collabora.com> writes:
> This adds a new check to detect HT in the middle of sentences that
> should have been a SP, as suggested by Junio in
> https://public-inbox.org/git/xmqqy0mwsedz.fsf@gitster.g/
>
> The check is a bit complex because we want to detect places where
> a SP was intended and the naive before/after character check can
> issue false positives in cases like "a\tb".
Add something like "where the tab expands to more than one display
column" at the end of the string, probably. Otherwise what you
meant by "false positive" is unclear.
> +test_expect_success 'check tab between non-whitespace (tab-between-non-ws: off)' '
> + git config core.whitespace "-tab-between-non-ws" &&
> + printf "1234567\tb" >x &&
> + git add x &&
> + git diff --cached --check
> +'
I think the cases covered by these tests are wide enough. The
feature covered by the code change in this patch is insufficient, so
when it is improved, the tests for missing features would need to be
added to each of these cases.
> @@ -228,6 +234,32 @@ static unsigned ws_check_emit_1(const char *line, int len, unsigned ws_rule,
> written = i;
> }
>
> + if (ws_rule & WS_TAB_BETWEEN_NON_WS) {
> + /*
> + * A tab surrounded by non-whitespace characters is a typo candidate
> + * (a space might have been intended). This checks for a tab that
> + * would be expanded to a single space, which is when it appears at
> + * a column that is one less than a multiple of the tabwidth.
> + */
> + int col = 0;
> + int tabwidth = ws_tab_width(ws_rule);
> +
> + if (!tabwidth)
> + BUG("a known tabwidth is required by WS_TAB_BETWEEN_NON_WS");
> +
> + for (i = 0; i < len; i++) {
> + if (line[i] == '\t') {
> + if (i > 0 && i < len - 1 &&
> + !isspace(line[i - 1]) && !isspace(line[i + 1]) &&
> + (col % tabwidth) == (tabwidth - 1))
> + result |= WS_TAB_BETWEEN_NON_WS;
> + col += tabwidth - (col % tabwidth);
Checking "i < len -1" is good. We do not need to check for a tab at
the end of the string here, as it is *not* "between no-ws". If a
file type does not want a trailing tab, it can be marked with
trailing-whitespace.
> + } else {
> + col++;
> + }
> + }
> + }
> +
> if (stream) {
> /*
> * Now the rest of the line starts at "written".
> diff --git a/ws.h b/ws.h
> index 06d5cb73f8..35475fd320 100644
> --- a/ws.h
> +++ b/ws.h
> @@ -16,6 +16,7 @@ struct strbuf;
> #define WS_BLANK_AT_EOF (1<<10)
> #define WS_TAB_IN_INDENT (1<<11)
> #define WS_INCOMPLETE_LINE (1<<12)
> +#define WS_TAB_BETWEEN_NON_WS (1<<13)
>
> #define WS_TRAILING_SPACE (WS_BLANK_AT_EOL|WS_BLANK_AT_EOF)
> #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB|8)
The above is sufficient to make "git diff --check" and "git apply
--whitespace=warn" notice, but the code change in this patch is not
sufficient.
* Colored "git diff" output needs to highlight whitespace errors.
diff.c:diff_colors[] tells us to use BG_RED to paint them by
default. At the end of ws_check_emit_1(), when stream is not
NULL, we emit the middle segment as-is, but when this new
whitespace error class is in effect, that part needs to paint the
tab at 7th column between !isspace() bytes. Introduce a helper
function "static emit_middle_section()" to do so, perhaps.
* "git apply --whitespace=fix" needs to turn such a HT to a SP.
This is probably done in ws_fix_copy().
Thanks.
prev parent reply other threads:[~2025-12-24 0:31 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-23 13:27 [PATCH] ws: add new tab-between-non-ws check Adrian Ratiu
2025-12-24 0:31 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq5x9wpvor.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=adrian.ratiu@collabora.com \
--cc=git@vger.kernel.org \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).