From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Masahiro Yamada <masahiroy@kernel.org>
Cc: git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH 4/5] wildmatch: use char instead of uchar
Date: Fri, 10 Feb 2023 14:09:34 +0100
Message-ID: <230210.86a61lwtq7.gmgdl@evledraar.gmail.com>
In-Reply-To: <20230210075939.44949-5-masahiroy@kernel.org>
On Fri, Feb 10 2023, Masahiro Yamada wrote:
> dowild() casts (char *) and (uchar *) back-and-forth, which is
> ugly.
>
> This file was imported from rsync, which started to use (unsigned char)
> since the following commit:
>
> | commit e11c42511903adc6d27cf1671cc76fa711ea37e5
> | Author: Wayne Davison <wayned@samba.org>
> | Date: Sun Jul 6 04:33:54 2003 +0000
> |
> | - Added [:class:] handling to the character-class code.
> | - Use explicit unsigned characters for proper set checks.
> | - Made the character-class code honor backslash escapes.
> | - Accept '^' as a class-negation character in addition to '!'.
>
> Perhaps, it was needed because rsync relies on is*() from <ctype.h>.
>
> GIT has its own implementations, so the behavior is clear.
>
> In fact, commit 4546738b58a0 ("Unlocalized isspace and friends")
> says one of the motivations is "we want the right signed behaviour".
>
> sane_istest() casts the given character to (unsigned char) anyway
> before sane_ctype[] table lookup, so dowild() can use 'char'.
>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
> wildmatch.c | 24 +++++++++++-------------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/wildmatch.c b/wildmatch.c
> index 93800b8eac..7dffd783cb 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -12,21 +12,19 @@
> #include "cache.h"
> #include "wildmatch.h"
>
> -typedef unsigned char uchar;
> -
> #define CC_EQ(class, len, litmatch) ((len) == sizeof (litmatch)-1 \
> && *(class) == *(litmatch) \
> - && strncmp((char*)class, litmatch, len) == 0)
> + && strncmp(class, litmatch, len) == 0)
>
> /* Match pattern "p" against "text" */
> -static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> +static int dowild(const char *p, const char *text, unsigned int flags)
> {
> - uchar p_ch;
> - const uchar *pattern = p;
> + char p_ch;
> + const char *pattern = p;
>
> for ( ; (p_ch = *p) != '\0'; text++, p++) {
> int matched, match_slash, negated;
> - uchar t_ch, prev_ch;
> + char t_ch, prev_ch;
> if ((t_ch = *text) == '\0' && p_ch != '*')
> return WM_ABORT_ALL;
> if ((flags & WM_CASEFOLD) && isupper(t_ch))
> @@ -50,7 +48,7 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> continue;
> case '*':
> if (*++p == '*') {
> - const uchar *prev_p = p - 2;
> + const char *prev_p = p - 2;
> while (*++p == '*') {}
> if (!(flags & WM_PATHNAME))
> /* without WM_PATHNAME, '*' == '**' */
> @@ -90,10 +88,10 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> * with WM_PATHNAME matches the next
> * directory
> */
> - const char *slash = strchr((char*)text, '/');
> + const char *slash = strchr(text, '/');
> if (!slash)
> return WM_NOMATCH;
> - text = (const uchar*)slash;
> + text = slash;
> /* the slash is consumed by the top-level for loop */
> break;
> }
> @@ -160,13 +158,13 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> if (t_ch <= p_ch && t_ch >= prev_ch)
> matched = 1;
> else if ((flags & WM_CASEFOLD) && islower(t_ch)) {
> - uchar t_ch_upper = toupper(t_ch);
> + char t_ch_upper = toupper(t_ch);
> if (t_ch_upper <= p_ch && t_ch_upper >= prev_ch)
> matched = 1;
> }
> p_ch = 0; /* This makes "prev_ch" get set to 0. */
> } else if (p_ch == '[' && p[1] == ':') {
> - const uchar *s;
> + const char *s;
> int i;
> for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
> if (!p_ch)
> @@ -237,5 +235,5 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> /* Match the "pattern" against the "text" string. */
> int wildmatch(const char *pattern, const char *text, unsigned int flags)
> {
> - return dowild((const uchar*)pattern, (const uchar*)text, flags);
> + return dowild(pattern, text, flags);
> }

This looks good to me. I independently wrote much the same a while ago
for another reason, in: https://github.com/avar/git/commit/079f555375a

I.e. this happens to be the only bit in-tree that's stopping us from
running the xlc compiler in the c99 mode.

My solution was different, but I like yours better. I had not done your
analysis to discover that we didn't need this to be unsigned in the
first place; I merely converted the "uchar" to an "unsigned char".
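
To illustrate the point in the quoted commit message that sane_istest()
casts to (unsigned char) before the table lookup, here is a minimal
sketch of that pattern. It is not git's actual code: the demo_* names,
the two flag values and the tiny table are hypothetical stand-ins, and
it assumes an ASCII-compatible character set. The cast inside the macro
is what lets the caller pass a plain, possibly signed, 'char' without a
negative array index.

```c
#include <stddef.h>

/* Hypothetical flag bits and table; stand-ins for illustration only. */
#define DEMO_UPPER 0x01
#define DEMO_LOWER 0x02

static unsigned char demo_ctype[256];

/* Fill the table for ASCII letters; every other byte stays 0. */
static void demo_ctype_init(void)
{
	int c;

	for (c = 'A'; c <= 'Z'; c++)
		demo_ctype[c] = DEMO_UPPER;
	for (c = 'a'; c <= 'z'; c++)
		demo_ctype[c] = DEMO_LOWER;
}

/*
 * The cast to (unsigned char) happens here, once, at the lookup.
 * A byte such as 0xE9 stored in a signed 'char' is negative, but
 * after the cast it indexes slot 233 rather than slot -23.
 */
#define demo_istest(x, mask) \
	((demo_ctype[(unsigned char)(x)] & (mask)) != 0)

/* Callers can take plain 'char', as dowild() does after the patch. */
static int demo_isupper(char c)
{
	return demo_istest(c, DEMO_UPPER);
}

static int demo_islower(char c)
{
	return demo_istest(c, DEMO_LOWER);
}
```

After calling demo_ctype_init(), demo_isupper('A') is true and both
helpers return false for a high byte such as (char)0xE9, whether or not
'char' is signed on the platform.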