From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Masahiro Yamada" <masahiroy@kernel.org>,
git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH v2 1/5] git-compat-util: add isblank() and isgraph()
Date: Mon, 27 Feb 2023 11:39:44 -0800 [thread overview]
Message-ID: <xmqqpm9usxrz.fsf@gitster.g> (raw)
In-Reply-To: <36cd059e-c676-2aa2-68d9-41a7b0db57f0@web.de> ("René Scharfe"'s message of "Sun, 26 Feb 2023 19:45:22 +0100")
René Scharfe <l.s.r@web.de> writes:
>> In the previous submission, I just moved isblank() and isgraph() as
>> implemented in wildmatch.c. I knew they were not robust against the
>> pointer increment like isblank(*s++), but I thought it was the same
>> pattern as isprint(), which has the same issue. Unfortunately, it was
>> more controversial than I had expected...
>
> Not sure we need that story in the commit message,...
Usually we encourage people to write as if there were no "previous
iterations". Describe how to reach the best end-result without any
detour, using the experience gained from failed attempts.
> ... but it gave me an idea:
;-)
> ... So it's not so
> easy, however, ...
>
>> This version implements them as inline functions because we ran out
>> all bits in the sane_ctype[] table. This is the same pattern as
>> islower() and isupper().
>
> ... if you remove GIT_SPACE from the definition above you get a
> macro version of isgraph() that uses a single table lookup.
>
> If we're out of bits then isblank() is a good choice to implement
> without a table lookup, as this class only contains two characters
> and two comparisons should be quite fast.
Implementing sane_isblank() plus sane_isgraph() as static inlines is
already not too bad, but if one of them can be made into a simple
table look-up, that indeed is better ;-)
> Stepping back a bit: Is using the unlocalized is* macros in
> wildmatch() safe, i.e. do we get the same results as before
> regardless of locale? Junio's remark in
> https://lore.kernel.org/git/xmqq3579crsd.fsf@gitster.g/ sounds
> convincing to me if we don't care about single-byte code pages
> and require plain ASCII or UTF-8. I think it's a good idea to
> address that point in the commit message.
True. To be honest, I wasn't thinking about non-UTF-8 single-byte
"code pages". In the context of wildmatch, the strings we deal with
mostly interact with the paths on the filesystem. If you interact
with your filesystem in latin-1 that may pose an issue, and even if
(or rather, especially if) we are to declare that it does not matter,
we should document it in the proposed log message.
>
> And adding tests to t/helper/test-ctype.c would be nice.
Very true.
>> +#define isblank(x) sane_isblank(x)
>> +#define isgraph(x) sane_isgraph(x)
>> @@ -1270,6 +1274,16 @@ static inline int sane_iscase(int x, int is_lower)
>> return (x & 0x20) == 0;
>> }
>>
>> +static inline int sane_isblank(int c)
>> +{
>> + return c == ' ' || c == '\t';
>> +}
>> +
>> +static inline int sane_isgraph(int c)
>> +{
>> + return isprint(c) && !isspace(c);
>> +}
next prev parent reply other threads:[~2023-02-27 19:40 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-26 11:50 [PATCH v2 0/5] Clean up wildmatch.c Masahiro Yamada
2023-02-26 11:50 ` [PATCH v2 1/5] git-compat-util: add isblank() and isgraph() Masahiro Yamada
2023-02-26 18:45 ` René Scharfe
2023-02-27 19:39 ` Junio C Hamano [this message]
2023-02-26 11:50 ` [PATCH v2 2/5] wildmatch: remove IS*() macros Masahiro Yamada
2023-02-26 11:50 ` [PATCH v2 3/5] wildmatch: remove NEGATE_CLASS(2) macros with trivial refactoring Masahiro Yamada
2023-02-26 11:50 ` [PATCH v2 4/5] wildmatch: use char instead of uchar Masahiro Yamada
2023-02-27 20:07 ` Junio C Hamano
2023-02-26 11:50 ` [PATCH v2 5/5] wildmatch: more cleanups after killing uchar Masahiro Yamada
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqpm9usxrz.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=l.s.r@web.de \
--cc=masahiroy@kernel.org \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.