From: Junio C Hamano <gitster@pobox.com>
To: tboegi@web.de
Cc: git@vger.kernel.org,  l.s.r@web.de
Subject: Re: [PATCH v2 1/2] utf8.c: Prepare workaround for iconv under macOS 14/15
Date: Sun, 11 Jan 2026 20:58:45 -0800
Message-ID: <xmqqwm1no29m.fsf@gitster.g>
In-Reply-To: <20260111195149.716177-1-tboegi@web.de> (tboegi@web.de's message of "Sun, 11 Jan 2026 20:51:49 +0100")

tboegi@web.de writes:

> From: Torsten Bögershausen <tboegi@web.de>
>
> MacOS14 (Sonoma) has started to ship an iconv library with bugs.
> The same bugs exist even in MacOS 15 (Sequoia).
>
> A bug report running the Git test suite says:
>
> three tests of t3900 fail on macOS 26.1 for me:
>
>   not ok 17 - ISO-2022-JP should be shown in UTF-8 now
>   not ok 25 - ISO-2022-JP should be shown in UTF-8 now
>   not ok 38 - commit --fixup into ISO-2022-JP from UTF-8
>
> Here's the verbose output of the first one:
>
> ----- snip! -----

Doesn't this tell "git am" that your log message ends here, and ...

> expecting success of 3900.17 'ISO-2022-JP should be shown in UTF-8 now':
>                 compare_with ISO-2022-JP "$TEST_DIRECTORY"/t3900/2-UTF-8.txt
>
> --- /Users/x/src/git/t/t3900/2-UTF-8.txt 2024-10-01 19:43:24.605230684 +0000
> +++ current     2025-12-08 21:52:45.786161909 +0000

... make the tool apply the patch to the file "current"?

> @@ -1,4 +1,4 @@
>  はれひほふ
>
>  しているのが、いるので。
> -濱浜ほれぷりぽれまびぐりろへ。
> +濱浜ほれぷりぽれまび$0$j$m$X!#
> not ok 17 - ISO-2022-JP should be shown in UTF-8 now
> 1..17
> ----- snap! -----

IOW, indent the displayed material used as an example in the
proposed log message.

> compare_with runs git show to display a commit message, which in this
> case here was encoded using ISO-2022-JP and is supposed to be reencoded
> to UTF-8, but git show only does that half-way -- the "$0$j$m$X!#" part
> is from the original ISO-2022-JP representation.
>
> That botched conversion is done by utf8.c::reencode_string_iconv().  It
> calls iconv(3) to do the actual work, initially with an output buffer of
> the same size as the input.  If the output needs more space the function
> enlarges the buffer and calls iconv(3) again.
>
> iconv(3) won't tell us how much space it needs, but it will report what
> part it already managed to convert, so we can increase the buffer and
> continue from there.  ISO-2022-JP has escape codes for switching between
> character sets, so it's a stateful encoding.  I guess the iconv(3) on my
> machine forgets the state at the end of part one and then messes up part
> two.
>
> [end of citation]
>
> Working around the buggy iconv shipped with the OS can be done in
> two ways:
> a) Link Git against a different version of iconv
> b) Improve the handling when iconv needs a larger output buffer
>
> a) is already done by default when either Fink [1] or MacPorts [2]
>    or Homebrew [3] is installed.
> b) is implemented here, in case that no fixed iconv is available:
>    When the output buffer is too short, increase it (as before)
>    and start from scratch (this is new).
>
> This workaround needs to be enabled with
> '#define ICONV_RESTART_RESET'
> and a Makefile knob will be added in the next commit.
>
> Suggested-by: René Scharfe <l.s.r@web.de>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
>
> [1] https://www.finkproject.org/
> [2] https://www.macports.org/
> [3] https://brew.sh/
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
>  utf8.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/utf8.c b/utf8.c
> index 35a0251939..96460cc414 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -515,6 +515,19 @@ char *reencode_string_iconv(const char *in, size_t insz, iconv_t conv,
>  			out = xrealloc(out, outalloc);
>  			outpos = out + sofar;
>  			outsz = outalloc - sofar - 1;
> +#ifdef ICONV_RESTART_RESET
> +			/*
> +			 * If iconv(3) messes up piecemeal conversions
> +			 * then restore the original pointers, sizes,
> +			 * and converter state, then retry converting
> +			 * the full string using the reallocated buffer.
> +			 */
> +			insz += cp - (iconv_ibp)in; /* Restore insz */
> +			cp = (iconv_ibp)in;         /* original start value */
> +			outpos = out + bom_len;     /* original start value */
> +			outsz = outalloc - bom_len - 1; /* new len */
> +			iconv(conv, NULL, NULL, NULL, NULL); /* reset iconv machinery */
> +#endif
>  		}
>  		else {
>  			*outpos = '\0';
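
For illustration, the "enlarge the buffer and start from scratch"
approach described in the quoted log message can be sketched as a
standalone C helper.  This is only a sketch under assumptions, not
Git's reencode_string_iconv(): the name convert_with_restart() and its
parameters are hypothetical, it uses plain realloc() instead of
xrealloc(), and it leaves out the BOM handling.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not Git code):
 * convert in[0..insz) from encoding `from` to encoding `to`.
 * When the output buffer is too small (E2BIG), enlarge it, reset the
 * converter, and restart from the first input byte instead of
 * continuing where the previous call stopped.
 * Returns a malloc'd NUL-terminated string, or NULL on failure.
 */
char *convert_with_restart(const char *in, size_t insz,
			   const char *to, const char *from,
			   size_t outalloc)
{
	iconv_t conv = iconv_open(to, from);
	char *out = NULL;

	if (conv == (iconv_t)-1)
		return NULL;
	if (!outalloc)
		outalloc = 1;

	for (;;) {
		char *tmp = realloc(out, outalloc + 1); /* +1 for NUL */
		char *cp = (char *)in;	/* iconv() wants char **, input is not modified */
		char *outpos;
		size_t inleft = insz, outleft = outalloc;

		if (!tmp)
			break;
		out = tmp;
		outpos = out;

		/* Put the converter back into its initial shift state. */
		iconv(conv, NULL, NULL, NULL, NULL);

		if (iconv(conv, &cp, &inleft, &outpos, &outleft) != (size_t)-1 &&
		    /* Flush any pending shift sequence into the output. */
		    iconv(conv, NULL, NULL, &outpos, &outleft) != (size_t)-1) {
			*outpos = '\0';
			iconv_close(conv);
			return out;
		}
		if (errno != E2BIG)
			break;		/* invalid or incomplete input */
		outalloc *= 2;		/* too small: enlarge and start over */
	}
	free(out);
	iconv_close(conv);
	return NULL;
}
```

Calling it with a deliberately small initial size, e.g.
convert_with_restart(buf, len, "UTF-8", "ISO-2022-JP", 1), exercises
the restart path.  The price is that the same input may be converted
several times, but no call depends on the converter carrying its
shift state over from an earlier, partial conversion.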


Thread overview: 5+ messages
2026-01-11 19:51 [PATCH v2 1/2] utf8.c: Prepare workaround for iconv under macOS 14/15 tboegi
2026-01-12  4:58 ` Junio C Hamano [this message]
2026-01-12 16:25   ` [PATCH v3 0/2] Workaround " tboegi
2026-01-12 16:25   ` [PATCH v3 1/2] utf8.c: Prepare workaround " tboegi
2026-01-12 16:25   ` [PATCH v3 2/2] utf8.c: Enable " tboegi
