From: Ben Knoble <ben.knoble@gmail.com>
To: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [RFC] send-email: UTF-8 encoding in subject line
Date: Mon, 23 Feb 2026 16:38:31 -0500
Message-ID: <43DCEEB9-33C4-4EE2-9FF3-49DCB9B837E0@gmail.com>
In-Reply-To: <20260222155559.1777883-1-shreyanshpaliwalcmsmn@gmail.com>
> On Feb 22, 2026, at 10:56, Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> wrote:
>
>
>>
>>> On Sun, Feb 22, 2026 at 9:07 AM Shreyansh Paliwal
>>> <shreyanshpaliwalcmsmn@gmail.com> wrote:
>>>
>>>>> That makes sense, I tried it below.
>>>>> I also wondered whether, in addition to this, it might be helpful to warn on
>>>>> an invalid charset, and/or possibly fall back to UTF-8.
>>>>
>>>> Agreed on the first half of the statement, if we have an easy and
>>>> portable way to tell if a given random string names a valid charset.
>>>> I do not recommend to "fall back" to anything, if we are asking an
>>>> input from the user.
>>>
>>> Following up on this, I tried adding a warning when the provided charset
>>> does not appear to be valid. Current flow is,
>>>
>>> Which 8bit encoding should I declare [UTF-8]? y
>>> Are you sure you want to use <y> [y/N]? y
>>>
>>> With the additional check, it becomes,
>>>
>>> Which 8bit encoding should I declare [default: UTF-8]? y
>>> warning: 'y' does not appear to be a valid charset name.
>>> Are you sure you want to use <y> [y/N]?
>>>
>>> This uses find_encoding() from Perl’s Encode module to detect any
>>> unrecognized charset names.
>>>
>>> Let me know what you think.
>>> Also, is there any new test that should be added for this change?
>>>
>>> Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
>>> ---
>>> git-send-email.perl | 23 ++++++++++++++++++++---
>>> 1 file changed, 20 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/git-send-email.perl b/git-send-email.perl
>>> index cd4b316ddc..e62fa259ba 100755
>>> --- a/git-send-email.perl
>>> +++ b/git-send-email.perl
>>> @@ -23,6 +23,7 @@
>>> use Git::LoadCPAN::Error qw(:try);
>>> use Git;
>>> use Git::I18N;
>>> +use Encode qw(find_encoding);
>>>
>>> Getopt::Long::Configure qw/ pass_through /;
>>>
>>> @@ -1044,9 +1045,25 @@ sub file_declares_8bit_cte {
>>> foreach my $f (sort keys %broken_encoding) {
>>> print " $f\n";
>>> }
>>> - $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
>>> - valid_re => qr/.{4}/, confirm_only => 1,
>>> - default => "UTF-8");
>>> + while (1) {
>>> + my $encoding = ask(__("Which 8bit encoding should I declare [default: UTF-8]? "),
>>> + valid_re => qr/^\S+$/,
>>> + default => "UTF-8");
>>
>> Here we change things, right?
>>
>> - The original validation is "at least 4 characters", the new
>> validation is "at least one non-blank." I'm not sure why we'd prefer
>> one or the other, frankly. The original goes to 852a15d748
>> (send-email: ask confirmation if given encoding name is very short,
>> 2015-02-13), which is motivated by the same problem we're discussing
>> here!
>
> I see.
> My understanding of the earlier change (852a15d748) is that the
> length check was intended as a heuristic check to catch obviously invalid
> inputs like "y" and trigger an extra confirmation based on the fact that
> charset names would be at least 4 letters.
>
> With the additional find_encoding() check, the validation becomes semantic
> rather than length-based: recognized charset names are accepted directly,
> while unrecognized ones trigger a warning and still require explicit
> confirmation. The relaxed regex (at least one non-blank) is only meant to
> ensure we receive some non-empty input before passing it to find_encoding().
>
>> - We get rid of confirm_only, since we're about to roll our own
>> confirmation below:
>>
>>> + next unless defined $encoding;
>>> + if (find_encoding($encoding)) {
>>> + $auto_8bit_encoding = $encoding;
>>> + last;
>>> + }
>>> + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
>>> + my $yesno = ask(
>>> + sprintf(__("Are you sure you want to use <%s> [y/N]? "), $encoding),
>>> + valid_re => qr/^(?:y|n)/i,
>>> + default => 'n');
>>
>> …which might want refactored a bit so it can stay close to the original? idk.
>>
>
> Actually, the flow needed to change slightly to insert the validity warning
> before the final confirmation step. Since ask() handles confirmation internally
> using confirm_only and is used in multiple places, it seemed simpler to keep the
> additional confirmation local here rather than modifying ask() itself.
>
> Let me know what you think.
>
> Best,
> Shreyansh
Ah, my mistake for being ambiguous. I meant:
The code is similar enough to the original that perhaps a helper can be introduced, or at least we should keep the equivalent strings together to help those who change one.
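
To make the helper idea concrete, here is a rough sketch; it is not taken from the patch. The names choose_8bit_encoding, %PROMPT and the $ask callback are hypothetical: $ask stands in for git-send-email's ask() so the snippet runs on its own, and the i18n wrapper __() is left out. The only real API used is find_encoding() from Perl's Encode module, which returns a false value for names it does not recognize.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical: both prompt strings sit side by side, so anyone
# changing one of them sees the other.
my %PROMPT = (
    which   => "Which 8bit encoding should I declare [default: UTF-8]? ",
    confirm => "Are you sure you want to use <%s> [y/N]? ",
);

# Hypothetical helper. $ask is a code ref that takes a prompt string
# and returns the user's answer (or undef); it is a stand-in for ask().
sub choose_8bit_encoding {
    my ($ask) = @_;
    while (1) {
        my $encoding = $ask->($PROMPT{which});
        # Assumption: an empty or missing answer means "take the default".
        $encoding = "UTF-8" unless defined $encoding && length $encoding;

        # Names that Encode recognizes are accepted without confirmation.
        return $encoding if find_encoding($encoding);

        warn sprintf("warning: '%s' does not appear to be a valid charset name.\n",
                     $encoding);
        my $yesno = $ask->(sprintf($PROMPT{confirm}, $encoding));
        return $encoding if defined $yesno && $yesno =~ /^y/i;
        # Anything else: loop and ask for the encoding again.
    }
}

# Usage with canned answers instead of a terminal: "y" is rejected,
# the confirmation is declined, then a recognized name is given.
my @answers = ("y", "n", "ISO-8859-1");
my $enc = choose_8bit_encoding(sub { shift @answers });
print "$enc\n";
```

Passing the prompt function in as a callback is only there to keep the sketch testable; in git-send-email.perl the helper would presumably call ask() directly.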