* Re: [PATCH] send-email: validate charset name in 8bit encoding prompt
2026-02-24 14:33 ` [PATCH] send-email: validate charset name in 8bit encoding prompt Shreyansh Paliwal
@ 2026-02-24 21:11 ` Junio C Hamano
2026-02-24 21:37 ` [PATCH v2] " Shreyansh Paliwal
` (2 subsequent siblings)
3 siblings, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2026-02-24 21:11 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, ben.knoble, philipoakley
Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
> + if ($warn_invalid) {
> + if (find_encoding($resp))
> + return $resp;
> + else
> + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
> + } else
> + return $resp;
This is not C but Perl.
git-send-email.perl | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git i/git-send-email.perl w/git-send-email.perl
index dc4e5418d3..15387ac377 100755
--- i/git-send-email.perl
+++ w/git-send-email.perl
@@ -1008,12 +1008,14 @@ sub ask {
}
if (!defined $valid_re or $resp =~ /$valid_re/) {
if ($warn_invalid) {
- if (find_encoding($resp))
+ if (find_encoding($resp)) {
return $resp;
- else
+ } else {
printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
- } else
+ }
+ } else {
return $resp;
+ }
}
if ($confirm_only) {
my $yesno = $term->readline(
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH v2] send-email: validate charset name in 8bit encoding prompt
2026-02-24 14:33 ` [PATCH] send-email: validate charset name in 8bit encoding prompt Shreyansh Paliwal
2026-02-24 21:11 ` Junio C Hamano
@ 2026-02-24 21:37 ` Shreyansh Paliwal
2026-02-24 22:06 ` Junio C Hamano
2026-02-25 16:37 ` D. Ben Knoble
2026-02-26 16:16 ` [PATCH v3] " Shreyansh Paliwal
2026-02-28 11:20 ` [PATCH v4] " Shreyansh Paliwal
3 siblings, 2 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-24 21:37 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley, Shreyansh Paliwal
When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,
Which 8bit encoding should I declare [UTF-8]? foo
After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,
Are you sure you want to use <foo> [y/N]? y
This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.
Add a semantic validation of the charset name using the
Encode::find_encoding() module of perl. If the encoding is not recognized,
warn the user and ask for confirmation before proceeding. After this
validation the lenght based validation becomes redundant and also breaks
flow, so change the regex of valid input to any non blank string.
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.
The intended flow is,
Declare which 8bit encoding to use [default: UTF-8]? foobar
warning: 'foobar' does not appear to be a valid charset name.
Are you sure you want to use <foobar> [y/N]?
[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
---
Changes in v2:
- Added braces in if-else block.
git-send-email.perl | 17 ++++++++++++++---
t/t9001-send-email.sh | 2 +-
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index cd4b316ddc..15387ac377 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -23,6 +23,7 @@
use Git::LoadCPAN::Error qw(:try);
use Git;
use Git::I18N;
+use Encode qw(find_encoding);
Getopt::Long::Configure qw/ pass_through /;
@@ -987,6 +988,7 @@ sub get_patch_subject {
sub ask {
my ($prompt, %arg) = @_;
my $valid_re = $arg{valid_re};
+ my $warn_invalid = $arg{warn_invalid};
my $default = $arg{default};
my $confirm_only = $arg{confirm_only};
my $resp;
@@ -1005,7 +1007,15 @@ sub ask {
return $default;
}
if (!defined $valid_re or $resp =~ /$valid_re/) {
- return $resp;
+ if ($warn_invalid) {
+ if (find_encoding($resp)) {
+ return $resp;
+ } else {
+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
+ }
+ } else {
+ return $resp;
+ }
}
if ($confirm_only) {
my $yesno = $term->readline(
@@ -1044,8 +1054,9 @@ sub file_declares_8bit_cte {
foreach my $f (sort keys %broken_encoding) {
print " $f\n";
}
- $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
- valid_re => qr/.{4}/, confirm_only => 1,
+ $auto_8bit_encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
+ valid_re => qr/^\S+$/, confirm_only => 1,
+ warn_invalid => 1,
default => "UTF-8");
}
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e56e0c8d77..24f6c76aee 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
email-using-8bit >stdout &&
grep "do not declare a Content-Transfer-Encoding" stdout &&
grep email-using-8bit stdout &&
- grep "Which 8bit encoding" stdout &&
+ grep "Declare which 8bit encoding to use" stdout &&
grep -E "Content|MIME" msgtxt1 >actual &&
test_cmp content-type-decl actual
'
Range-diff against v1:
1: 70fa4d2899 ! 1: 954c1dae9f send-email: validate charset name in 8bit encoding prompt
@@ git-send-email.perl: sub ask {
if (!defined $valid_re or $resp =~ /$valid_re/) {
- return $resp;
+ if ($warn_invalid) {
-+ if (find_encoding($resp))
++ if (find_encoding($resp)) {
+ return $resp;
-+ else
++ } else {
+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
-+ } else
++ }
++ } else {
+ return $resp;
++ }
}
if ($confirm_only) {
my $yesno = $term->readline(
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH v2] send-email: validate charset name in 8bit encoding prompt
2026-02-24 21:37 ` [PATCH v2] " Shreyansh Paliwal
@ 2026-02-24 22:06 ` Junio C Hamano
2026-02-24 22:20 ` Shreyansh Paliwal
2026-02-25 16:37 ` D. Ben Knoble
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2026-02-24 22:06 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, ben.knoble, philipoakley
Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
> When a non-ASCII character is detected in the body or subject of the email
> the user is prompted with,
>
> Which 8bit encoding should I declare [UTF-8]? foo
>
> After this the input string is validated by the regex, based on the fact
> that the charset string will be minimum 4 characters [1]. If the string is
> more than 4 letters the email is sent, if not then a second prompt to
> confirm is asked to the user,
>
> Are you sure you want to use <foo> [y/N]? y
>
> This relies on a length based regex heuristic check to validate the user
> input, and can allow clearly invalid charset names to pass if the input is
> greater than 4 characters.
>
> Add a semantic validation of the charset name using the
> Encode::find_encoding() module of perl. If the encoding is not recognized,
> warn the user and ask for confirmation before proceeding. After this
> validation the lenght based validation becomes redundant and also breaks
> flow, so change the regex of valid input to any non blank string.
>
> Additionally, the wording of the first prompt can confuse the user if not
> read properly or under any default assumptions for a yes/no prompt. Change
> the wording to make it explicitly clear to the user that the prompt needs a
> string input, UTF-8 being the default.
>
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> warning: 'foobar' does not appear to be a valid charset name.
> Are you sure you want to use <foobar> [y/N]?
>
> [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
>
> Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> ---
> Changes in v2:
> - Added braces in if-else block.
>
> git-send-email.perl | 17 ++++++++++++++---
> t/t9001-send-email.sh | 2 +-
> 2 files changed, 15 insertions(+), 4 deletions(-)
Curious. This change to t9001 was there even in the previous
iteration that did not even work. How did you test it?
Will replace.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] send-email: validate charset name in 8bit encoding prompt
2026-02-24 22:06 ` Junio C Hamano
@ 2026-02-24 22:20 ` Shreyansh Paliwal
0 siblings, 0 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-24 22:20 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley
> Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
>
> > When a non-ASCII character is detected in the body or subject of the email
> > the user is prompted with,
> >
> > Which 8bit encoding should I declare [UTF-8]? foo
> >
> > After this the input string is validated by the regex, based on the fact
> > that the charset string will be minimum 4 characters [1]. If the string is
> > more than 4 letters the email is sent, if not then a second prompt to
> > confirm is asked to the user,
> >
> > Are you sure you want to use <foo> [y/N]? y
> >
> > This relies on a length based regex heuristic check to validate the user
> > input, and can allow clearly invalid charset names to pass if the input is
> > greater than 4 characters.
> >
> > Add a semantic validation of the charset name using the
> > Encode::find_encoding() module of perl. If the encoding is not recognized,
> > warn the user and ask for confirmation before proceeding. After this
> > validation the lenght based validation becomes redundant and also breaks
> > flow, so change the regex of valid input to any non blank string.
> >
> > Additionally, the wording of the first prompt can confuse the user if not
> > read properly or under any default assumptions for a yes/no prompt. Change
> > the wording to make it explicitly clear to the user that the prompt needs a
> > string input, UTF-8 being the default.
> >
> > The intended flow is,
> >
> > Declare which 8bit encoding to use [default: UTF-8]? foobar
> > warning: 'foobar' does not appear to be a valid charset name.
> > Are you sure you want to use <foobar> [y/N]?
> >
> > [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
> >
> > Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> > ---
> > Changes in v2:
> > - Added braces in if-else block.
> >
> > git-send-email.perl | 17 ++++++++++++++---
> > t/t9001-send-email.sh | 2 +-
> > 2 files changed, 15 insertions(+), 4 deletions(-)
>
> Curious. This change to t9001 was there even in the previous
> iteration that did not even work. How did you test it?
Initially I had the braces in place, when I made the wording change to the failing
test, so all the tests passed. But just before sending the patch, I saw that the
indentation looked a bit off in the annotated git send-email, so I fixed that and
also removed what I thought were unnecessary braces, but I neglected to rerun the
tests after that. My bad.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] send-email: validate charset name in 8bit encoding prompt
2026-02-24 21:37 ` [PATCH v2] " Shreyansh Paliwal
2026-02-24 22:06 ` Junio C Hamano
@ 2026-02-25 16:37 ` D. Ben Knoble
2026-02-26 17:32 ` Shreyansh Paliwal
1 sibling, 1 reply; 27+ messages in thread
From: D. Ben Knoble @ 2026-02-25 16:37 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, gitster, philipoakley
On Tue, Feb 24, 2026 at 4:39 PM Shreyansh Paliwal
<shreyanshpaliwalcmsmn@gmail.com> wrote:
>
> When a non-ASCII character is detected in the body or subject of the email
> the user is prompted with,
>
> Which 8bit encoding should I declare [UTF-8]? foo
>
> After this the input string is validated by the regex, based on the fact
> that the charset string will be minimum 4 characters [1]. If the string is
> more than 4 letters the email is sent, if not then a second prompt to
> confirm is asked to the user,
>
> Are you sure you want to use <foo> [y/N]? y
>
> This relies on a length based regex heuristic check to validate the user
> input, and can allow clearly invalid charset names to pass if the input is
> greater than 4 characters.
>
> Add a semantic validation of the charset name using the
> Encode::find_encoding() module of perl. If the encoding is not recognized,
> warn the user and ask for confirmation before proceeding. After this
> validation the lenght based validation becomes redundant and also breaks
> flow, so change the regex of valid input to any non blank string.
>
> Additionally, the wording of the first prompt can confuse the user if not
> read properly or under any default assumptions for a yes/no prompt. Change
> the wording to make it explicitly clear to the user that the prompt needs a
> string input, UTF-8 being the default.
>
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> warning: 'foobar' does not appear to be a valid charset name.
> Are you sure you want to use <foobar> [y/N]?
>
> [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
>
> Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> ---
> Changes in v2:
> - Added braces in if-else block.
>
> git-send-email.perl | 17 ++++++++++++++---
> t/t9001-send-email.sh | 2 +-
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index cd4b316ddc..15387ac377 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -23,6 +23,7 @@
> use Git::LoadCPAN::Error qw(:try);
> use Git;
> use Git::I18N;
> +use Encode qw(find_encoding);
>
> Getopt::Long::Configure qw/ pass_through /;
>
> @@ -987,6 +988,7 @@ sub get_patch_subject {
> sub ask {
> my ($prompt, %arg) = @_;
> my $valid_re = $arg{valid_re};
> + my $warn_invalid = $arg{warn_invalid};
> my $default = $arg{default};
> my $confirm_only = $arg{confirm_only};
> my $resp;
> @@ -1005,7 +1007,15 @@ sub ask {
> return $default;
> }
> if (!defined $valid_re or $resp =~ /$valid_re/) {
> - return $resp;
> + if ($warn_invalid) {
> + if (find_encoding($resp)) {
> + return $resp;
> + } else {
> + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
> + }
> + } else {
> + return $resp;
> + }
I think this is asking "ask" to do too much, since only encoding
askers can use warn_invalid.
What I rather meant was to extract relevant helper procedures so that
open-coding ask around the encoding question would be easier to
maintain.
> }
> if ($confirm_only) {
> my $yesno = $term->readline(
> @@ -1044,8 +1054,9 @@ sub file_declares_8bit_cte {
> foreach my $f (sort keys %broken_encoding) {
> print " $f\n";
> }
> - $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
> - valid_re => qr/.{4}/, confirm_only => 1,
> + $auto_8bit_encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
> + valid_re => qr/^\S+$/, confirm_only => 1,
> + warn_invalid => 1,
> default => "UTF-8");
> }
>
> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
> index e56e0c8d77..24f6c76aee 100755
> --- a/t/t9001-send-email.sh
> +++ b/t/t9001-send-email.sh
> @@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
> email-using-8bit >stdout &&
> grep "do not declare a Content-Transfer-Encoding" stdout &&
> grep email-using-8bit stdout &&
> - grep "Which 8bit encoding" stdout &&
> + grep "Declare which 8bit encoding to use" stdout &&
> grep -E "Content|MIME" msgtxt1 >actual &&
> test_cmp content-type-decl actual
> '
>
> Range-diff against v1:
> 1: 70fa4d2899 ! 1: 954c1dae9f send-email: validate charset name in 8bit encoding prompt
> @@ git-send-email.perl: sub ask {
> if (!defined $valid_re or $resp =~ /$valid_re/) {
> - return $resp;
> + if ($warn_invalid) {
> -+ if (find_encoding($resp))
> ++ if (find_encoding($resp)) {
> + return $resp;
> -+ else
> ++ } else {
> + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
> -+ } else
> ++ }
> ++ } else {
> + return $resp;
> ++ }
> }
> if ($confirm_only) {
> my $yesno = $term->readline(
> --
> 2.53.0
--
D. Ben Knoble
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH v2] send-email: validate charset name in 8bit encoding prompt
2026-02-25 16:37 ` D. Ben Knoble
@ 2026-02-26 17:32 ` Shreyansh Paliwal
0 siblings, 0 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-26 17:32 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley
> On Tue, Feb 24, 2026 at 4:39 PM Shreyansh Paliwal
> <shreyanshpaliwalcmsmn@gmail.com> wrote:
> >
> > When a non-ASCII character is detected in the body or subject of the email
> > the user is prompted with,
> >
> > Which 8bit encoding should I declare [UTF-8]? foo
> >
> > After this the input string is validated by the regex, based on the fact
> > that the charset string will be minimum 4 characters [1]. If the string is
> > more than 4 letters the email is sent, if not then a second prompt to
> > confirm is asked to the user,
> >
> > Are you sure you want to use <foo> [y/N]? y
> >
> > This relies on a length based regex heuristic check to validate the user
> > input, and can allow clearly invalid charset names to pass if the input is
> > greater than 4 characters.
> >
> > Add a semantic validation of the charset name using the
> > Encode::find_encoding() module of perl. If the encoding is not recognized,
> > warn the user and ask for confirmation before proceeding. After this
> > validation the lenght based validation becomes redundant and also breaks
> > flow, so change the regex of valid input to any non blank string.
> >
> > Additionally, the wording of the first prompt can confuse the user if not
> > read properly or under any default assumptions for a yes/no prompt. Change
> > the wording to make it explicitly clear to the user that the prompt needs a
> > string input, UTF-8 being the default.
> >
> > The intended flow is,
> >
> > Declare which 8bit encoding to use [default: UTF-8]? foobar
> > warning: 'foobar' does not appear to be a valid charset name.
> > Are you sure you want to use <foobar> [y/N]?
> >
> > [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
> >
> > Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> > ---
> > Changes in v2:
> > - Added braces in if-else block.
> >
> > git-send-email.perl | 17 ++++++++++++++---
> > t/t9001-send-email.sh | 2 +-
> > 2 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/git-send-email.perl b/git-send-email.perl
> > index cd4b316ddc..15387ac377 100755
> > --- a/git-send-email.perl
> > +++ b/git-send-email.perl
> > @@ -23,6 +23,7 @@
> > use Git::LoadCPAN::Error qw(:try);
> > use Git;
> > use Git::I18N;
> > +use Encode qw(find_encoding);
> >
> > Getopt::Long::Configure qw/ pass_through /;
> >
> > @@ -987,6 +988,7 @@ sub get_patch_subject {
> > sub ask {
> > my ($prompt, %arg) = @_;
> > my $valid_re = $arg{valid_re};
> > + my $warn_invalid = $arg{warn_invalid};
> > my $default = $arg{default};
> > my $confirm_only = $arg{confirm_only};
> > my $resp;
> > @@ -1005,7 +1007,15 @@ sub ask {
> > return $default;
> > }
> > if (!defined $valid_re or $resp =~ /$valid_re/) {
> > - return $resp;
> > + if ($warn_invalid) {
> > + if (find_encoding($resp)) {
> > + return $resp;
> > + } else {
> > + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
> > + }
> > + } else {
> > + return $resp;
> > + }
>
> I think this is asking "ask" to do too much, since only encoding
> askers can use warn_invalid.
>
> What I rather meant was to extract relevant helper procedures so that
> open-coding ask around the encoding question would be easier to
> maintain.
Hi,
I have sent a v3 on this, in which I introduced a helper for the confirmation
prompt and also made validation logic specific to the 8bit prompt.
Do you think it would also be better to move the encoding validation
into a separate helper, or does the current split look reasonable?
Best,
Shreyansh
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v3] send-email: validate charset name in 8bit encoding prompt
2026-02-24 14:33 ` [PATCH] send-email: validate charset name in 8bit encoding prompt Shreyansh Paliwal
2026-02-24 21:11 ` Junio C Hamano
2026-02-24 21:37 ` [PATCH v2] " Shreyansh Paliwal
@ 2026-02-26 16:16 ` Shreyansh Paliwal
2026-02-26 18:45 ` Junio C Hamano
2026-02-28 11:20 ` [PATCH v4] " Shreyansh Paliwal
3 siblings, 1 reply; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-26 16:16 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley, Shreyansh Paliwal
When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,
Which 8bit encoding should I declare [UTF-8]? foo
After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,
Are you sure you want to use <foo> [y/N]? y
This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.
Add a semantic validation of the charset name using the
Encode::find_encoding() module of perl. If the encoding is not recognized,
warn the user and ask for confirmation before proceeding. After this
validation the lenght based validation becomes redundant and also breaks
flow, so change the regex of valid input to any non blank string.
Introduce a dedicated helper for confirmation handling that can be reused
both by ask() and the custom 8bit prompt flow. Make the encoding warning
logic specific to the 8bit prompt, this reduces the load on ask(), and
improves maintainability.
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.
The intended flow is,
Declare which 8bit encoding to use [default: UTF-8]? foobar
warning: 'foobar' does not appear to be a valid charset name.
Are you sure you want to use <foobar> [y/N]?
[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
---
Changes in v2:
- Added a helper function confirm_ask() to handle yes/no confirmation prompts.
- Added the validation and warning logic in the 8bit prompt instead of ask().
git-send-email.perl | 36 +++++++++++++++++++++++++++++-------
t/t9001-send-email.sh | 2 +-
2 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index cd4b316ddc..3230b80701 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -23,6 +23,7 @@
use Git::LoadCPAN::Error qw(:try);
use Git;
use Git::I18N;
+use Encode qw(find_encoding);
Getopt::Long::Configure qw/ pass_through /;
@@ -984,6 +985,18 @@ sub get_patch_subject {
}
}
+sub confirm_ask {
+ my ($resp) = @_;
+ my $term = term();
+ return 0
+ unless defined $term->IN and defined fileno($term->IN) and
+ defined $term->OUT and defined fileno($term->OUT);
+ my $yesno = $term->readline(
+ # TRANSLATORS: please keep [y/N] as is.
+ sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
+ return defined $yesno && $yesno =~ /y/i;
+}
+
sub ask {
my ($prompt, %arg) = @_;
my $valid_re = $arg{valid_re};
@@ -1008,10 +1021,7 @@ sub ask {
return $resp;
}
if ($confirm_only) {
- my $yesno = $term->readline(
- # TRANSLATORS: please keep [y/N] as is.
- sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
- if (defined $yesno && $yesno =~ /y/i) {
+ if (confirm_ask($resp)) {
return $resp;
}
}
@@ -1044,9 +1054,21 @@ sub file_declares_8bit_cte {
foreach my $f (sort keys %broken_encoding) {
print " $f\n";
}
- $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
- valid_re => qr/.{4}/, confirm_only => 1,
- default => "UTF-8");
+ while(1) {
+ my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
+ valid_re => qr/^\S+$/,
+ default => "UTF-8");
+ next unless defined $encoding;
+ if (find_encoding($encoding)) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
+ if (confirm_ask($encoding)) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
+ }
}
if (!$force) {
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e56e0c8d77..24f6c76aee 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
email-using-8bit >stdout &&
grep "do not declare a Content-Transfer-Encoding" stdout &&
grep email-using-8bit stdout &&
- grep "Which 8bit encoding" stdout &&
+ grep "Declare which 8bit encoding to use" stdout &&
grep -E "Content|MIME" msgtxt1 >actual &&
test_cmp content-type-decl actual
'
Range-diff against v2:
1: 954c1dae9f ! 1: 748bb03a00 send-email: validate charset name in 8bit encoding prompt
@@ Commit message
When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,
- Which 8bit encoding should I declare [UTF-8]? foo
+ Which 8bit encoding should I declare [UTF-8]? foo
After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,
- Are you sure you want to use <foo> [y/N]? y
+ Are you sure you want to use <foo> [y/N]? y
This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
@@ Commit message
validation the lenght based validation becomes redundant and also breaks
flow, so change the regex of valid input to any non blank string.
+ Introduce a dedicated helper for confirmation handling that can be reused
+ both by ask() and the custom 8bit prompt flow. This makes the encoding
+ warning logic specific to the 8bit prompt, reduces the load on ask(), and
+ improves maintainability.
+
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.
+ The intended flow is,
+
+ Declare which 8bit encoding to use [default: UTF-8]? foobar
+ warning: 'foobar' does not appear to be a valid charset name.
+ Are you sure you want to use <foobar> [y/N]?
+
+ [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
+
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
## git-send-email.perl ##
@@ git-send-email.perl
Getopt::Long::Configure qw/ pass_through /;
@@ git-send-email.perl: sub get_patch_subject {
+ }
+ }
+
++sub confirm_ask {
++ my ($resp) = @_;
++ my $term = term();
++ return 0
++ unless defined $term->IN and defined fileno($term->IN) and
++ defined $term->OUT and defined fileno($term->OUT);
++ my $yesno = $term->readline(
++ # TRANSLATORS: please keep [y/N] as is.
++ sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
++ return defined $yesno && $yesno =~ /y/i;
++}
++
sub ask {
my ($prompt, %arg) = @_;
my $valid_re = $arg{valid_re};
-+ my $warn_invalid = $arg{warn_invalid};
- my $default = $arg{default};
- my $confirm_only = $arg{confirm_only};
- my $resp;
@@ git-send-email.perl: sub ask {
- return $default;
- }
- if (!defined $valid_re or $resp =~ /$valid_re/) {
-- return $resp;
-+ if ($warn_invalid) {
-+ if (find_encoding($resp)) {
-+ return $resp;
-+ } else {
-+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $resp;
-+ }
-+ } else {
-+ return $resp;
-+ }
+ return $resp;
}
if ($confirm_only) {
- my $yesno = $term->readline(
+- my $yesno = $term->readline(
+- # TRANSLATORS: please keep [y/N] as is.
+- sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
+- if (defined $yesno && $yesno =~ /y/i) {
++ if (confirm_ask($resp)) {
+ return $resp;
+ }
+ }
@@ git-send-email.perl: sub file_declares_8bit_cte {
foreach my $f (sort keys %broken_encoding) {
print " $f\n";
}
- $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
- valid_re => qr/.{4}/, confirm_only => 1,
-+ $auto_8bit_encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
-+ valid_re => qr/^\S+$/, confirm_only => 1,
-+ warn_invalid => 1,
- default => "UTF-8");
+- default => "UTF-8");
++ while(1) {
++ my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
++ valid_re => qr/^\S+$/,
++ default => "UTF-8");
++ next unless defined $encoding;
++ if (find_encoding($encoding)) {
++ $auto_8bit_encoding = $encoding;
++ last;
++ }
++ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
++ if (confirm_ask($encoding)) {
++ $auto_8bit_encoding = $encoding;
++ last;
++ }
++ }
}
+ if (!$force) {
## t/t9001-send-email.sh ##
@@ t/t9001-send-email.sh: test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
--
2.53.0.154.g7c02d39fc2.dirty
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH v3] send-email: validate charset name in 8bit encoding prompt
2026-02-26 16:16 ` [PATCH v3] " Shreyansh Paliwal
@ 2026-02-26 18:45 ` Junio C Hamano
2026-02-26 19:06 ` Junio C Hamano
2026-02-28 8:36 ` Shreyansh Paliwal
0 siblings, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2026-02-26 18:45 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, ben.knoble, philipoakley
Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
> diff --git a/git-send-email.perl b/git-send-email.perl
> index cd4b316ddc..3230b80701 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -23,6 +23,7 @@
> use Git::LoadCPAN::Error qw(:try);
> use Git;
> use Git::I18N;
> +use Encode qw(find_encoding);
I wonder how common is this module already installed on users'
systems (not asking "how widely available"---which is "can users
easily make it work?", but asking "would this work out of box with
what users already have?").
> +sub confirm_ask {
> + my ($resp) = @_;
> + my $term = term();
> + return 0
> + unless defined $term->IN and defined fileno($term->IN) and
> + defined $term->OUT and defined fileno($term->OUT);
> + my $yesno = $term->readline(
> + # TRANSLATORS: please keep [y/N] as is.
> + sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
> + return defined $yesno && $yesno =~ /y/i;
> +}
This is a bit incosistent with what "sub ask" (the only caller of
this sub) does, isn't it? Before entering the loop that makes a
call into this, it does this:
sub ask {
my ($prompt, %arg) = @_;
my $valid_re = $arg{valid_re};
my $default = $arg{default};
my $confirm_only = $arg{confirm_only};
my $resp;
my $i = 0;
my $term = term();
return defined $default ? $default : undef
unless defined $term->IN and defined fileno($term->IN) and
defined $term->OUT and defined fileno($term->OUT);
If $term is not usable for interactive prompt, it uses the default
setting. But the new confirm_ask always says "no".
confirm_ask does its own "check term() to see it is usable" because
it is called from another code path which does not have its own
logic, but it may be a wrong abstraction to give uneven interface.
It would make it more clear what is going on if you just do the
interactive $term->readline() thing in "sub ask", instead of calling
"sub confirm_ask" that does tghe $term thing redundantly.
Can't the other confirm_ask() caller call a normal "sub ask"?
I am not sure why we want to add a dedicated sub, just to ask "are
you sure you want to use X [y/N]? ".
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> warning: 'foobar' does not appear to be a valid charset name.
> Are you sure you want to use <foobar> [y/N]?
It somehow looks uneven to have three lines, two of them
capitalizing their first word while the other one is all lowercase.
I wonder if this would be simpler?
Declare which 8bit encoding to use [default: UTF-8]? foobar<RET>
Do you really mean 'foobar', not a valid charset name [y/N]?
So, taking all of the above together, perhaps:
* Discard changes to "sub ask" and addition of "sub confirm_ask".
* Tweak this part a bit to call ask().
> + while(1) {
Style. missing SP before "(".
> + my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
Overly long line.
> + valid_re => qr/^\S+$/,
> + default => "UTF-8");
> + next unless defined $encoding;
> + if (find_encoding($encoding)) {
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
> + if (confirm_ask($encoding)) {
Use ask() to ask
Do you really mean 'foobar', not a valid charset name [y/N]?
here, perhaps?
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> + }
> }
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH v3] send-email: validate charset name in 8bit encoding prompt
2026-02-26 18:45 ` Junio C Hamano
@ 2026-02-26 19:06 ` Junio C Hamano
2026-02-28 8:41 ` Shreyansh Paliwal
2026-02-28 8:36 ` Shreyansh Paliwal
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2026-02-26 19:06 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, ben.knoble, philipoakley
Junio C Hamano <gitster@pobox.com> writes:
> Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
>
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index cd4b316ddc..3230b80701 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -23,6 +23,7 @@
>> use Git::LoadCPAN::Error qw(:try);
>> use Git;
>> use Git::I18N;
>> +use Encode qw(find_encoding);
>
> I wonder how common is this module already installed on users'
> systems (not asking "how widely available"---which is "can users
> easily make it work?", but asking "would this work out of box with
> what users already have?").
Answering my own question: "yes".
We use Encode::find_encoding as well as Encode::{de,en}code in
gitweb and git-svn, so it is very likely that anybody who has a full
installation of Git would already have it on their system. Also
Encode.pm is distributed as part of Perl itself, if I am not
mistaken.
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH v3] send-email: validate charset name in 8bit encoding prompt
2026-02-26 19:06 ` Junio C Hamano
@ 2026-02-28 8:41 ` Shreyansh Paliwal
0 siblings, 0 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-28 8:41 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
> >
> >> diff --git a/git-send-email.perl b/git-send-email.perl
> >> index cd4b316ddc..3230b80701 100755
> >> --- a/git-send-email.perl
> >> +++ b/git-send-email.perl
> >> @@ -23,6 +23,7 @@
> >> use Git::LoadCPAN::Error qw(:try);
> >> use Git;
> >> use Git::I18N;
> >> +use Encode qw(find_encoding);
> >
> > I wonder how common is this module already installed on users'
> > systems (not asking "how widely available"---which is "can users
> > easily make it work?", but asking "would this work out of box with
> > what users already have?").
>
> Answering my own question: "yes".
>
> We use Encode::find_encoding as well as Encode::{de,en}code in
> gitweb and git-svn, so it is very likely that anybody who has a full
> installation of Git would already have it on their system. Also
> Encode.pm is distributed as part of Perl itself, if I am not
> mistaken.
That's right, Encode is bundled with Perl, so users do not need to
install anything extra, other than what is already required for
building Git.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v3] send-email: validate charset name in 8bit encoding prompt
2026-02-26 18:45 ` Junio C Hamano
2026-02-26 19:06 ` Junio C Hamano
@ 2026-02-28 8:36 ` Shreyansh Paliwal
1 sibling, 0 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-28 8:36 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley
[...]
> > +sub confirm_ask {
> > + my ($resp) = @_;
> > + my $term = term();
> > + return 0
> > + unless defined $term->IN and defined fileno($term->IN) and
> > + defined $term->OUT and defined fileno($term->OUT);
> > + my $yesno = $term->readline(
> > + # TRANSLATORS: please keep [y/N] as is.
> > + sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
> > + return defined $yesno && $yesno =~ /y/i;
> > +}
>
> This is a bit incosistent with what "sub ask" (the only caller of
> this sub) does, isn't it? Before entering the loop that makes a
> call into this, it does this:
>
> sub ask {
> my ($prompt, %arg) = @_;
> my $valid_re = $arg{valid_re};
> my $default = $arg{default};
> my $confirm_only = $arg{confirm_only};
> my $resp;
> my $i = 0;
> my $term = term();
> return defined $default ? $default : undef
> unless defined $term->IN and defined fileno($term->IN) and
> defined $term->OUT and defined fileno($term->OUT);
>
> If $term is not usable for interactive prompt, it uses the default
> setting. But the new confirm_ask always says "no".
>
> confirm_ask does its own "check term() to see it is usable" because
> it is called from another code path which does not have its own
> logic, but it may be a wrong abstraction to give uneven interface.
> It would make it more clear what is going on if you just do the
> interactive $term->readline() thing in "sub ask", instead of calling
> "sub confirm_ask" that does tghe $term thing redundantly.
>
> Can't the other confirm_ask() caller call a normal "sub ask"?
>
> I am not sure why we want to add a dedicated sub, just to ask "are
> you sure you want to use X [y/N]? ".
>
> > The intended flow is,
> >
> > Declare which 8bit encoding to use [default: UTF-8]? foobar
> > warning: 'foobar' does not appear to be a valid charset name.
> > Are you sure you want to use <foobar> [y/N]?
>
> It somehow looks uneven to have three lines, two of them
> capitalizing their first word while the other one is all lowercase.
> I wonder if this would be simpler?
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar<RET>
> Do you really mean 'foobar', not a valid charset name [y/N]?
>
Actually that makes sense, because if we need to add a special warning
in between the two prompts (what I was aiming for), either we need to
modify ask() to add the warning into the flow, or we had to seperate the
confirm_ask because we have to change the flow in any case, but if we
drop the additional warning, and instead warn/confirm together in the
second prompt we dont need this abstraction.
>
>
> So, taking all of the above together, perhaps:
>
> * Discard changes to "sub ask" and addition of "sub confirm_ask".
>
> * Tweak this part a bit to call ask().
>
> > + while(1) {
>
> Style. missing SP before "(".
>
> > + my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
>
> Overly long line.
>
my bad. will fix.
> > + valid_re => qr/^\S+$/,
> > + default => "UTF-8");
> > + next unless defined $encoding;
> > + if (find_encoding($encoding)) {
> > + $auto_8bit_encoding = $encoding;
> > + last;
> > + }
>
> > + printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
> > + if (confirm_ask($encoding)) {
>
> Use ask() to ask
>
> Do you really mean 'foobar', not a valid charset name [y/N]?
>
> here, perhaps?
>
Understood. I am hoping now this doesn't need any additional
abstraction as Ben suggested.
Sorry for the delay in response to the review.
I will send a reroll.
Thanks,
Shreyansh
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v4] send-email: validate charset name in 8bit encoding prompt
2026-02-24 14:33 ` [PATCH] send-email: validate charset name in 8bit encoding prompt Shreyansh Paliwal
` (2 preceding siblings ...)
2026-02-26 16:16 ` [PATCH v3] " Shreyansh Paliwal
@ 2026-02-28 11:20 ` Shreyansh Paliwal
2026-02-28 21:16 ` D. Ben Knoble
2026-03-02 16:10 ` Junio C Hamano
3 siblings, 2 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-02-28 11:20 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley, Shreyansh Paliwal
When a non-ASCII character is detected in the body or subject of the email
the user is prompted with,
Which 8bit encoding should I declare [UTF-8]? foo
After this the input string is validated by the regex, based on the fact
that the charset string will be minimum 4 characters [1]. If the string is
more than 4 letters the email is sent, if not then a second prompt to
confirm is asked to the user,
Are you sure you want to use <foo> [y/N]? y
This relies on a length based regex heuristic check to validate the user
input, and can allow clearly invalid charset names to pass if the input is
greater than 4 characters.
Add a semantic validation of the charset name using the
Encode::find_encoding() which is a bundled module of perl. If the encoding
is not recognized, warn the user and ask for confirmation before proceeding.
After this validation the lenght based validation becomes redundant and also
breaks flow, so change the regex of valid input to any non blank string.
Make the encoding warning logic specific to the 8bit prompt, also add a
unique confirmation prompt which reduces the load on ask(), and improves
maintainability.
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
the wording to make it explicitly clear to the user that the prompt needs a
string input, UTF-8 being the default.
The intended flow is,
Declare which 8bit encoding to use [default: UTF-8]? foobar
<foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
---
Changes in v4:
- removed the confirm_ask() helper and changes to ask().
- make a new warning/confirmation prompt specific to the 8bit encoding flow.
git-send-email.perl | 25 ++++++++++++++++++++++---
t/t9001-send-email.sh | 2 +-
2 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index cd4b316ddc..3186104709 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -23,6 +23,7 @@
use Git::LoadCPAN::Error qw(:try);
use Git;
use Git::I18N;
+use Encode qw(find_encoding);
Getopt::Long::Configure qw/ pass_through /;
@@ -1044,9 +1045,27 @@ sub file_declares_8bit_cte {
foreach my $f (sort keys %broken_encoding) {
print " $f\n";
}
- $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
- valid_re => qr/.{4}/, confirm_only => 1,
- default => "UTF-8");
+ while (1) {
+ my $encoding = ask(
+ __("Declare which 8bit encoding to use [default: UTF-8]? "),
+ valid_re => qr/^\S+$/,
+ default => "UTF-8");
+ next unless defined $encoding;
+ if (find_encoding($encoding)) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
+ my $yesno = ask(
+ sprintf(
+ __("'%s' does not appear to be a valid charset name. Use it anyway [y/N]? "),
+ $encoding),
+ valid_re => qr/^(?:y|n)/i,
+ default => "n");
+ if (defined $yesno && $yesno =~ /^y/i) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
+ }
}
if (!$force) {
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e56e0c8d77..24f6c76aee 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
email-using-8bit >stdout &&
grep "do not declare a Content-Transfer-Encoding" stdout &&
grep email-using-8bit stdout &&
- grep "Which 8bit encoding" stdout &&
+ grep "Declare which 8bit encoding to use" stdout &&
grep -E "Content|MIME" msgtxt1 >actual &&
test_cmp content-type-decl actual
'
Range-diff against v3:
1: 748bb03a00 ! 1: 37e17eac68 send-email: validate charset name in 8bit encoding prompt
@@ Commit message
validation the lenght based validation becomes redundant and also breaks
flow, so change the regex of valid input to any non blank string.
- Introduce a dedicated helper for confirmation handling that can be reused
- both by ask() and the custom 8bit prompt flow. This makes the encoding
- warning logic specific to the 8bit prompt, reduces the load on ask(), and
- improves maintainability.
+ Make the encoding warning logic specific to the 8bit prompt, also add a
+ unique confirmation prompt which reduces the load on ask(), and improves
+ maintainability.
Additionally, the wording of the first prompt can confuse the user if not
read properly or under any default assumptions for a yes/no prompt. Change
@@ Commit message
The intended flow is,
Declare which 8bit encoding to use [default: UTF-8]? foobar
- warning: 'foobar' does not appear to be a valid charset name.
- Are you sure you want to use <foobar> [y/N]?
+ <foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
[1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
@@ git-send-email.perl
Getopt::Long::Configure qw/ pass_through /;
-@@ git-send-email.perl: sub get_patch_subject {
- }
- }
-
-+sub confirm_ask {
-+ my ($resp) = @_;
-+ my $term = term();
-+ return 0
-+ unless defined $term->IN and defined fileno($term->IN) and
-+ defined $term->OUT and defined fileno($term->OUT);
-+ my $yesno = $term->readline(
-+ # TRANSLATORS: please keep [y/N] as is.
-+ sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
-+ return defined $yesno && $yesno =~ /y/i;
-+}
-+
- sub ask {
- my ($prompt, %arg) = @_;
- my $valid_re = $arg{valid_re};
-@@ git-send-email.perl: sub ask {
- return $resp;
- }
- if ($confirm_only) {
-- my $yesno = $term->readline(
-- # TRANSLATORS: please keep [y/N] as is.
-- sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
-- if (defined $yesno && $yesno =~ /y/i) {
-+ if (confirm_ask($resp)) {
- return $resp;
- }
- }
@@ git-send-email.perl: sub file_declares_8bit_cte {
foreach my $f (sort keys %broken_encoding) {
print " $f\n";
@@ git-send-email.perl: sub file_declares_8bit_cte {
- $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
- valid_re => qr/.{4}/, confirm_only => 1,
- default => "UTF-8");
-+ while(1) {
-+ my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
-+ valid_re => qr/^\S+$/,
-+ default => "UTF-8");
++ while (1) {
++ my $encoding = ask(
++ __("Declare which 8bit encoding to use [default: UTF-8]? "),
++ valid_re => qr/^\S+$/,
++ default => "UTF-8");
+ next unless defined $encoding;
+ if (find_encoding($encoding)) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
-+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
-+ if (confirm_ask($encoding)) {
++ my $yesno = ask(
++ sprintf(
++ __("'%s' does not appear to be a valid charset name. Use it anyway [y/N]? "),
++ $encoding),
++ valid_re => qr/^(?:y|n)/i,
++ default => "n");
++ if (defined $yesno && $yesno =~ /^y/i) {
+ $auto_8bit_encoding = $encoding;
+ last;
+ }
--
2.53.0.155.g748bb03a00.dirty
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH v4] send-email: validate charset name in 8bit encoding prompt
2026-02-28 11:20 ` [PATCH v4] " Shreyansh Paliwal
@ 2026-02-28 21:16 ` D. Ben Knoble
2026-03-02 16:10 ` Junio C Hamano
1 sibling, 0 replies; 27+ messages in thread
From: D. Ben Knoble @ 2026-02-28 21:16 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, gitster, philipoakley
On Sat, Feb 28, 2026 at 6:22 AM Shreyansh Paliwal
<shreyanshpaliwalcmsmn@gmail.com> wrote:
>
> When a non-ASCII character is detected in the body or subject of the email
> the user is prompted with,
>
> Which 8bit encoding should I declare [UTF-8]? foo
>
> After this the input string is validated by the regex, based on the fact
> that the charset string will be minimum 4 characters [1]. If the string is
> more than 4 letters the email is sent, if not then a second prompt to
> confirm is asked to the user,
>
> Are you sure you want to use <foo> [y/N]? y
>
> This relies on a length based regex heuristic check to validate the user
> input, and can allow clearly invalid charset names to pass if the input is
> greater than 4 characters.
>
> Add a semantic validation of the charset name using the
> Encode::find_encoding() which is a bundled module of perl. If the encoding
> is not recognized, warn the user and ask for confirmation before proceeding.
> After this validation the lenght based validation becomes redundant and also
> breaks flow, so change the regex of valid input to any non blank string.
>
> Make the encoding warning logic specific to the 8bit prompt, also add a
> unique confirmation prompt which reduces the load on ask(), and improves
> maintainability.
>
> Additionally, the wording of the first prompt can confuse the user if not
> read properly or under any default assumptions for a yes/no prompt. Change
> the wording to make it explicitly clear to the user that the prompt needs a
> string input, UTF-8 being the default.
>
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> <foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
>
> [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
>
> Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> ---
> Changes in v4:
> - removed the confirm_ask() helper and changes to ask().
> - make a new warning/confirmation prompt specific to the 8bit encoding flow.
>
> git-send-email.perl | 25 ++++++++++++++++++++++---
> t/t9001-send-email.sh | 2 +-
> 2 files changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index cd4b316ddc..3186104709 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -23,6 +23,7 @@
> use Git::LoadCPAN::Error qw(:try);
> use Git;
> use Git::I18N;
> +use Encode qw(find_encoding);
>
> Getopt::Long::Configure qw/ pass_through /;
>
> @@ -1044,9 +1045,27 @@ sub file_declares_8bit_cte {
> foreach my $f (sort keys %broken_encoding) {
> print " $f\n";
> }
> - $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
> - valid_re => qr/.{4}/, confirm_only => 1,
> - default => "UTF-8");
> + while (1) {
> + my $encoding = ask(
> + __("Declare which 8bit encoding to use [default: UTF-8]? "),
> + valid_re => qr/^\S+$/,
> + default => "UTF-8");
> + next unless defined $encoding;
> + if (find_encoding($encoding)) {
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> + my $yesno = ask(
> + sprintf(
> + __("'%s' does not appear to be a valid charset name. Use it anyway [y/N]? "),
> + $encoding),
> + valid_re => qr/^(?:y|n)/i,
> + default => "n");
> + if (defined $yesno && $yesno =~ /^y/i) {
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> + }
> }
>
> if (!$force) {
> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
> index e56e0c8d77..24f6c76aee 100755
> --- a/t/t9001-send-email.sh
> +++ b/t/t9001-send-email.sh
> @@ -1691,7 +1691,7 @@ test_expect_success $PREREQ 'asks about and fixes 8bit encodings' '
> email-using-8bit >stdout &&
> grep "do not declare a Content-Transfer-Encoding" stdout &&
> grep email-using-8bit stdout &&
> - grep "Which 8bit encoding" stdout &&
> + grep "Declare which 8bit encoding to use" stdout &&
> grep -E "Content|MIME" msgtxt1 >actual &&
> test_cmp content-type-decl actual
> '
>
> Range-diff against v3:
> 1: 748bb03a00 ! 1: 37e17eac68 send-email: validate charset name in 8bit encoding prompt
> @@ Commit message
> validation the lenght based validation becomes redundant and also breaks
> flow, so change the regex of valid input to any non blank string.
>
> - Introduce a dedicated helper for confirmation handling that can be reused
> - both by ask() and the custom 8bit prompt flow. This makes the encoding
> - warning logic specific to the 8bit prompt, reduces the load on ask(), and
> - improves maintainability.
> + Make the encoding warning logic specific to the 8bit prompt, also add a
> + unique confirmation prompt which reduces the load on ask(), and improves
> + maintainability.
>
> Additionally, the wording of the first prompt can confuse the user if not
> read properly or under any default assumptions for a yes/no prompt. Change
> @@ Commit message
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> - warning: 'foobar' does not appear to be a valid charset name.
> - Are you sure you want to use <foobar> [y/N]?
> + <foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
>
> [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
>
> @@ git-send-email.perl
>
> Getopt::Long::Configure qw/ pass_through /;
>
> -@@ git-send-email.perl: sub get_patch_subject {
> - }
> - }
> -
> -+sub confirm_ask {
> -+ my ($resp) = @_;
> -+ my $term = term();
> -+ return 0
> -+ unless defined $term->IN and defined fileno($term->IN) and
> -+ defined $term->OUT and defined fileno($term->OUT);
> -+ my $yesno = $term->readline(
> -+ # TRANSLATORS: please keep [y/N] as is.
> -+ sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
> -+ return defined $yesno && $yesno =~ /y/i;
> -+}
> -+
> - sub ask {
> - my ($prompt, %arg) = @_;
> - my $valid_re = $arg{valid_re};
> -@@ git-send-email.perl: sub ask {
> - return $resp;
> - }
> - if ($confirm_only) {
> -- my $yesno = $term->readline(
> -- # TRANSLATORS: please keep [y/N] as is.
> -- sprintf(__("Are you sure you want to use <%s> [y/N]? "), $resp));
> -- if (defined $yesno && $yesno =~ /y/i) {
> -+ if (confirm_ask($resp)) {
> - return $resp;
> - }
> - }
> @@ git-send-email.perl: sub file_declares_8bit_cte {
> foreach my $f (sort keys %broken_encoding) {
> print " $f\n";
> @@ git-send-email.perl: sub file_declares_8bit_cte {
> - $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
> - valid_re => qr/.{4}/, confirm_only => 1,
> - default => "UTF-8");
> -+ while(1) {
> -+ my $encoding = ask(__("Declare which 8bit encoding to use [default: UTF-8]? "),
> -+ valid_re => qr/^\S+$/,
> -+ default => "UTF-8");
> ++ while (1) {
> ++ my $encoding = ask(
> ++ __("Declare which 8bit encoding to use [default: UTF-8]? "),
> ++ valid_re => qr/^\S+$/,
> ++ default => "UTF-8");
> + next unless defined $encoding;
> + if (find_encoding($encoding)) {
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> -+ printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
> -+ if (confirm_ask($encoding)) {
> ++ my $yesno = ask(
> ++ sprintf(
> ++ __("'%s' does not appear to be a valid charset name. Use it anyway [y/N]? "),
> ++ $encoding),
> ++ valid_re => qr/^(?:y|n)/i,
> ++ default => "n");
> ++ if (defined $yesno && $yesno =~ /^y/i) {
> + $auto_8bit_encoding = $encoding;
> + last;
> + }
> --
> 2.53.0.155.g748bb03a00.dirty
This version looks nice to my eyes. Thanks!
--
D. Ben Knoble
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH v4] send-email: validate charset name in 8bit encoding prompt
2026-02-28 11:20 ` [PATCH v4] " Shreyansh Paliwal
2026-02-28 21:16 ` D. Ben Knoble
@ 2026-03-02 16:10 ` Junio C Hamano
2026-03-03 19:06 ` Shreyansh Paliwal
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2026-03-02 16:10 UTC (permalink / raw)
To: Shreyansh Paliwal; +Cc: git, ben.knoble, philipoakley
Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:
> Additionally, the wording of the first prompt can confuse the user if not
> read properly or under any default assumptions for a yes/no prompt. Change
> the wording to make it explicitly clear to the user that the prompt needs a
> string input, UTF-8 being the default.
>
> The intended flow is,
>
> Declare which 8bit encoding to use [default: UTF-8]? foobar
> <foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
>
> [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
>
> Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> ---
> Changes in v4:
> - removed the confirm_ask() helper and changes to ask().
> - make a new warning/confirmation prompt specific to the 8bit encoding flow.
Looking quite straight-forward. Will replace.
Shall we declare victory and mark the topic for 'next'?
Thanks.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v4] send-email: validate charset name in 8bit encoding prompt
2026-03-02 16:10 ` Junio C Hamano
@ 2026-03-03 19:06 ` Shreyansh Paliwal
0 siblings, 0 replies; 27+ messages in thread
From: Shreyansh Paliwal @ 2026-03-03 19:06 UTC (permalink / raw)
To: git; +Cc: ben.knoble, gitster, philipoakley
> > Additionally, the wording of the first prompt can confuse the user if not
> > read properly or under any default assumptions for a yes/no prompt. Change
> > the wording to make it explicitly clear to the user that the prompt needs a
> > string input, UTF-8 being the default.
> >
> > The intended flow is,
> >
> > Declare which 8bit encoding to use [default: UTF-8]? foobar
> > <foobar> does not appear to be a valid charset name. Use it anyway [y/N]?
> >
> > [1]- https://github.com/git/git/commit/852a15d748034eec87adbee73a72689c8936fb8b
> >
> > Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
> > ---
> > Changes in v4:
> > - removed the confirm_ask() helper and changes to ask().
> > - make a new warning/confirmation prompt specific to the 8bit encoding flow.
>
> Looking quite straight-forward. Will replace.
>
> Shall we declare victory and mark the topic for 'next'?
>
> Thanks.
Yup, it is good to go from my side, presumably from Ben too.
Best,
Shreyansh
^ permalink raw reply [flat|nested] 27+ messages in thread