* Apostrophe at the end of author name
@ 2012-06-29 12:41 Kacper Kornet
2012-06-29 17:05 ` Robin H. Johnson
0 siblings, 1 reply; 8+ messages in thread
From: Kacper Kornet @ 2012-06-29 12:41 UTC (permalink / raw)
To: git
I am trying to import some repositories into git and one of the developers has
asked for his name to be presented as: Name 'Nick' <email>.
However git commit --author="Name 'Nick' <email>" strips the last
apostrophe and produces a commit authored by: Name 'Nick <email>.
Maybe the function strbuf_addstr_without_crud in ident.c should strip
the trailing apostrophe only when it removed it also from the beginning
of the string?
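
A minimal sketch of the rule proposed above, not the code in ident.c. It assumes the crud character set that ident.c uses in this version of git, and the helper name trim_paired_quote is hypothetical. Edge characters are trimmed as before, except that a trailing apostrophe survives unless a leading apostrophe was also removed:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as junk at the edges of a name (assumed set). */
static int crud(unsigned char c)
{
	return c <= 32 || c == '.' || c == ',' || c == ':' || c == ';' ||
	       c == '<' || c == '>' || c == '"' || c == '\\' || c == '\'';
}

/*
 * Hypothetical helper: copy src into dst (capacity dstlen) with edge crud
 * removed, but drop a trailing apostrophe only if a leading one was dropped.
 */
static void trim_paired_quote(char *dst, size_t dstlen, const char *src)
{
	int stripped_leading_quote = 0;
	size_t len;

	if (dstlen == 0)
		return;

	/* Skip leading crud, remembering whether an apostrophe was skipped. */
	while (*src && crud((unsigned char)*src)) {
		if (*src == '\'')
			stripped_leading_quote = 1;
		src++;
	}

	/* Drop trailing crud, but stop at an unpaired apostrophe. */
	len = strlen(src);
	while (len > 0 && crud((unsigned char)src[len - 1])) {
		if (src[len - 1] == '\'' && !stripped_leading_quote)
			break;
		len--;
	}

	if (len >= dstlen)
		len = dstlen - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}
```

With this rule "Name 'Nick'" is kept intact, while a fully quoted "'Name'" still loses both apostrophes.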
--
Kacper
* Re: Apostrophe at the end of author name
2012-06-29 12:41 Apostrophe at the end of author name Kacper Kornet
@ 2012-06-29 17:05 ` Robin H. Johnson
2012-06-29 17:43 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Robin H. Johnson @ 2012-06-29 17:05 UTC (permalink / raw)
To: Git Mailing List
On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> I am trying to import some repositories into git and one of the developers has
> asked for his name to be presented as: Name 'Nick' <email>.
> However git commit --author="Name 'Nick' <email>" strips the last
> apostrophe and produces a commit authored by: Name 'Nick <email>.
>
> Maybe the function strbuf_addstr_without_crud in ident.c should strip
> the trailing apostrophe only when it removed it also from the beginning
> of the string?
Which version of Git? And is it being stripped by git, or one of the
import tools?
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
* Re: Apostrophe at the end of author name
2012-06-29 17:05 ` Robin H. Johnson
@ 2012-06-29 17:43 ` Jeff King
2012-06-29 18:17 ` Kacper Kornet
0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2012-06-29 17:43 UTC (permalink / raw)
To: Robin H. Johnson; +Cc: Kacper Kornet, Git Mailing List
On Fri, Jun 29, 2012 at 05:05:31PM +0000, Robin H. Johnson wrote:
> On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> > I am trying to import some repositories into git and one of the developers has
> > asked for his name to be presented as: Name 'Nick' <email>.
> > However git commit --author="Name 'Nick' <email>" strips the last
> > apostrophe and produces a commit authored by: Name 'Nick <email>.
> >
> > Maybe the function strbuf_addstr_without_crud in ident.c should strip
> > the trailing apostrophe only when it removed it also from the beginning
> > of the string?
> Which version of Git? And is it being stripped by git, or one of the
> import tools?
I'm sure it's the most recent one, as strbuf_addstr_without_crud was
only added recently (but it is a refactoring of older code which should
have the same behavior). We had a similar complaint recently that
"A.B.C. <abc@example.com>" has its trailing dot stripped, even though
the internal ones are retained.
Those stripping rules date back to very early versions of git to try to
clean up cruft from gecos or other unreliable sources. I wonder if we
are better off being a bit more liberal.
-Peff
* Re: Apostrophe at the end of author name
2012-06-29 17:43 ` Jeff King
@ 2012-06-29 18:17 ` Kacper Kornet
2012-06-29 18:29 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Kacper Kornet @ 2012-06-29 18:17 UTC (permalink / raw)
To: Jeff King; +Cc: Robin H. Johnson, Git Mailing List
On Fri, Jun 29, 2012 at 01:43:58PM -0400, Jeff King wrote:
> On Fri, Jun 29, 2012 at 05:05:31PM +0000, Robin H. Johnson wrote:
> > On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> > > I am trying to import some repositories into git and one of the developers has
> > > asked for his name to be presented as: Name 'Nick' <email>.
> > > However git commit --author="Name 'Nick' <email>" strips the last
> > > apostrophe and produces a commit authored by: Name 'Nick <email>.
> > > Maybe the function strbuf_addstr_without_crud in ident.c should strip
> > > the trailing apostrophe only when it removed it also from the beginning
> > > of the string?
> > Which version of Git? And is it being stripped by git, or one of the
> > import tools?
> I'm sure it's the most recent one,
Yes, it is 1.7.11
> as strbuf_addstr_without_crud was
> only added recently (but it is a refactoring of older code which should
> have the same behavior).
It depends on what you call recently. It was refactored in July 2005
(commit: 6aa33f4035d5). But it looks like the previous code (before
refactoring) removed only comma, dot and semicolon from the end of the
author name.
--
Kacper
* Re: Apostrophe at the end of author name
2012-06-29 18:17 ` Kacper Kornet
@ 2012-06-29 18:29 ` Jeff King
2012-06-29 19:04 ` Junio C Hamano
2012-06-29 19:59 ` Kacper Kornet
0 siblings, 2 replies; 8+ messages in thread
From: Jeff King @ 2012-06-29 18:29 UTC (permalink / raw)
To: Kacper Kornet; +Cc: Robin H. Johnson, Git Mailing List
On Fri, Jun 29, 2012 at 08:17:01PM +0200, Kacper Kornet wrote:
> > as strbuf_addstr_without_crud was
> > only added recently (but it is a refactoring of older code which should
> > have the same behavior).
>
> It depends on what you call recently. It was refactored in July 2005
> (commit: 6aa33f4035d5). But it looks like the previous code (before
> refactoring) removed only comma, dot and semicolon from the end of the
> author name.
I meant the name strbuf_addstr_without_crud did not exist until I added
it in c96f0c8, about a month ago. But yes, the functionality of the code
has been there since the very early days.
I'm tempted by the patch below, which would remove only the
syntactically significant meta-characters ("\n", "<", and ">"), as well
as trimming any stray whitespace at the edges. The problem is that we
don't really have a clue how many people were relying on this trimming
to clean up their names or emails, so there may be regressions for other
people.
diff --git a/ident.c b/ident.c
index 443c075..4552f8d 100644
--- a/ident.c
+++ b/ident.c
@@ -127,15 +127,8 @@ const char *ident_default_date(void)
 static int crud(unsigned char c)
 {
 	return c <= 32 ||
-		c == '.' ||
-		c == ',' ||
-		c == ':' ||
-		c == ';' ||
 		c == '<' ||
-		c == '>' ||
-		c == '"' ||
-		c == '\\' ||
-		c == '\'';
+		c == '>';
 }
 
 /*
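
To make the effect of the patch concrete, here is a rough stand-alone sketch that trims the edges of a string with either the old or the patched crud set. The names crud_old, crud_new and trim_edges are made up for illustration; the real code works on a strbuf and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*crud_fn)(unsigned char);

/* Pre-image of the patch: punctuation and quotes count as crud. */
static int crud_old(unsigned char c)
{
	return c <= 32 || c == '.' || c == ',' || c == ':' || c == ';' ||
	       c == '<' || c == '>' || c == '"' || c == '\\' || c == '\'';
}

/* Post-image: only control characters, whitespace, '<' and '>'. */
static int crud_new(unsigned char c)
{
	return c <= 32 || c == '<' || c == '>';
}

/* Hypothetical helper: copy src into dst with crud trimmed from both ends. */
static void trim_edges(char *dst, size_t dstlen, const char *src,
		       crud_fn is_crud)
{
	size_t len;

	if (dstlen == 0)
		return;
	while (*src && is_crud((unsigned char)*src))
		src++;
	len = strlen(src);
	while (len > 0 && is_crud((unsigned char)src[len - 1]))
		len--;
	if (len >= dstlen)
		len = dstlen - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}
```

Under the old set both "Name 'Nick'" and "A.B.C." lose their last character; under the patched set they pass through unchanged, and only surrounding whitespace and angle brackets are dropped.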
* Re: Apostrophe at the end of author name
2012-06-29 18:29 ` Jeff King
@ 2012-06-29 19:04 ` Junio C Hamano
2012-06-29 19:35 ` Jeff King
2012-06-29 19:59 ` Kacper Kornet
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2012-06-29 19:04 UTC (permalink / raw)
To: Jeff King; +Cc: Kacper Kornet, Robin H. Johnson, Git Mailing List
Jeff King <peff@peff.net> writes:
> I'm tempted by the patch below, which would remove only the
> syntactically significant meta-characters ("\n", "<", and ">"), as well
> as trimming any stray whitespace at the edges. The problem is that we
> don't really have a clue how many people were relying on this trimming
> to clean up their names or emails, so there may be regressions for other
> people.
What exactly do you mean by "syntactically significant"? In other
words, "whose syntax"?
The code with the patch will leave "." out of the crud, so with
spearce:*:1000:1000:Shawn O. Pearce:/home/spearce:/bin/sh
we would get:
From: Shawn O. Pearce <spearce@spearce.org>
without dropping the "." in the name. Your MTA would likely
reject it.
I think that quoting "syntactically significant meta-characters" in
the context of e-mail headers is a job for the MSA, and the human
readable names in GIT_AUTHOR_NAME should allow any reasonable
character. And I agree that it is a sane definition of "reasonable"
to exclude "\n", "<", and ">" (and nothing else), as they are the
only characters that are "syntactically significant" in the context
of a commit object header.
The patch goes in the right direction in that sense, but you need to
make sure that git-send-email and git-imap-send (the only two MSAs we
ship) do the right thing when fed names with ".", double quotes, etc. first.
> diff --git a/ident.c b/ident.c
> index 443c075..4552f8d 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -127,15 +127,8 @@ const char *ident_default_date(void)
>  static int crud(unsigned char c)
>  {
>  	return c <= 32 ||
> -		c == '.' ||
> -		c == ',' ||
> -		c == ':' ||
> -		c == ';' ||
>  		c == '<' ||
> -		c == '>' ||
> -		c == '"' ||
> -		c == '\\' ||
> -		c == '\'';
> +		c == '>';
>  }
>
>  /*
* Re: Apostrophe at the end of author name
2012-06-29 19:04 ` Junio C Hamano
@ 2012-06-29 19:35 ` Jeff King
0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2012-06-29 19:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kacper Kornet, Robin H. Johnson, Git Mailing List
On Fri, Jun 29, 2012 at 12:04:18PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I'm tempted by the patch below, which would remove only the
> > syntactically significant meta-characters ("\n", "<", and ">"), as well
> > as trimming any stray whitespace at the edges. The problem is that we
> > don't really have a clue how many people were relying on this trimming
> > to clean up their names or emails, so there may be regressions for other
> > people.
>
> What exactly do you mean by "syntactically significant"? In other
> words, "whose syntax"?
I meant syntactically significant to git's ident lines. I.e., "<", ">",
and "\n". If the identities are going to be put somewhere else with
different syntax (e.g., an rfc822 header), then they would obviously
need to be quoted (and that is no different than the current state), but
that should happen elsewhere.
> The code with the patch will leave "." out of the crud, so with
>
> spearce:*:1000:1000:Shawn O. Pearce:/home/spearce:/bin/sh
>
> we would get:
>
> From: Shawn O. Pearce <spearce@spearce.org>
>
> without dropping the "." in the name. Your MTA would likely
> reject it.
But that is already the case. The crud() function is checked only for
the beginning and end of each item. Which is why you get senseless
outcomes like "A.B.C." turning into "A.B.C". AFAICT, the motivation for
most items in the crud function is purely about "this is junk that we
might find in a gecos field and should be stripped to make the name
prettier". I.e., they are heuristics, and we now have two reports of
those heuristics being wrong.
My concern is that those heuristics are sometimes _right_, and are
helping people. But we don't know how often, and I suspect there is no
way to know without changing it and waiting for people to scream, which
does not excite me.
The only example I could come up with by thinking is that we probably
_do_ want to strip a trailing dot from an email address (e.g., some
people will express a hostname as "example.com." to indicate that it is
fully qualified, but it is typically omitted from an email address).
Handling that would involve splitting the heuristics for names and
emails.
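
As a rough illustration of what splitting the heuristics could look like (purely a sketch; the enum and function names are invented, and ident.c currently has a single crud() for both fields), the email side keeps treating a dot at the edge as junk while the name side does not:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: which part of the ident line is being cleaned up. */
enum ident_part { IDENT_NAME, IDENT_EMAIL };

static int crud_for(enum ident_part part, unsigned char c)
{
	/* Whitespace, control characters and angle brackets are always junk. */
	if (c <= 32 || c == '<' || c == '>')
		return 1;
	/* Assumption: only emails treat '.' at the edges as junk. */
	return part == IDENT_EMAIL && c == '.';
}

/* Copy src into dst with the per-part crud trimmed from both ends. */
static void trim_part(char *dst, size_t dstlen, const char *src,
		      enum ident_part part)
{
	size_t len;

	if (dstlen == 0)
		return;
	while (*src && crud_for(part, (unsigned char)*src))
		src++;
	len = strlen(src);
	while (len > 0 && crud_for(part, (unsigned char)src[len - 1]))
		len--;
	if (len >= dstlen)
		len = dstlen - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}
```

So "abc@example.com." would be cleaned to "abc@example.com", while a name such as "A.B.C." keeps its trailing dot.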
> I think that quoting "syntactically significant meta-characters" in
> the context of e-mail headers is a job for the MSA, and the human
> readable names in GIT_AUTHOR_NAME should allow any reasonable
> character. And I agree that it is a sane definition of "reasonable"
> to exclude "\n", "<", and ">" (and nothing else), as they are the
> only characters that are "syntactically significant" in the context
> of a commit object header.
>
> The patch goes in the right direction in that sense, but you need to
> make sure that git-send-email and git-imap-send (the only two MSAs we
> ship) do the right thing when fed names with ".", double quotes, etc. first.
Actually, it is format-patch where the quoting should go, as it is the
thing that puts the ident in the rfc822 header. And indeed, it already
does so (as it must, because we _do_ allow "." as in "Shawn O. Pearce",
and have always done so).
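
For reference, the kind of quoting meant here can be sketched as follows. This is not format-patch's code; quote_display_name and needs_quoting are hypothetical helpers that only cover the quoted-string rule for RFC 822 "specials" (no line folding, no encoding of non-ASCII names):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* RFC 822 "specials": a display name containing any of these needs quoting. */
static int needs_quoting(const char *name)
{
	return strpbrk(name, "()<>[]:;@\\,.\"") != NULL;
}

/*
 * Hypothetical helper: write name into dst as an RFC 822 phrase, wrapping
 * it in double quotes and backslash-escaping '"' and '\\' when needed.
 * Returns 0 on success, -1 if dst is too small.
 */
static int quote_display_name(char *dst, size_t dstlen, const char *name)
{
	size_t n = 0;
	int quote = needs_quoting(name);

	if (dstlen == 0)
		return -1;
	if (quote) {
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = '"';
	}
	for (; *name; name++) {
		if (quote && (*name == '"' || *name == '\\')) {
			if (n + 1 >= dstlen)
				return -1;
			dst[n++] = '\\';
		}
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = *name;
	}
	if (quote) {
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = '"';
	}
	dst[n] = '\0';
	return 0;
}
```

Note that an apostrophe is not an RFC 822 special, so a name like "Name 'Nick'" would go into the header unquoted, whereas "Shawn O. Pearce" gets wrapped in double quotes because of the dot.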
-Peff
* Re: Apostrophe at the end of author name
2012-06-29 18:29 ` Jeff King
2012-06-29 19:04 ` Junio C Hamano
@ 2012-06-29 19:59 ` Kacper Kornet
1 sibling, 0 replies; 8+ messages in thread
From: Kacper Kornet @ 2012-06-29 19:59 UTC (permalink / raw)
To: Jeff King; +Cc: Robin H. Johnson, Git Mailing List
On Fri, Jun 29, 2012 at 02:29:44PM -0400, Jeff King wrote:
> On Fri, Jun 29, 2012 at 08:17:01PM +0200, Kacper Kornet wrote:
> > > as strbuf_addstr_without_crud was
> > > only added recently (but it is a refactoring of older code which should
> > > have the same behavior).
> > It depends on what you call recently. It was refactored in July 2005
> > (commit: 6aa33f4035d5). But it looks like the previous code (before
> > refactoring) removed only comma, dot and semicolon from the end of the
> > author name.
> I meant the name strbuf_addstr_without_crud did not exist until I added
> it in c96f0c8, about a month ago. But yes, the functionality of the code
> has been there since the very early days.
> I'm tempted by the patch below, which would remove only the
> syntactically significant meta-characters ("\n", "<", and ">"), as well
> as trimming any stray whitespace at the edges. The problem is that we
> don't really have a clue how many people were relying on this trimming
> to clean up their names or emails, so there may be regressions for other
> people.
So maybe the option to enable/disable the old behaviour should be added.
--
Kacper