* Apostrophe at the end of author name
From: Kacper Kornet @ 2012-06-29 12:41 UTC
To: git

I try to import some repositories into git and one of the developers has
asked his name to be presented as: Name 'Nick' <email>.
However git commit --author="Name 'Nick' <email>" strips the last
apostrophe and produces a commit authored by: Name 'Nick <email>.

Maybe the function strbuf_addstr_without_crud in ident.c should strip
the trailing apostrophe only when it removed it also from the beginning
of the string?

--
Kacper
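To make the suggestion above concrete, here is a minimal sketch of edge trimming with an optional "paired quotes" rule. It is a simplified model, not git's code: `trim_ident` and its `paired_quotes` flag are hypothetical names, the fixed output buffer is an assumption made to keep the example standalone, and the character set in `crud()` is the one shown in the ident.c patch later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as crud by ident.c at the time of this thread. */
static int crud(unsigned char c)
{
	return c <= 32 ||
	       c == '.' || c == ',' || c == ':' || c == ';' ||
	       c == '<' || c == '>' || c == '"' || c == '\\' || c == '\'';
}

/*
 * Hypothetical helper (not part of git): copy src into out with crud
 * trimmed from both ends. When paired_quotes is nonzero, a trailing
 * quote character is kept unless the same quote character was also
 * removed from the front of the string.
 */
static void trim_ident(const char *src, char *out, size_t outsz,
		       int paired_quotes)
{
	size_t len;
	int stripped_squote = 0, stripped_dquote = 0;

	assert(src && out && outsz > 0);

	/* Trim the front, remembering which quote characters were dropped. */
	while (*src && crud((unsigned char)*src)) {
		if (*src == '\'')
			stripped_squote = 1;
		if (*src == '"')
			stripped_dquote = 1;
		src++;
	}

	/* Trim the back, optionally sparing an unpaired quote. */
	len = strlen(src);
	while (len > 0 && crud((unsigned char)src[len - 1])) {
		char c = src[len - 1];

		if (paired_quotes &&
		    ((c == '\'' && !stripped_squote) ||
		     (c == '"' && !stripped_dquote)))
			break;
		len--;
	}

	/* Copy the surviving span, truncating if the buffer is too small. */
	if (len >= outsz)
		len = outsz - 1;
	memcpy(out, src, len);
	out[len] = '\0';
}
```

With `paired_quotes` set to 0 the sketch reproduces the reported result (Name 'Nick' loses its closing apostrophe); with it set to 1 the apostrophe survives, while a name that is fully wrapped in quotes is still unwrapped.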
* Re: Apostrophe at the end of author name
From: Robin H. Johnson @ 2012-06-29 17:05 UTC
To: Git Mailing List

On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> I try to import some repositories into git and one of the developers has
> asked his name to be presented as: Name 'Nick' <email>.
> However git commit --author="Name 'Nick' <email>" strips the last
> apostrophe and produces a commit authored by: Name 'Nick <email>.
>
> Maybe the function strbuf_addstr_without_crud in ident.c should strip
> the trailing apostrophe only when it removed it also from the beginning
> of the string?

Which version of Git? And is it being stripped by git, or one of the
import tools?

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
* Re: Apostrophe at the end of author name
From: Jeff King @ 2012-06-29 17:43 UTC
To: Robin H. Johnson; +Cc: Kacper Kornet, Git Mailing List

On Fri, Jun 29, 2012 at 05:05:31PM +0000, Robin H. Johnson wrote:
> On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> > I try to import some repositories into git and one of the developers has
> > asked his name to be presented as: Name 'Nick' <email>.
> > However git commit --author="Name 'Nick' <email>" strips the last
> > apostrophe and produces a commit authored by: Name 'Nick <email>.
> >
> > Maybe the function strbuf_addstr_without_crud in ident.c should strip
> > the trailing apostrophe only when it removed it also from the beginning
> > of the string?
> Which version of Git? And is it being stripped by git, or one of the
> import tools?

I'm sure it's the most recent one, as strbuf_addstr_without_crud was
only added recently (but it is a refactoring of older code which should
have the same behavior).

We had a similar complaint recently that "A.B.C. <abc@example.com>" has
its trailing dot stripped, even though the internal ones are retained.
Those stripping rules date back to very early versions of git to try to
clean up cruft from gecos or other unreliable sources. I wonder if we
are better off being a bit more liberal.

-Peff
* Re: Apostrophe at the end of author name
From: Kacper Kornet @ 2012-06-29 18:17 UTC
To: Jeff King; +Cc: Robin H. Johnson, Git Mailing List

On Fri, Jun 29, 2012 at 01:43:58PM -0400, Jeff King wrote:
> On Fri, Jun 29, 2012 at 05:05:31PM +0000, Robin H. Johnson wrote:
> > On Fri, Jun 29, 2012 at 02:41:22PM +0200, Kacper Kornet wrote:
> > > I try to import some repositories into git and one of the developers has
> > > asked his name to be presented as: Name 'Nick' <email>.
> > > However git commit --author="Name 'Nick' <email>" strips the last
> > > apostrophe and produces a commit authored by: Name 'Nick <email>.
> > > Maybe the function strbuf_addstr_without_crud in ident.c should strip
> > > the trailing apostrophe only when it removed it also from the beginning
> > > of the string?
> > Which version of Git? And is it being stripped by git, or one of the
> > import tools?

> I'm sure it's the most recent one,

Yes, it is 1.7.11

> as strbuf_addstr_without_crud was
> only added recently (but it is a refactoring of older code which should
> have the same behavior).

It depends what you call recently. It was refactored in July 2005
(commit: 6aa33f4035d5). But it looks like the previous code (before
refactoring) removed only comma, dot and semicolon from the end of the
author name.

--
Kacper
* Re: Apostrophe at the end of author name
From: Jeff King @ 2012-06-29 18:29 UTC
To: Kacper Kornet; +Cc: Robin H. Johnson, Git Mailing List

On Fri, Jun 29, 2012 at 08:17:01PM +0200, Kacper Kornet wrote:
> > as strbuf_addstr_without_crud was
> > only added recently (but it is a refactoring of older code which should
> > have the same behavior).
>
> It depends what you call recently. It was refactored in July 2005
> (commit: 6aa33f4035d5). But it looks like the previous code (before
> refactoring) removed only comma, dot and semicolon from the end of the
> author name.

I meant the name strbuf_addstr_without_crud did not exist until I added
it in c96f0c8, about a month ago. But yes, the functionality of the code
has been there since the very early days.

I'm tempted by the patch below, which would remove only the
syntactically significant meta-characters ("\n", "<", and ">"), as well
as trimming any stray whitespace at the edges. The problem is that we
don't really have a clue how many people were relying on this trimming
to clean up their names or emails, so there may be regressions for other
people.

diff --git a/ident.c b/ident.c
index 443c075..4552f8d 100644
--- a/ident.c
+++ b/ident.c
@@ -127,15 +127,8 @@ const char *ident_default_date(void)
 static int crud(unsigned char c)
 {
 	return c <= 32 ||
-		c == '.' ||
-		c == ',' ||
-		c == ':' ||
-		c == ';' ||
 		c == '<' ||
-		c == '>' ||
-		c == '"' ||
-		c == '\\' ||
-		c == '\'';
+		c == '>';
 }
 
 /*
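The effect of the patch can be seen by running the same edge trimming with the old and the proposed predicate. This is only a sketch under stated assumptions: `crud_old` and `crud_new` mirror the two versions of `crud()` in the diff, while `trim_edges`, the `crud_fn` typedef and the fixed output buffer are hypothetical and stand in for git's strbuf-based code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* crud() as it stood before the patch. */
static int crud_old(unsigned char c)
{
	return c <= 32 ||
	       c == '.' || c == ',' || c == ':' || c == ';' ||
	       c == '<' || c == '>' || c == '"' || c == '\\' || c == '\'';
}

/* crud() as proposed: whitespace/control characters, '<' and '>'. */
static int crud_new(unsigned char c)
{
	return c <= 32 || c == '<' || c == '>';
}

typedef int (*crud_fn)(unsigned char);

/*
 * Hypothetical helper (not part of git): copy src into out, dropping
 * characters matched by is_crud from the beginning and the end only.
 * Interior characters are never inspected.
 */
static void trim_edges(const char *src, char *out, size_t outsz,
		       crud_fn is_crud)
{
	size_t len;

	assert(src && out && outsz > 0 && is_crud);

	while (*src && is_crud((unsigned char)*src))
		src++;
	len = strlen(src);
	while (len > 0 && is_crud((unsigned char)src[len - 1]))
		len--;

	/* Truncate rather than overflow if the buffer is too small. */
	if (len >= outsz)
		len = outsz - 1;
	memcpy(out, src, len);
	out[len] = '\0';
}
```

Under the old predicate "A.B.C." loses its final dot; under the proposed one it is kept, and so is the closing apostrophe in Name 'Nick', while surrounding whitespace is still removed.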
* Re: Apostrophe at the end of author name
From: Junio C Hamano @ 2012-06-29 19:04 UTC
To: Jeff King; +Cc: Kacper Kornet, Robin H. Johnson, Git Mailing List

Jeff King <peff@peff.net> writes:

> I'm tempted by the patch below, which would remove only the
> syntactically significant meta-characters ("\n", "<", and ">"), as well
> as trimming any stray whitespace at the edges. The problem is that we
> don't really have a clue how many people were relying on this trimming
> to clean up their names or emails, so there may be regressions for other
> people.

What do you exactly mean by "syntactically significant"? In other
words, "whose syntax"?

The code with the patch will leave "." out of the crud, so with

    spearce:*:1000:1000:Shawn O. Pearce:/home/spearce:/bin/sh

we would get:

    From: Shawn O. Pearce <spearce@spearce.org>

without dropping the "." in the name. Your MTA would likely reject it.

I think that quoting "syntactically significant meta-characters" in
the context of e-mail headers is a job for the MSA, and the human
readable names in GIT_AUTHOR_NAME should allow any reasonable
character. And I agree that it is a sane definition of "reasonable"
to exclude "\n", "<", and ">" (and nothing else), as they are the
only "syntactically significant" ones in the context of commit object
header.

The patch goes in the right direction in that sense, but you need to
make sure that git-send-email and git-imap-send (the only two MSA we
ship) do the right thing when fed names with ".", dq, etc. first.

> diff --git a/ident.c b/ident.c
> index 443c075..4552f8d 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -127,15 +127,8 @@ const char *ident_default_date(void)
>  static int crud(unsigned char c)
>  {
>  	return c <= 32 ||
> -		c == '.' ||
> -		c == ',' ||
> -		c == ':' ||
> -		c == ';' ||
>  		c == '<' ||
> -		c == '>' ||
> -		c == '"' ||
> -		c == '\\' ||
> -		c == '\'';
> +		c == '>';
>  }
>
>  /*
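The header quoting discussed above can be illustrated with a short sketch. It assumes the RFC 5322 rule that a display name containing any of the "specials" must be written as a quoted-string, with `"` and `\` backslash-escaped. `quote_display_name` and its helpers are hypothetical names, not functions from git, and names with non-ASCII characters (which would need RFC 2047 encoding) are outside the scope of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* RFC 5322 "specials": not allowed in an unquoted display name. */
static int is_rfc5322_special(char c)
{
	return c != '\0' && strchr("()<>[]:;@\\,.\"", c) != NULL;
}

static int needs_quoting(const char *name)
{
	for (; *name; name++)
		if (is_rfc5322_special(*name))
			return 1;
	return 0;
}

/*
 * Hypothetical helper (not part of git): write name into out, wrapped
 * in double quotes if it contains any special character. Returns 0 on
 * success, -1 if out is too small.
 */
static int quote_display_name(const char *name, char *out, size_t outsz)
{
	size_t n = 0;

	assert(name && out);

	if (!needs_quoting(name)) {
		size_t len = strlen(name);

		if (len + 1 > outsz)
			return -1;
		memcpy(out, name, len + 1);
		return 0;
	}

	if (outsz < 3)
		return -1;
	out[n++] = '"';
	for (; *name; name++) {
		int esc = (*name == '"' || *name == '\\');

		/* Room for optional backslash, the char, closing quote, NUL. */
		if (n + esc + 3 > outsz)
			return -1;
		if (esc)
			out[n++] = '\\';
		out[n++] = *name;
	}
	out[n++] = '"';
	out[n] = '\0';
	return 0;
}
```

With this rule, Shawn O. Pearce becomes "Shawn O. Pearce" in a From: header, while a name such as Name 'Nick' passes through unchanged, since the apostrophe is not one of the specials.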
* Re: Apostrophe at the end of author name
From: Jeff King @ 2012-06-29 19:35 UTC
To: Junio C Hamano; +Cc: Kacper Kornet, Robin H. Johnson, Git Mailing List

On Fri, Jun 29, 2012 at 12:04:18PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I'm tempted by the patch below, which would remove only the
> > syntactically significant meta-characters ("\n", "<", and ">"), as well
> > as trimming any stray whitespace at the edges. The problem is that we
> > don't really have a clue how many people were relying on this trimming
> > to clean up their names or emails, so there may be regressions for other
> > people.
>
> What do you exactly mean by "syntactically significant"? In other
> words, "whose syntax"?

I meant syntactically significant to git's ident lines. I.e., "<", ">",
and "\n". If the identities are going to be put somewhere else with
different syntax (e.g., an rfc822 header), then they would obviously
need to be quoted (and that is no different than the current state), but
that should happen elsewhere.

> The code with the patch will leave "." out of the crud, so with
>
>     spearce:*:1000:1000:Shawn O. Pearce:/home/spearce:/bin/sh
>
> we would get:
>
>     From: Shawn O. Pearce <spearce@spearce.org>
>
> without dropping the "." in the name. Your MTA would likely reject it.

But that is already the case. The crud() function is checked only for
the beginning and end of each item. Which is why you get senseless
outcomes like "A.B.C." turning into "A.B.C".

AFAICT, the motivation for most items in the crud function is purely
about "this is junk that we might find in a gecos field and should be
stripped to make the name prettier". I.e., they are heuristics, and we
now have two reports of those heuristics being wrong.

My concern is that those heuristics are sometimes _right_, and are
helping people. But we don't know how often, and I suspect there is no
way to know without changing it and waiting for people to scream, which
does not excite me.

The only example I could come up with by thinking is that we probably
_do_ want to strip a trailing dot from an email address (e.g., some
people will express a hostname as "example.com." to indicate that it is
fully qualified, but it is typically omitted from an email address).
Handling that would involve splitting the heuristics for names and
emails.

> I think that quoting "syntactically significant meta-characters" in
> the context of e-mail headers is a job for the MSA, and the human
> readable names in GIT_AUTHOR_NAME should allow any reasonable
> character. And I agree that it is a sane definition of "reasonable"
> to exclude "\n", "<", and ">" (and nothing else), as they are the
> only "syntactically significant" ones in the context of commit object
> header.
>
> The patch goes in the right direction in that sense, but you need to
> make sure that git-send-email and git-imap-send (the only two MSA we
> ship) do the right thing when fed names with ".", dq, etc. first.

Actually, it is format-patch where the quoting should go, as it is the
thing that puts the ident in the rfc822 header. And indeed, it already
does so (as it must, because we _do_ allow "." as in "Shawn O. Pearce",
and have always done so).

-Peff
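Splitting the heuristics for names and emails, as floated above, might look roughly like the following. This is a sketch of one possible shape and not anything proposed as a patch in the thread: `ident_delim`, `trim_span`, `trim_name` and `trim_email` are all hypothetical names, and the choice to return a pointer plus length instead of copying is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal set from the proposed patch: whitespace/control, '<', '>'. */
static int ident_delim(unsigned char c)
{
	return c <= 32 || c == '<' || c == '>';
}

/*
 * Hypothetical helper (not part of git): locate the trimmed span of s.
 * Stores the first kept character in *start and returns the span
 * length. If strip_trailing_dots is nonzero, dots at the end are
 * dropped as well; leading and interior dots are left alone.
 */
static size_t trim_span(const char *s, const char **start,
			int strip_trailing_dots)
{
	size_t len;

	assert(s && start);

	while (*s && ident_delim((unsigned char)*s))
		s++;
	len = strlen(s);
	while (len > 0) {
		unsigned char c = (unsigned char)s[len - 1];

		if (!ident_delim(c) && !(strip_trailing_dots && c == '.'))
			break;
		len--;
	}
	*start = s;
	return len;
}

/* Names keep their trailing dot ("A.B.C." stays as written). */
static size_t trim_name(const char *s, const char **start)
{
	return trim_span(s, start, 0);
}

/* Emails drop it ("abc@example.com." loses the root-label dot). */
static size_t trim_email(const char *s, const char **start)
{
	return trim_span(s, start, 1);
}
```

The two wrappers share all of the edge handling and differ only in the one rule that the thread identifies as email-specific, so a name like "A.B.C." is preserved while "abc@example.com." is normalized.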
* Re: Apostrophe at the end of author name
From: Kacper Kornet @ 2012-06-29 19:59 UTC
To: Jeff King; +Cc: Robin H. Johnson, Git Mailing List

On Fri, Jun 29, 2012 at 02:29:44PM -0400, Jeff King wrote:
> On Fri, Jun 29, 2012 at 08:17:01PM +0200, Kacper Kornet wrote:
> > > as strbuf_addstr_without_crud was
> > > only added recently (but it is a refactoring of older code which should
> > > have the same behavior).
> > It depends what you call recently. It was refactored in July 2005
> > (commit: 6aa33f4035d5). But it looks like the previous code (before
> > refactoring) removed only comma, dot and semicolon from the end of the
> > author name.
> I meant the name strbuf_addstr_without_crud did not exist until I added
> it in c96f0c8, about a month ago. But yes, the functionality of the code
> has been there since the very early days.
> I'm tempted by the patch below, which would remove only the
> syntactically significant meta-characters ("\n", "<", and ">"), as well
> as trimming any stray whitespace at the edges. The problem is that we
> don't really have a clue how many people were relying on this trimming
> to clean up their names or emails, so there may be regressions for other
> people.

So maybe an option to enable/disable the old behaviour should be added.

--
Kacper