* t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
@ 2012-05-24 12:01 Michael Haggerty
2012-05-24 17:16 ` Jeff King
0 siblings, 1 reply; 16+ messages in thread
From: Michael Haggerty @ 2012-05-24 12:01 UTC (permalink / raw)
To: Jeff King; +Cc: git discussion list
On my setup, the above commit causes 12 tests in t4014 to fail. For
example, test 25:
> expecting success:
> check_threading expect.thread --thread master
>
> --- expect.thread 2012-05-24 11:57:26.841337237 +0000
> +++ actual 2012-05-24 11:57:26.853337255 +0000
> @@ -1,10 +1,10 @@
> ---
> -Message-Id: <0>
> +Message-Id: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
> ---
> -Message-Id: <1>
> -In-Reply-To: <0>
> -References: <0>
> +Message-Id: <fd00575a8382ce27c62b83730a40bcff1dc2f25f.1337860646.git.mhagger@michael.(none)>
> +In-Reply-To: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
> +References: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
> ---
> -Message-Id: <2>
> -In-Reply-To: <0>
> -References: <0>
> +Message-Id: <18ed22aae56367787c36a882bd61281e07994f11.1337860646.git.mhagger@michael.(none)>
> +In-Reply-To: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
> +References: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
> not ok - 25 thread
Beyond bisecting, I haven't looked into this problem further. Let me
know if you need more info.
Michael
--
Michael Haggerty
Head of Software Development
JPK Instruments AG
Bouchéstr. 12
12435 Berlin, Germany
tel: +49 30 5331 12070
fax: +49 30 5331 22555
mail: haggerty@jpk.com
web: www.jpk.com, www.nanobioviews.net
Aufsichtsrat: Dr. Franz-Ferdinand von Falkenhausen (Vorsitzender)
Vorstand: Torsten Jähnke, Frank Pelzer, Jörn Kamps, René Grünberg
JPKinstruments Aktiengesellschaft
Amtsgericht Berlin-Charlottenburg
HRB 75513
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 12:01 t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids Michael Haggerty
@ 2012-05-24 17:16 ` Jeff King
2012-05-24 20:07 ` Junio C Hamano
2012-05-24 20:49 ` Michael Haggerty
0 siblings, 2 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 17:16 UTC (permalink / raw)
To: Michael Haggerty; +Cc: Junio C Hamano, git discussion list
On Thu, May 24, 2012 at 02:01:57PM +0200, Michael Haggerty wrote:
> On my setup, the above commit causes 12 tests in t4014 to fail. For
> example, test 25:
>
> >-Message-Id: <0>
> >+Message-Id: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
Thanks for the report. I know exactly what the issue is, as it came up
in the discussion of the original series. 43ae9f47ab stopped using
git_committer_info (which looks at $GIT_COMMITTER_EMAIL) for the end of
the message-id and started using the default-generated email directly.
Nobody should care, because either:
1. The defaults set up a reasonable hostname for your machine.
2. They do not, but you adjust it by setting user.email. Otherwise,
your author ident would have this bogus email in it.
The only setup that _would_ care is if the generated default is bogus
and you set $GIT_COMMITTER_EMAIL in the environment and relied on that
to get a sane value. Which is exactly what the test environment does.
The question is, is what it is doing sane and something we should care
about? Or is the test broken (it fails to parse the message-id that
contains ".(none)", but I am not even sure that is intentional and not
simply lazy regex writing in the test).
I suspect that is not especially sane, but at the same time, it is not
hard to support. The patch below (on top of jk/ident-gecos-strbuf)
should fix it.
-- >8 --
Subject: format-patch: use GIT_COMMITTER_EMAIL when making message ids
Before commit 43ae9f4, we generated the tail of a message id
by calling git_committer_info and parsing the email out of
the result. 43ae9f4 changed to use ident_default_email
directly, so we didn't have to bother with parsing. As a
side effect, it meant we no longer used GIT_COMMITTER_EMAIL
at all.
In general, this is probably reasonable behavior. Either the
default email is sane on your system, or you are using
user.email to provide something sane. The exception is if
you rely on GIT_COMMITTER_EMAIL being set all the time to
override the bogus generated email.
This is unlikely to match anybody's real-life setup, but we
do use it in the test environment. And furthermore, it's
what we have always done, and the change in 43ae9f4 was
about cleaning up, not fixing any bug; we should be
conservative and keep the behavior identical.
Signed-off-by: Jeff King <peff@peff.net>
---
Note that we check the environment outside of the usual strbuf_trim that
happens to the default email. And outside of fmt_ident, which trims
whitespace, as well. So compared to the state before this series,
something like:
GIT_COMMITTER_EMAIL="$(printf 'foo@bar\n')" git format-patch ...
is now broken. It also strikes me as a little ugly that this code path
needs to care about $GIT_COMMITTER_EMAIL at all. I can rework the ident
interface to provide a more sanitized broken-down version of the ident
if we care.
builtin/log.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/builtin/log.c b/builtin/log.c
index 8010a40..3f1883c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -739,8 +739,11 @@ static void get_patch_ids(struct rev_info *rev, struct patch_ids *ids, const cha
static void gen_message_id(struct rev_info *info, char *base)
{
struct strbuf buf = STRBUF_INIT;
+ const char *email = getenv("GIT_COMMITTER_EMAIL");
+ if (!email)
+ email = ident_default_email();
strbuf_addf(&buf, "%s.%lu.git.%s", base,
- (unsigned long) time(NULL), ident_default_email());
+ (unsigned long) time(NULL), email);
info->message_id = strbuf_detach(&buf, NULL);
}
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 17:16 ` Jeff King
@ 2012-05-24 20:07 ` Junio C Hamano
2012-05-24 20:15 ` Jeff King
2012-05-24 20:49 ` Michael Haggerty
1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-05-24 20:07 UTC (permalink / raw)
To: Jeff King; +Cc: Michael Haggerty, Junio C Hamano, git discussion list
Jeff King <peff@peff.net> writes:
> On Thu, May 24, 2012 at 02:01:57PM +0200, Michael Haggerty wrote:
>
>> On my setup, the above commit causes 12 tests in t4014 to fail. For
>> example, test 25:
>>
>> >-Message-Id: <0>
>> >+Message-Id: <1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
>
> Thanks for the report. I know exactly what the issue is, as it came up
> in the discussion of the original series. 43ae9f47ab stopped using
> git_committer_info (which looks at $GIT_COMMITTER_EMAIL) for the end of
> the message-id and started using the default-generated email directly.
>
> Nobody should care, because either:
>
> 1. The defaults set up a reasonable hostname for your machine.
>
> 2. They do not, but you adjust it by setting user.email. Otherwise,
> your author ident would have this bogus email in it.
>
> The only setup that _would_ care is if the generated default is bogus
> and you set $GIT_COMMITTER_EMAIL in the environment and relied on that
> to get a sane value. Which is exactly what the test environment does.
Or they worked to create their series in a good machine, pull it down to
another machine during their lunch break, and run format-patch to send it
out after the final eyeballing. Perhaps they are not supposed to be
working on the project in question during the day at work, so the work
machine does not have user.email set up correctly yet.
> The question is, is what it is doing sane and something we should care
> about? Or is the test broken (it fails to parse the message-id that
> contains ".(none)", but I am not even sure that is intentional and not
> simply lazy regex writing in the test).
I doubt that it was carefully written to try to exclude ".(none)".
It somewhat curious---it seems to want to grab everything after "<" up to
the first occurrence of ">"---why isn't this pattern matching?
> This is unlikely to match anybody's real-life setup, but we do use it in
> the test environment. And furthermore, it's what we have always done,
> and the change in 43ae9f4 was about cleaning up, not fixing any bug; we
> should be conservative and keep the behavior identical.
I agree with the last sentence. I wouldn't be surprised if somebody was
indexing patch e-mails on the list, keyed with their Message-Id: field,
and using the key as something more than just a random "uniqueness" token.
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Note that we check the environment outside of the usual strbuf_trim that
> happens to the default email. And outside of fmt_ident, which trims
> whitespace, as well. So compared to the state before this series,
> something like:
>
> GIT_COMMITTER_EMAIL="$(printf 'foo@bar\n')" git format-patch ...
>
> is now broken.
True, and setting GIT_COMMITTER_EMAIL to that value will not break other
uses because they trim. But as we discussed before, we do not strongly
stop users from deliberately adding unnecessary whitespaces around names
and e-mails that they themselves configure (i.e. user.email and these
environment variables), so it probably is not a huge deal.
> It also strikes me as a little ugly that this code path
> needs to care about $GIT_COMMITTER_EMAIL at all.
Do you mean "why committer and not author"? It primarily is because we
want to see "who is this person who wants a unique token tied to his
identity" and author and committer ident are both equally reasonable
choices. But we have picked to use committer in these cases long time
ago.
If you mean "why environment and not an API call?", then I would have to
agree. ident_committer_email() call, that returns a sanitized version,
would have been a natural way to write this, if it were available.
> I can rework the ident interface to provide a more sanitized broken-down
> version of the ident if we care.
That is a healthy thing to do in the longer term, I would guess.
> builtin/log.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 8010a40..3f1883c 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -739,8 +739,11 @@ static void get_patch_ids(struct rev_info *rev, struct patch_ids *ids, const cha
> static void gen_message_id(struct rev_info *info, char *base)
> {
> struct strbuf buf = STRBUF_INIT;
> + const char *email = getenv("GIT_COMMITTER_EMAIL");
> + if (!email)
> + email = ident_default_email();
> strbuf_addf(&buf, "%s.%lu.git.%s", base,
> - (unsigned long) time(NULL), ident_default_email());
> + (unsigned long) time(NULL), email);
> info->message_id = strbuf_detach(&buf, NULL);
> }
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 20:07 ` Junio C Hamano
@ 2012-05-24 20:15 ` Jeff King
2012-05-24 23:25 ` Jeff King
0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2012-05-24 20:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
On Thu, May 24, 2012 at 01:07:03PM -0700, Junio C Hamano wrote:
> > The only setup that _would_ care is if the generated default is bogus
> > and you set $GIT_COMMITTER_EMAIL in the environment and relied on that
> > to get a sane value. Which is exactly what the test environment does.
>
> Or they worked to create their series in a good machine, pull it down to
> another machine during their lunch break, and run format-patch to send it
> out after the final eyeballing. Perhaps they are not supposed to be
> working on the project in question during the day at work, so the work
> machine does not have user.email set up correctly yet.
True. Although the chances that they have set GIT_COMMITTER_EMAIL in
their environment seem unlikely in that case. In other words, it was
broken before, and it is broken now.
> > The question is, is what it is doing sane and something we should care
> > about? Or is the test broken (it fails to parse the message-id that
> > contains ".(none)", but I am not even sure that is intentional and not
> > simply lazy regex writing in the test).
>
> I doubt that it was carefully written to try to exclude ".(none)".
>
> It somewhat curious---it seems to want to grab everything after "<" up to
> the first occurrence of ">"---why isn't this pattern matching?
I think it is even grosser than that. We follow-up that match with a
search-and-replace using the message-ids we have found as regular
expressions, but we do not bother to quote them. So the ()
metacharacters get interpreted as regular expressions. I suspect
something like this would fix it:
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index b473b6d..3171c06 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -251,7 +251,7 @@ check_threading () {
}
if ($printing) {
$h{$1}=$i++ if (/<([^>]+)>/ and !exists $h{$1});
- for $k (keys %h) {s/$k/$h{$k}/};
+ for $k (keys %h) {s/\Q$k\E/$h{$k}/};
print;
}
print "---\n" if /^From /i;
> > It also strikes me as a little ugly that this code path
> > needs to care about $GIT_COMMITTER_EMAIL at all.
>
> Do you mean "why committer and not author"? It primarily is because we
> want to see "who is this person who wants a unique token tied to his
> identity" and author and committer ident are both equally reasonable
> choices. But we have picked to use committer in these cases long time
> ago.
>
> If you mean "why environment and not an API call?", then I would have to
> agree. ident_committer_email() call, that returns a sanitized version,
> would have been a natural way to write this, if it were available.
I meant the latter. There is no such call, but I can make one. Let me
see how awkward it is.
-Peff
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 17:16 ` Jeff King
2012-05-24 20:07 ` Junio C Hamano
@ 2012-05-24 20:49 ` Michael Haggerty
2012-05-24 21:02 ` Jeff King
1 sibling, 1 reply; 16+ messages in thread
From: Michael Haggerty @ 2012-05-24 20:49 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git discussion list
On 05/24/2012 07:16 PM, Jeff King wrote:
> On Thu, May 24, 2012 at 02:01:57PM +0200, Michael Haggerty wrote:
>
>> On my setup, the above commit causes 12 tests in t4014 to fail. For
>> example, test 25:
>>
>>> -Message-Id:<0>
>>> +Message-Id:<1135adfeed86678c55e1aad7c568046ee8215660.1337860646.git.mhagger@michael.(none)>
>
> Thanks for the report. I know exactly what the issue is, as it came up
> in the discussion of the original series. 43ae9f47ab stopped using
> git_committer_info (which looks at $GIT_COMMITTER_EMAIL) for the end of
> the message-id and started using the default-generated email directly.
>
> Nobody should care, because either:
>
> 1. The defaults set up a reasonable hostname for your machine.
>
> 2. They do not, but you adjust it by setting user.email. Otherwise,
> your author ident would have this bogus email in it.
I'm trying hard not to get sucked into this topic (I just want the test
suite to work again!) but I infer that the reason for the failure in my
setup is that I have a global user.name but no global user.email
configured. I want git to remind me to configure user.email at the
repository level so that I can set my work email address for proprietary
projects and my personal email for open-source projects.
Ignorant idea: since this test is executed in a test repo, would it help
to set a dummy user.name and user.email at the test repository level
using "git config", perhaps as part of the standard test repo setup?
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 20:49 ` Michael Haggerty
@ 2012-05-24 21:02 ` Jeff King
0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 21:02 UTC (permalink / raw)
To: Michael Haggerty; +Cc: Junio C Hamano, git discussion list
On Thu, May 24, 2012 at 10:49:55PM +0200, Michael Haggerty wrote:
> >Nobody should care, because either:
> >
> > 1. The defaults set up a reasonable hostname for your machine.
> >
> > 2. They do not, but you adjust it by setting user.email. Otherwise,
> > your author ident would have this bogus email in it.
>
> I'm trying hard not to get sucked into this topic (I just want the
> test suite to work again!) but I infer that the reason for the
> failure in my setup is that I have a global user.name but no global
> user.email configured.
No, not at all. The problem is that the test suite does not look in your
.gitconfig at all (nor should it), but rather than providing its own
sensible gitconfig, it relies on the environment variables. Your
personal setup should not be relevant; only the fact that your machine
happens to not have a fully qualified hostname.
> I want git to remind me to configure user.email at the repository
> level so that I can set my work email address for proprietary projects
> and my personal email for open-source projects.
What you're doing is sane. However, I suspect that format-patch silently
generates bogus message-ids when you do not have your user.email set
(and it always has). I wonder if it is worth having it barf when
"(none)" is in the email.
For that matter, I really wonder if the "(none)" fallback even makes
sense these days. We encourage people to set up user.*. But I guess it
helps people on remote servers which write reflogs during a push; it is
better to write junk into the reflog than to fail the push.
> Ignorant idea: since this test is executed in a test repo, would it
> help to set a dummy user.name and user.email at the test repository
> level using "git config", perhaps as part of the standard test repo
> setup?
Yes, that would solve your test failure. On the other hand, not having
it has spurred more discussion of this situation, so maybe there is some
good in having it that way.
At any rate, there is a slight complication, because tests sometimes
make their own sub-repositories. A more sensible solution would be to
put it in $HOME/.gitconfig, but that is also complicated. The tests have
$HOME set to the repository directory, so the extra cruft there would
cause some tests to fail (and splitting them into two directories would
likewise cause other tests to fail). That can be remedied, of course,
but it is definitely not a one-line fix.
I think at this point I favor just fixing your bug, either with the
patch I just posted, or a slightly nicer series refactoring fmt_ident to
handle this situation better (which I am working up right now).
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 20:15 ` Jeff King
@ 2012-05-24 23:25 ` Jeff King
2012-05-24 23:26 ` [PATCH 1/7] ident: refactor empty ident error message Jeff King
` (7 more replies)
0 siblings, 8 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
On Thu, May 24, 2012 at 04:15:53PM -0400, Jeff King wrote:
> > If you mean "why environment and not an API call?", then I would have to
> > agree. ident_committer_email() call, that returns a sanitized version,
> > would have been a natural way to write this, if it were available.
>
> I meant the latter. There is no such call, but I can make one. Let me
> see how awkward it is.
Here it is.
[1/7]: ident: refactor empty ident error message
This one is only tangentially related. I was going to touch the message
more in patch 3, but decided not to (details in that patch).
[2/7]: ident: refactor NO_DATE flag in fmt_ident
[3/7]: ident: let callers omit name with fmt_indent
[4/7]: format-patch: use GIT_COMMITTER_EMAIL in message ids
These ones should fix Michael's failing test and restore the original
behavior.
[5/7]: ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
[6/7]: ident: reject bogus email addresses with IDENT_STRICT
[7/7]: format-patch: do not use bogus email addresses in message ids
These ones prevent bogus message ids from being generated at all
(which is an improvement over the previous state).
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/7] ident: refactor empty ident error message
2012-05-24 23:25 ` Jeff King
@ 2012-05-24 23:26 ` Jeff King
2012-05-24 23:26 ` [PATCH 2/7] ident: refactor NO_DATE flag in fmt_ident Jeff King
` (6 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
There's on point in printing the name, since it is by
definition the empty string if we have reached this code
path. Instead, let's be more clear that we are complaining
about the empty name, but still show the email address that
it is attached to (since that may provide some context to
the user).
Signed-off-by: Jeff King <peff@peff.net>
---
As I mentioned in the cover letter, this one is optional. But I think it
makes sense.
ident.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ident.c b/ident.c
index e279039..f5160e1 100644
--- a/ident.c
+++ b/ident.c
@@ -281,7 +281,7 @@ const char *fmt_ident(const char *name, const char *email,
if (error_on_no_name) {
if (name == git_default_name.buf)
fputs(env_hint, stderr);
- die("empty ident %s <%s> not allowed", name, email);
+ die("empty ident name (for <%s>) not allowed", email);
}
pw = xgetpwuid_self();
name = pw->pw_name;
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/7] ident: refactor NO_DATE flag in fmt_ident
2012-05-24 23:25 ` Jeff King
2012-05-24 23:26 ` [PATCH 1/7] ident: refactor empty ident error message Jeff King
@ 2012-05-24 23:26 ` Jeff King
2012-05-24 23:27 ` [PATCH 3/7] ident: let callers omit name with fmt_indent Jeff King
` (5 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
As a short-hand, we extract this flag into the local
variable "name_addr_only". It's more accurate to simply
negate this and refer to it as "want_date", which will be
less confusing when we add more NO_* flags.
While we're touching this part of the code, let's move the
call to ident_default_date() only when we are actually going
to use it, not when we have NO_DATE set, or when we get a
date from the environment.
Signed-off-by: Jeff King <peff@peff.net>
---
Just refactoring for the next patch...
ident.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ident.c b/ident.c
index f5160e1..59beef2 100644
--- a/ident.c
+++ b/ident.c
@@ -268,7 +268,7 @@ const char *fmt_ident(const char *name, const char *email,
static struct strbuf ident = STRBUF_INIT;
char date[50];
int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
- int name_addr_only = (flag & IDENT_NO_DATE);
+ int want_date = !(flag & IDENT_NO_DATE);
if (!name)
name = ident_default_name();
@@ -287,10 +287,13 @@ const char *fmt_ident(const char *name, const char *email,
name = pw->pw_name;
}
- strcpy(date, ident_default_date());
- if (!name_addr_only && date_str && date_str[0]) {
- if (parse_date(date_str, date, sizeof(date)) < 0)
- die("invalid date format: %s", date_str);
+ if (want_date) {
+ if (date_str && date_str[0]) {
+ if (parse_date(date_str, date, sizeof(date)) < 0)
+ die("invalid date format: %s", date_str);
+ }
+ else
+ strcpy(date, ident_default_date());
}
strbuf_reset(&ident);
@@ -298,7 +301,7 @@ const char *fmt_ident(const char *name, const char *email,
strbuf_addstr(&ident, " <");
strbuf_addstr_without_crud(&ident, email);
strbuf_addch(&ident, '>');
- if (!name_addr_only) {
+ if (want_date) {
strbuf_addch(&ident, ' ');
strbuf_addstr_without_crud(&ident, date);
}
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/7] ident: let callers omit name with fmt_indent
2012-05-24 23:25 ` Jeff King
2012-05-24 23:26 ` [PATCH 1/7] ident: refactor empty ident error message Jeff King
2012-05-24 23:26 ` [PATCH 2/7] ident: refactor NO_DATE flag in fmt_ident Jeff King
@ 2012-05-24 23:27 ` Jeff King
2012-05-24 23:28 ` [PATCH 4/7] format-patch: use GIT_COMMITTER_EMAIL in message ids Jeff King
` (4 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
Most callers want to see all of "$name <$email> $date", but
a few want only limited parts, omitting the date, or even
the name. We already have IDENT_NO_DATE to handle the date
part, but there's not a good option for getting just the
email. Callers have to done one of:
1. Call ident_default_email; this does not respect
environment variables, nor does it promise to trim
whitespace or other crud from the result.
2. Call git_{committer,author}_info; this returns the name
and email, leaving the caller to parse out the wanted
bits.
This patch adds IDENT_NO_NAME; it stops short of adding
IDENT_NO_EMAIL, as no callers want it (nor are likely to),
and it complicates the error handling of the function.
When no name is requested, the angle brackets (<>) around
the email address are also omitted.
Signed-off-by: Jeff King <peff@peff.net>
---
I worked this up with IDENT_NO_EMAIL, too, but it really just made the
function harder to read. And I doubt any callers are going to want it.
cache.h | 1 +
ident.c | 14 +++++++++-----
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/cache.h b/cache.h
index 0ac0d57..4a76c23 100644
--- a/cache.h
+++ b/cache.h
@@ -889,6 +889,7 @@ enum date_mode parse_date_format(const char *format);
#define IDENT_ERROR_ON_NO_NAME 1
#define IDENT_NO_DATE 2
+#define IDENT_NO_NAME 4
extern const char *git_author_info(int);
extern const char *git_committer_info(int);
extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
diff --git a/ident.c b/ident.c
index 59beef2..8b5080d 100644
--- a/ident.c
+++ b/ident.c
@@ -269,13 +269,14 @@ const char *fmt_ident(const char *name, const char *email,
char date[50];
int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
int want_date = !(flag & IDENT_NO_DATE);
+ int want_name = !(flag & IDENT_NO_NAME);
- if (!name)
+ if (want_name && !name)
name = ident_default_name();
if (!email)
email = ident_default_email();
- if (!*name) {
+ if (want_name && !*name) {
struct passwd *pw;
if (error_on_no_name) {
@@ -297,10 +298,13 @@ const char *fmt_ident(const char *name, const char *email,
}
strbuf_reset(&ident);
- strbuf_addstr_without_crud(&ident, name);
- strbuf_addstr(&ident, " <");
+ if (want_name) {
+ strbuf_addstr_without_crud(&ident, name);
+ strbuf_addstr(&ident, " <");
+ }
strbuf_addstr_without_crud(&ident, email);
- strbuf_addch(&ident, '>');
+ if (want_name)
+ strbuf_addch(&ident, '>');
if (want_date) {
strbuf_addch(&ident, ' ');
strbuf_addstr_without_crud(&ident, date);
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/7] format-patch: use GIT_COMMITTER_EMAIL in message ids
2012-05-24 23:25 ` Jeff King
` (2 preceding siblings ...)
2012-05-24 23:27 ` [PATCH 3/7] ident: let callers omit name with fmt_indent Jeff King
@ 2012-05-24 23:28 ` Jeff King
2012-05-24 23:28 ` [PATCH 5/7] ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT Jeff King
` (3 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
Before commit 43ae9f4, we generated the tail of a message id
by calling git_committer_info and parsing the email out of
the result. 43ae9f4 changed to use ident_default_email
directly, so we didn't have to bother with parsing. As a
side effect, it meant we no longer used GIT_COMMITTER_EMAIL
at all.
In general, this is probably reasonable behavior. Either the
default email is sane on your system, or you are using
user.email to provide something sane. The exception is if
you rely on GIT_COMMITTER_EMAIL being set all the time to
override the bogus generated email.
This is unlikely to match anybody's real-life setup, but we
do use it in the test environment. And furthermore, it's
what we have always done, and the change in 43ae9f4 was
about cleaning up, not fixing any bug; we should be
conservative and keep the behavior identical.
Signed-off-by: Jeff King <peff@peff.net>
---
This one should fix Michael's test failure. Let me know if it doesn't.
Arguably the call to ident_default_email() in http-push.c should be
converted in the same way. I'm unclear on how that value is actually
used, so it may not matter at all.
builtin/log.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/builtin/log.c b/builtin/log.c
index 8010a40..4538309 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -740,7 +740,8 @@ static void gen_message_id(struct rev_info *info, char *base)
{
struct strbuf buf = STRBUF_INIT;
strbuf_addf(&buf, "%s.%lu.git.%s", base,
- (unsigned long) time(NULL), ident_default_email());
+ (unsigned long) time(NULL),
+ git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE));
info->message_id = strbuf_detach(&buf, NULL);
}
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/7] ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
2012-05-24 23:25 ` Jeff King
` (3 preceding siblings ...)
2012-05-24 23:28 ` [PATCH 4/7] format-patch: use GIT_COMMITTER_EMAIL in message ids Jeff King
@ 2012-05-24 23:28 ` Jeff King
2012-05-24 23:32 ` [PATCH 6/7] ident: reject bogus email addresses with IDENT_STRICT Jeff King
` (2 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
Callers who ask for ERROR_ON_NO_NAME are not so much
concerned that the name will be blank (because, after all,
we will fall back to using the username), but rather it is a
check to make sure that low-quality identities do not end up
in things like commit messages or emails (whereas it is OK
for them to end up in things like reflogs).
When future commits add more quality checks on the identity,
each of these callers would want to use those checks, too.
Rather than modify each of them later to add a new flag,
let's refactor the flag.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/commit.c | 3 +--
builtin/log.c | 2 +-
builtin/merge.c | 4 ++--
builtin/tag.c | 2 +-
builtin/var.c | 4 ++--
cache.h | 2 +-
commit.c | 4 ++--
gpg-interface.c | 2 +-
ident.c | 6 +++---
9 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index a2ec73d..f43eaaf 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -526,8 +526,7 @@ static void determine_author_info(struct strbuf *author_ident)
if (force_date)
date = force_date;
- strbuf_addstr(author_ident, fmt_ident(name, email, date,
- IDENT_ERROR_ON_NO_NAME));
+ strbuf_addstr(author_ident, fmt_ident(name, email, date, IDENT_STRICT));
if (!split_ident_line(&author, author_ident->buf, author_ident->len)) {
export_one("GIT_AUTHOR_NAME", author.name_begin, author.name_end, 0);
export_one("GIT_AUTHOR_EMAIL", author.mail_begin, author.mail_end, 0);
diff --git a/builtin/log.c b/builtin/log.c
index 4538309..d86bca3 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1147,7 +1147,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
if (do_signoff) {
const char *committer;
const char *endpos;
- committer = git_committer_info(IDENT_ERROR_ON_NO_NAME);
+ committer = git_committer_info(IDENT_STRICT);
endpos = strchr(committer, '>');
if (!endpos)
die(_("bogus committer info %s"), committer);
diff --git a/builtin/merge.c b/builtin/merge.c
index 470fc57..dd50a0c 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1447,7 +1447,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
refresh_cache(REFRESH_QUIET);
if (allow_trivial && !fast_forward_only) {
/* See if it is really trivial. */
- git_committer_info(IDENT_ERROR_ON_NO_NAME);
+ git_committer_info(IDENT_STRICT);
printf(_("Trying really trivial in-index merge...\n"));
if (!read_tree_trivial(common->item->object.sha1,
head_commit->object.sha1,
@@ -1490,7 +1490,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
die(_("Not possible to fast-forward, aborting."));
/* We are going to make a new commit. */
- git_committer_info(IDENT_ERROR_ON_NO_NAME);
+ git_committer_info(IDENT_STRICT);
/*
* At this point, we need a real merge. No matter what strategy
diff --git a/builtin/tag.c b/builtin/tag.c
index 4fb6bd7..7b1be85 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -332,7 +332,7 @@ static void create_tag(const unsigned char *object, const char *tag,
sha1_to_hex(object),
typename(type),
tag,
- git_committer_info(IDENT_ERROR_ON_NO_NAME));
+ git_committer_info(IDENT_STRICT));
if (header_len > sizeof(header_buf) - 1)
die(_("tag header too big."));
diff --git a/builtin/var.c b/builtin/var.c
index 99d068a..aedbb53 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -11,7 +11,7 @@ static const char *editor(int flag)
{
const char *pgm = git_editor();
- if (!pgm && flag & IDENT_ERROR_ON_NO_NAME)
+ if (!pgm && flag & IDENT_STRICT)
die("Terminal is dumb, but EDITOR unset");
return pgm;
@@ -55,7 +55,7 @@ static const char *read_var(const char *var)
val = NULL;
for (ptr = git_vars; ptr->read; ptr++) {
if (strcmp(var, ptr->name) == 0) {
- val = ptr->read(IDENT_ERROR_ON_NO_NAME);
+ val = ptr->read(IDENT_STRICT);
break;
}
}
diff --git a/cache.h b/cache.h
index 4a76c23..06413e1 100644
--- a/cache.h
+++ b/cache.h
@@ -887,7 +887,7 @@ unsigned long approxidate_careful(const char *, int *);
unsigned long approxidate_relative(const char *date, const struct timeval *now);
enum date_mode parse_date_format(const char *format);
-#define IDENT_ERROR_ON_NO_NAME 1
+#define IDENT_STRICT 1
#define IDENT_NO_DATE 2
#define IDENT_NO_NAME 4
extern const char *git_author_info(int);
diff --git a/commit.c b/commit.c
index 9ed36c7..8248a99 100644
--- a/commit.c
+++ b/commit.c
@@ -1154,9 +1154,9 @@ int commit_tree_extended(const struct strbuf *msg, unsigned char *tree,
/* Person/date information */
if (!author)
- author = git_author_info(IDENT_ERROR_ON_NO_NAME);
+ author = git_author_info(IDENT_STRICT);
strbuf_addf(&buffer, "author %s\n", author);
- strbuf_addf(&buffer, "committer %s\n", git_committer_info(IDENT_ERROR_ON_NO_NAME));
+ strbuf_addf(&buffer, "committer %s\n", git_committer_info(IDENT_STRICT));
if (!encoding_is_utf8)
strbuf_addf(&buffer, "encoding %s\n", git_commit_encoding);
diff --git a/gpg-interface.c b/gpg-interface.c
index 09ab64a..0863c61 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -30,7 +30,7 @@ const char *get_signing_key(void)
{
if (configured_signing_key)
return configured_signing_key;
- return git_committer_info(IDENT_ERROR_ON_NO_NAME|IDENT_NO_DATE);
+ return git_committer_info(IDENT_STRICT|IDENT_NO_DATE);
}
/*
diff --git a/ident.c b/ident.c
index 8b5080d..c42258f 100644
--- a/ident.c
+++ b/ident.c
@@ -267,7 +267,7 @@ const char *fmt_ident(const char *name, const char *email,
{
static struct strbuf ident = STRBUF_INIT;
char date[50];
- int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
+ int strict = (flag & IDENT_STRICT);
int want_date = !(flag & IDENT_NO_DATE);
int want_name = !(flag & IDENT_NO_NAME);
@@ -279,7 +279,7 @@ const char *fmt_ident(const char *name, const char *email,
if (want_name && !*name) {
struct passwd *pw;
- if (error_on_no_name) {
+ if (strict) {
if (name == git_default_name.buf)
fputs(env_hint, stderr);
die("empty ident name (for <%s>) not allowed", email);
@@ -314,7 +314,7 @@ const char *fmt_ident(const char *name, const char *email,
const char *fmt_name(const char *name, const char *email)
{
- return fmt_ident(name, email, NULL, IDENT_ERROR_ON_NO_NAME | IDENT_NO_DATE);
+ return fmt_ident(name, email, NULL, IDENT_STRICT | IDENT_NO_DATE);
}
const char *git_author_info(int flag)
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 6/7] ident: reject bogus email addresses with IDENT_STRICT
2012-05-24 23:25 ` Jeff King
` (4 preceding siblings ...)
2012-05-24 23:28 ` [PATCH 5/7] ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT Jeff King
@ 2012-05-24 23:32 ` Jeff King
2012-05-24 23:32 ` [PATCH 7/7] format-patch: do not use bogus email addresses in message ids Jeff King
2012-05-25 0:08 ` t4014 broken by 43ae9f47ab: format-patch: use default email for generating " Junio C Hamano
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
If we come up with a hostname like "foo.(none)" because the
user's machine is not fully qualified, we should reject this
in strict mode (e.g., when we are making a commit object),
just as we reject an empty gecos username.
Signed-off-by: Jeff King <peff@peff.net>
---
Note that in conjunction with the previous patch, you can no longer "git
commit" with such a bogus address. I think this is a good thing.
However, it's possible some old-timers might disagree. I remember Linus
a long time ago mentioning that using the machine-name in the committer
line was a _feature_. These days he seems to set user.email to a real
address, though (and I think that is sane these days, because other
tools really want to do use identities as more than just a token. E.g.,
email tools, gpg-signing, etc).
ident.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/ident.c b/ident.c
index c42258f..ca098d9 100644
--- a/ident.c
+++ b/ident.c
@@ -288,6 +288,12 @@ const char *fmt_ident(const char *name, const char *email,
name = pw->pw_name;
}
+ if (strict && email == git_default_email.buf &&
+ !strstr(email, "(none)")) {
+ fputs(env_hint, stderr);
+ die("unable to auto-detect email address (got '%s')", email);
+ }
+
if (want_date) {
if (date_str && date_str[0]) {
if (parse_date(date_str, date, sizeof(date)) < 0)
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 7/7] format-patch: do not use bogus email addresses in message ids
2012-05-24 23:25 ` Jeff King
` (5 preceding siblings ...)
2012-05-24 23:32 ` [PATCH 6/7] ident: reject bogus email addresses with IDENT_STRICT Jeff King
@ 2012-05-24 23:32 ` Jeff King
2012-05-25 0:08 ` t4014 broken by 43ae9f47ab: format-patch: use default email for generating " Junio C Hamano
7 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-24 23:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
We can ask git_committer_info to be strict about coming up
with an email, which will die automatically on a poorly
configured machine. This is better than letting invalid
message-ids into the wild.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/log.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/builtin/log.c b/builtin/log.c
index d86bca3..906dca4 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -741,7 +741,7 @@ static void gen_message_id(struct rev_info *info, char *base)
struct strbuf buf = STRBUF_INIT;
strbuf_addf(&buf, "%s.%lu.git.%s", base,
(unsigned long) time(NULL),
- git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE));
+ git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(&buf, NULL);
}
--
1.7.10.1.25.g7031a0f
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-24 23:25 ` Jeff King
` (6 preceding siblings ...)
2012-05-24 23:32 ` [PATCH 7/7] format-patch: do not use bogus email addresses in message ids Jeff King
@ 2012-05-25 0:08 ` Junio C Hamano
2012-05-25 0:34 ` Jeff King
7 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2012-05-25 0:08 UTC (permalink / raw)
To: Jeff King; +Cc: Michael Haggerty, git discussion list
Jeff King <peff@peff.net> writes:
> On Thu, May 24, 2012 at 04:15:53PM -0400, Jeff King wrote:
>
>> > If you mean "why environment and not an API call?", then I would have to
>> > agree. ident_committer_email() call, that returns a sanitized version,
>> > would have been a natural way to write this, if it were available.
>>
>> I meant the latter. There is no such call, but I can make one. Let me
>> see how awkward it is.
>
> Here it is.
>
> [1/7]: ident: refactor empty ident error message
>
> This one is only tangentially related. I was going to touch the message
> more in patch 3, but decided not to (details in that patch).
>
> [2/7]: ident: refactor NO_DATE flag in fmt_ident
> [3/7]: ident: let callers omit name with fmt_indent
> [4/7]: format-patch: use GIT_COMMITTER_EMAIL in message ids
>
> These ones should fix Michael's failing test and restore the original
> behavior.
>
> [5/7]: ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
> [6/7]: ident: reject bogus email addresses with IDENT_STRICT
> [7/7]: format-patch: do not use bogus email addresses in message ids
>
> These ones prevent bogus message ids from being generated at all
> (which is an improvement over the previous state).
All looked pretty straightforward and cleanly done.
We might want to further tighten 6/7 to verify user-supplied (i.e. non
default) e-mail for sanity, as I agree with the comment below --- lines of
that patch.
Also the check might want to be further tightened in the RFC 822/2822/5322
sense, but getting it correct will open a huge can of worms; I think the
check in 6/7 is a good place to stop, at least for now.
Thanks.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids
2012-05-25 0:08 ` t4014 broken by 43ae9f47ab: format-patch: use default email for generating " Junio C Hamano
@ 2012-05-25 0:34 ` Jeff King
0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2012-05-25 0:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Michael Haggerty, git discussion list
On Thu, May 24, 2012 at 05:08:15PM -0700, Junio C Hamano wrote:
> > [5/7]: ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
> > [6/7]: ident: reject bogus email addresses with IDENT_STRICT
> > [7/7]: format-patch: do not use bogus email addresses in message ids
> >
> > These ones prevent bogus message ids from being generated at all
> > (which is an improvement over the previous state).
>
> All looked pretty straightforward and cleanly done.
>
> We might want to further tighten 6/7 to verify user-supplied (i.e. non
> default) e-mail for sanity, as I agree with the comment below --- lines of
> that patch.
Yeah. I'd be fine with more tightening, but I wanted to just catch the
most common uncontroversial problem in the initial round. We can build
on top later if we want.
> Also the check might want to be further tightened in the RFC 822/2822/5322
> sense, but getting it correct will open a huge can of worms; I think the
> check in 6/7 is a good place to stop, at least for now.
Yeah, I considered that, but got nervous thinking about the same can of
worms. It is not like people are complaining now, so I'd rather leave
it.
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-05-25 0:35 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-24 12:01 t4014 broken by 43ae9f47ab: format-patch: use default email for generating message ids Michael Haggerty
2012-05-24 17:16 ` Jeff King
2012-05-24 20:07 ` Junio C Hamano
2012-05-24 20:15 ` Jeff King
2012-05-24 23:25 ` Jeff King
2012-05-24 23:26 ` [PATCH 1/7] ident: refactor empty ident error message Jeff King
2012-05-24 23:26 ` [PATCH 2/7] ident: refactor NO_DATE flag in fmt_ident Jeff King
2012-05-24 23:27 ` [PATCH 3/7] ident: let callers omit name with fmt_indent Jeff King
2012-05-24 23:28 ` [PATCH 4/7] format-patch: use GIT_COMMITTER_EMAIL in message ids Jeff King
2012-05-24 23:28 ` [PATCH 5/7] ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT Jeff King
2012-05-24 23:32 ` [PATCH 6/7] ident: reject bogus email addresses with IDENT_STRICT Jeff King
2012-05-24 23:32 ` [PATCH 7/7] format-patch: do not use bogus email addresses in message ids Jeff King
2012-05-25 0:08 ` t4014 broken by 43ae9f47ab: format-patch: use default email for generating " Junio C Hamano
2012-05-25 0:34 ` Jeff King
2012-05-24 20:49 ` Michael Haggerty
2012-05-24 21:02 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).