git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Mike Hommey <mh@glandium.org>, git@vger.kernel.org
Subject: Re: [PATCH] Use GIT_COMMITTER_IDENT instead of hardcoded values in import-tars.perl
Date: Mon, 08 Sep 2008 13:40:20 -0700
Message-ID: <7vzlmie5hn.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <alpine.DEB.1.00.0809081649040.13830@pacific.mpi-cbg.de.mpi-cbg.de> (Johannes Schindelin's message of "Mon, 8 Sep 2008 16:51:45 +0200 (CEST)")

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Sun, 7 Sep 2008, Mike Hommey wrote:
>
>> -my $committer_name = 'T Ar Creator';
>> -my $committer_email = 'tar@example.com';
>> +chomp(my $committer_ident = `git var GIT_COMMITTER_IDENT`);
>> +die 'You need to set user name and email'
>> +	unless ($committer_ident =~ s/(.+ <[^>]+>).*/\1/);
>
> I have at least one script that will be broken by this change in behavior.
>
> To me, the issue is just like git-cvsimport, which sets the committer not 
> to the actual committer, so that two people can end up with identical 
> commit names, even if they cvsimported independently.  I'd like the same 
> behavior for import-tars.  I actually use it that way.

I sense there are conflicting goals here.

cvsimport has partial information about the author (only short account
name and nothing else), and by replicating them without taking them
literally you can achieve reproducibility.  At the other extreme, you can
use the authorname mapping file, sacrificing reproducibility with other
people who do not have the identical author mapping file, to obtain a more
readable history with real names.  You can do both.

With the hardcoded 'T Ar Creator', you do not have any choice but strict
reproducibility without readable names.  With Mike's original patch to
make it in line with git-import.{sh,perl}, you still cannot have both,
because setting GIT_COMMITTER_NAME does not affect what the user.name
configuration says.  But with "git var GIT_COMMITTER_IDENT", you could.
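
As a rough illustration of what the quoted patch does with the output of
"git var GIT_COMMITTER_IDENT", here is a minimal Python sketch of the same
substitution.  The function name and the sample ident string are made up
for illustration; the sketch assumes the ident line has the usual
"Name <email> timestamp timezone" shape and does not call git itself.

```python
import re

# Same pattern as in the quoted patch: keep "Name <email>" and drop
# whatever follows (the timestamp and timezone).
IDENT_RE = re.compile(r"(.+ <[^>]+>).*")


def strip_ident_date(ident: str) -> str:
    """Return the "Name <email>" part of an ident line (hypothetical helper)."""
    match = IDENT_RE.fullmatch(ident.strip())
    if match is None:
        raise ValueError("You need to set user name and email")
    return match.group(1)


if __name__ == "__main__":
    # Hypothetical sample value, not taken from a real repository.
    print(strip_ident_date("A U Thor <author@example.com> 1220906420 -0700"))
```

Because "git var" consults the environment before the configuration,
a caller who wants reproducible output could export fixed
GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL values before running the
importer, while everybody else gets their real name.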

This makes me wonder if it might be a better design to:

 * Make fast-import feeders preserve as much information as possible from
   the source material, but none from the environment.  This is half-similar in
   spirit to what cvsimport does---it does not know the timezone so it
   always uses GMT, and it uses the short account name because it is the
   only thing available, but it does not use hardcoded "cvs", and the
   environment can affect it further by setting up an author mapping
   file.  Here I am saying a fast-import feeder shouldn't (and does not
   have to) take the environment into account, if it does not have good
   data in the source material.

   In the context of importing tarballs, zipfiles and an existing directory
   which is a tarball extract, there is not much authorship information in
   the source material (each entry in a tarball may have the owner
   information, but what if your tarball has more than one file, with
   different owners?).

 * Invent a fast-import stream filter that allows you to munge authorship
   and committer information selectively.  Splice that into the pipeline
   between the feeder and fast-import if you want the resulting history
   to be more readable (e.g. use an author mapping file).

   Or you can choose not to use such a filter, and get a reproducible
   result.

If the "filter" turns out to be simple enough, it might even make sense to
make it part of fast-import itself, but that is an implementation
detail.
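
To make the filter idea a bit more concrete, here is a minimal Python
sketch, not an existing tool.  It assumes identities appear on "author",
"committer" and "tagger" lines of the stream, that payloads follow "data"
commands and must be copied through untouched, and that the mapping is a
plain dictionary keyed by the original e-mail address (a stand-in for an
author mapping file).  The function name and the sample identities are
hypothetical.

```python
import io
import re

# "<keyword> [<name> ]<email> <when>"; the name part is optional.
IDENT_LINE = re.compile(rb"(author|committer|tagger) (?:([^<]*) )?<([^>]*)> (.*)")


def filter_stream(src, dst, mapping):
    """Copy a fast-import stream from src to dst, rewriting identities.

    mapping: original e-mail (bytes) -> (new name, new e-mail), both bytes.
    """
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Payloads (commit messages, file contents) are never rewritten.
            dst.write(line)
            arg = line[5:].rstrip(b"\n")
            if arg.startswith(b"<<"):
                # Delimited form: copy lines up to and including the delimiter.
                delim = arg[2:] + b"\n"
                while True:
                    body = src.readline()
                    dst.write(body)
                    if not body or body == delim:
                        break
            else:
                # Counted form: copy exactly that many bytes.
                dst.write(src.read(int(arg)))
            continue
        m = IDENT_LINE.fullmatch(line.rstrip(b"\n"))
        if m and m.group(3) in mapping:
            name, email = mapping[m.group(3)]
            line = b"%s %s <%s> %s\n" % (m.group(1), name, email, m.group(4))
        dst.write(line)


if __name__ == "__main__":
    # In-memory demonstration with made-up identities.
    message = b"import foo-1.0.tar.gz\n"
    stream = (b"commit refs/heads/import-tars\n"
              b"committer T Ar Creator <tar@example.com> 1220906420 -0700\n"
              + b"data %d\n" % len(message) + message)
    out = io.BytesIO()
    filter_stream(io.BytesIO(stream), out,
                  {b"tar@example.com": (b"A U Thor", b"author@example.com")})
    print(out.getvalue().decode())
```

In a real pipeline one would pass sys.stdin.buffer and sys.stdout.buffer
and place the script between the feeder and "git fast-import".  With an
empty mapping the stream passes through unchanged, which corresponds to
the reproducible case.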

Thread overview: 6+ messages
2008-02-24 12:57 [PATCH] Use user.name and user.email in import-tars.perl Mike Hommey
2008-02-24 18:06 ` Junio C Hamano
2008-09-07  8:52   ` [PATCH] Use GIT_COMMITTER_IDENT instead of hardcoded values " Mike Hommey
2008-09-07 17:09     ` Junio C Hamano
2008-09-08 14:51     ` Johannes Schindelin
2008-09-08 20:40       ` Junio C Hamano [this message]