From: A Large Angry SCM <gitzilla@gmail.com>
To: Felipe Contreras <felipe.contreras@gmail.com>,
	Michael J Gruber <git@drmicha.warpmail.net>,
	Git Mailing List <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: Re: RFD: fast-import is picky with author names (and maybe it should - but how much so?)
Date: Sat, 10 Nov 2012 14:25:57 -0500
Message-ID: <509EAA45.8020005@gmail.com>
In-Reply-To: <CAMP44s219Zi2NPt2vA+6Od_sVstFK85OXZK-9K1OCFpVh220+A@mail.gmail.com>

On 11/10/2012 01:43 PM, Felipe Contreras wrote:
> On Sat, Nov 10, 2012 at 6:28 PM, Michael J Gruber
> <git@drmicha.warpmail.net>  wrote:
>> Felipe Contreras venit, vidit, dixit 09.11.2012 15:34:
>>> On Fri, Nov 9, 2012 at 10:28 AM, Michael J Gruber
>>> <git@drmicha.warpmail.net>  wrote:
>>>
>>>> Hg seems to store just anything in the author field ("committer"). The
>>>> various interfaces that are floating around do some behind-the-back
>>>> conversion to git format. The more conversions they do, the better they
>>>> seem to work (no erroring out) but I'm wondering whether it's really a
>>>> good thing, or whether we should encourage a more diligent approach
>>>> which requires a user to map non-conforming author names wilfully.
>>>
>>> So you propose that when somebody does 'git clone hg::hg hg-git' the
>>> thing should fail. I hope you don't think it's too unbecoming for me
>>> to say that I disagree.
>>
>> There is no need to disagree with a proposal I haven't made. I would
>> disagree with the proposal that I haven't made, too.
>
> All right, we shouldn't encourage a more diligent approach which
> requires a user to map author names then.
>
>>> IMO it should be git fast-import the one that converts these bad
>>> authors, not every single tool out there. Maybe throw a warning, but
>>> that's all. Or maybe generate a list of bad authors ready to be filled
>>> out. That way when a project is doing a real conversion, say, when
>>> moving to git, they can run the conversion once and see which authors
>>> are bad and not multiple times, each try taking longer than the next.
>>
>> As Jeff pointed out, git-fast-import expects output conforming to a
>> certain standard, and that's not going to change. import is agnostic to
>> where its import stream is coming from. Only the producer of that stream
>> can have additional information about the provenience of the stream's
>> data which may aid (possibly together with user input or choices) in
>> transforming that into something conforming.
>
> We already know where the import of those streams come from:
> mercurial, bazaar, etc. There's absolutely nothing the tools exporting
> data from those repositories can do, except try to convert all kind of
> weird names--and many tools do it poorly.
>
> So, the options are:
>
> a) Leave the name conversion to the export tools, and when they miss
> some weird corner case, like 'Author<email', let the user face the
> consequences, perhaps after an hour of the process.
>
> We know there are sources of data that don't have git-formatted author
> names, so we know every tool out there must do this checking.
>
> In addition to that, let the export tool decide what to do when one of
> these bad names appear, which in many cases probably means do nothing,
> so the user would not even see that such a bad name was there, which
> might not be what they want.
>
> b) Do the name conversion in fast-import itself, perhaps optionally,
> so if a tool missed some weird corner case, the user does not have to
> face the consequences.
>
> The tool writers don't have to worry about this, so we would not have
> tools out there doing a half-assed job of this.
>
> And what happens when such bad names end up being consistent: warning,
> a scaffold mapping of bad names, etc.
>
>
> One is bad for the users, and the tools writers, only disadvantages,
> the other is good for the users and the tools writers, only
> advantages.
>

c) Do the name conversion, and whatever other cleanup and manipulations
you're interested in, in a filter between the exporter and git-fast-import.
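
To make (c) concrete, here is a rough sketch of such a filter, not a
finished tool. It assumes the exporter emits dates in fast-import's default
raw format ("<seconds> <+/-hhmm>") and only touches author, committer and
tagger lines; "data" payloads are copied through byte for byte so commit
messages and blobs are never rewritten. The names fix_ident and
filter_stream are made up for this example, and the cleanup policy
(separate name and email, add the missing brackets) is only one possible
choice.

```python
import re
import sys

# Trailing "<seconds> <+/-hhmm>" of an ident line (raw date format assumed).
_WHEN = re.compile(rb" (\d+ [+-]\d{4})$")
_IDENT_CMDS = (b"author ", b"committer ", b"tagger ")


def fix_ident(line: bytes) -> bytes:
    """Rewrite one ident line as 'cmd [Name ]<email> when'."""
    cmd, _, rest = line.rstrip(b"\n").partition(b" ")
    m = _WHEN.search(rest)
    if m is None:
        return line  # unrecognised date format: leave the line alone
    ident, when = rest[:m.start()], m.group(1)
    # Everything before the first '<' is the name, the rest is the email.
    name, lt, tail = ident.partition(b"<")
    email = tail.split(b">")[0] if lt else b""
    # Neither field may contain angle brackets.
    name = name.replace(b">", b"").strip()
    email = email.replace(b"<", b"").strip()
    prefix = name + b" " if name else b""
    return cmd + b" " + prefix + b"<" + email + b"> " + when + b"\n"


def filter_stream(src, dst) -> None:
    """Copy a fast-import stream from src to dst, fixing ident lines."""
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            dst.write(line)
            arg = line[5:].strip()
            if arg.startswith(b"<<"):
                # Delimited form: copy up to and including the delimiter line.
                delim = arg[2:]
                while True:
                    body = src.readline()
                    if not body:
                        break
                    dst.write(body)
                    if body.rstrip(b"\n") == delim:
                        break
            else:
                # Counted form: copy exactly that many bytes untouched.
                dst.write(src.read(int(arg)))
        elif line.startswith(_IDENT_CMDS):
            dst.write(fix_ident(line))
        else:
            dst.write(line)


if __name__ == "__main__":
    filter_stream(sys.stdin.buffer, sys.stdout.buffer)
```

Saved as, say, fix-idents.py, it would sit in the pipeline as
"<exporter> | python3 fix-idents.py | git fast-import". With the
'Author<email' case from above, the line becomes 'Author <email>' and
well-formed idents pass through unchanged.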
