From: A Large Angry SCM <gitzilla@gmail.com>
To: Felipe Contreras <felipe.contreras@gmail.com>,
Michael J Gruber <git@drmicha.warpmail.net>,
Git Mailing List <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: Re: RFD: fast-import is picky with author names (and maybe it should - but how much so?)
Date: Sat, 10 Nov 2012 14:25:57 -0500
Message-ID: <509EAA45.8020005@gmail.com>
In-Reply-To: <CAMP44s219Zi2NPt2vA+6Od_sVstFK85OXZK-9K1OCFpVh220+A@mail.gmail.com>
On 11/10/2012 01:43 PM, Felipe Contreras wrote:
> On Sat, Nov 10, 2012 at 6:28 PM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> Felipe Contreras venit, vidit, dixit 09.11.2012 15:34:
>>> On Fri, Nov 9, 2012 at 10:28 AM, Michael J Gruber
>>> <git@drmicha.warpmail.net> wrote:
>>>
>>>> Hg seems to store just anything in the author field ("committer"). The
>>>> various interfaces that are floating around do some behind-the-back
>>>> conversion to git format. The more conversions they do, the better they
>>>> seem to work (no erroring out) but I'm wondering whether it's really a
>>>> good thing, or whether we should encourage a more diligent approach
>>>> which requires a user to map non-conforming author names wilfully.
>>>
>>> So you propose that when somebody does 'git clone hg::hg hg-git' the
>>> thing should fail. I hope you don't think it's too unbecoming for me
>>> to say that I disagree.
>>
>> There is no need to disagree with a proposal I haven't made. I would
>> disagree with the proposal that I haven't made, too.
>
> All right, we shouldn't encourage a more diligent approach which
> requires a user to map author names then.
>
>>> IMO git fast-import should be the one that converts these bad
>>> authors, not every single tool out there. Maybe throw a warning, but
>>> that's all. Or maybe generate a list of bad authors ready to be filled
>>> out. That way when a project is doing a real conversion, say, when
>>> moving to git, they can run the conversion once and see which authors
>>> are bad and not multiple times, each try taking longer than the next.
>>
>> As Jeff pointed out, git-fast-import expects output conforming to a
>> certain standard, and that's not going to change. import is agnostic to
>> where its import stream is coming from. Only the producer of that stream
>> can have additional information about the provenience of the stream's
>> data which may aid (possibly together with user input or choices) in
>> transforming that into something conforming.
>
> We already know where those import streams come from:
> mercurial, bazaar, etc. There's absolutely nothing the tools exporting
> data from those repositories can do, except try to convert all kinds of
> weird names--and many tools do it poorly.
>
> So, the options are:
>
> a) Leave the name conversion to the export tools, and when they miss
> some weird corner case, like 'Author<email', let the user face the
> consequences, perhaps after an hour of the process.
>
> We know there are sources of data that don't have git-formatted author
> names, so we know every tool out there must do this checking.
>
> In addition to that, let the export tool decide what to do when one of
> these bad names appears, which in many cases probably means do nothing,
> so the user would not even see that such a bad name was there, which
> might not be what they want.
>
> b) Do the name conversion in fast-import itself, perhaps optionally,
> so if a tool missed some weird corner case, the user does not have to
> face the consequences.
>
> The tool writers don't have to worry about this, so we would not have
> tools out there doing a half-assed job of this.
>
> And what happens when such bad names end up being consistent: warning,
> a scaffold mapping of bad names, etc.
>
>
> One is bad for the users and the tool writers, only disadvantages;
> the other is good for the users and the tool writers, only
> advantages.
>
c) Do the name conversion, and whatever other cleanup and manipulations
you're interested in, in a filter between the exporter and git-fast-import.
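
A rough sketch of what such a filter might look like, in Python. It is
only an illustration under some assumptions: the exporter emits dates in
fast-import's default "raw" format (<seconds> SP <offset>), the function
names fix_ident and filter_stream are made up for this example, and the
"unknown" placeholder used when no email can be found is an arbitrary
choice. It copies "data" payloads through untouched so that commit
messages and blobs are never rewritten.

```python
import re

# Matches identity lines that use the raw date format: "<seconds> <+hhmm>".
IDENT_RE = re.compile(rb'^(author|committer|tagger) (.*) (\d+ [+-]\d{4})$')


def fix_ident(raw):
    """Coerce a free-form identity into b'Name <email>' (or b'<email>')."""
    raw = raw.strip()
    m = re.match(rb'^([^<>]*)<([^<>]*)', raw)
    if m:
        # Handles 'Name <email>' as well as broken forms like 'Name<email'.
        name, email = m.group(1).strip(), m.group(2).strip()
    else:
        # No usable '<email' part: keep the text as the name and use a
        # placeholder email (an assumption made for this sketch).
        name = raw.replace(b'<', b'').replace(b'>', b'').strip()
        email = b'unknown'
    ident = b'<' + email + b'>'
    return name + b' ' + ident if name else ident


def filter_stream(src, dst):
    """Copy a fast-import stream from src to dst, fixing identity lines."""
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b'data '):
            dst.write(line)
            arg = line[5:].strip()
            if arg.startswith(b'<<'):
                # Delimited form: copy lines up to and including the delimiter.
                delim = arg[2:]
                while True:
                    body = src.readline()
                    if not body:
                        break
                    dst.write(body)
                    if body.rstrip(b'\n') == delim:
                        break
            else:
                # Exact byte count form: copy the payload verbatim.
                dst.write(src.read(int(arg)))
            continue
        m = IDENT_RE.match(line.rstrip(b'\n'))
        if m:
            line = (m.group(1) + b' ' + fix_ident(m.group(2)) +
                    b' ' + m.group(3) + b'\n')
        dst.write(line)
```

Calling filter_stream(sys.stdin.buffer, sys.stdout.buffer) at the bottom
of a script would let it sit in the pipe between the exporter and
git fast-import. A real conversion would probably also want to log each
identity it rewrites, or consult a user-supplied author map.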