From: Felipe Contreras <felipe.contreras@gmail.com>
To: Michael J Gruber <git@drmicha.warpmail.net>
Cc: Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: RFD: fast-import is picky with author names (and maybe it should - but how much so?)
Date: Sat, 10 Nov 2012 19:43:18 +0100
Message-ID: <CAMP44s219Zi2NPt2vA+6Od_sVstFK85OXZK-9K1OCFpVh220+A@mail.gmail.com>
In-Reply-To: <509E8EB2.7040509@drmicha.warpmail.net>
On Sat, Nov 10, 2012 at 6:28 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Felipe Contreras venit, vidit, dixit 09.11.2012 15:34:
>> On Fri, Nov 9, 2012 at 10:28 AM, Michael J Gruber
>> <git@drmicha.warpmail.net> wrote:
>>
>>> Hg seems to store just anything in the author field ("committer"). The
>>> various interfaces that are floating around do some behind-the-back
>>> conversion to git format. The more conversions they do, the better they
>>> seem to work (no erroring out) but I'm wondering whether it's really a
>>> good thing, or whether we should encourage a more diligent approach
>>> which requires a user to map non-conforming author names wilfully.
>>
>> So you propose that when somebody does 'git clone hg::hg hg-git' the
>> thing should fail. I hope you don't think it's too unbecoming for me
>> to say that I disagree.
>
> There is no need to disagree with a proposal I haven't made. I would
> disagree with the proposal that I haven't made, too.
All right, we shouldn't encourage a more diligent approach which
requires a user to map author names then.
>> IMO it should be git fast-import the one that converts these bad
>> authors, not every single tool out there. Maybe throw a warning, but
>> that's all. Or maybe generate a list of bad authors ready to be filled
>> out. That way when a project is doing a real conversion, say, when
>> moving to git, they can run the conversion once and see which authors
>> are bad and not multiple times, each try taking longer than the next.
>
> As Jeff pointed out, git-fast-import expects output conforming to a
> certain standard, and that's not going to change. import is agnostic to
> where its import stream is coming from. Only the producer of that stream
> can have additional information about the provenience of the stream's
> data which may aid (possibly together with user input or choices) in
> transforming that into something conforming.
We already know where those import streams come from: mercurial,
bazaar, etc. There's absolutely nothing the tools exporting data from
those repositories can do, except try to convert all kinds of weird
names--and many tools do it poorly.
So, the options are:
a) Leave the name conversion to the export tools, and when they miss
some weird corner case, like 'Author <email', let the user face the
consequences, perhaps after an hour of the process.
We know there are sources of data that don't have git-formatted author
names, so we know every tool out there must do this checking.
In addition to that, the export tool decides what to do when one of
these bad names appears, which in many cases probably means doing
nothing, so the user would not even see that such a bad name was there,
which might not be what they want.
b) Do the name conversion in fast-import itself, perhaps optionally,
so if a tool missed some weird corner case, the user does not have to
face the consequences.
The tool writers don't have to worry about this, so we would not have
tools out there doing a half-assed job of it.
And what happens when such bad names appear would be consistent across
tools: a warning, a scaffold mapping of bad names, etc.
Option a) is bad for both the users and the tool writers, with only
disadvantages; option b) is good for both, with only advantages.
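To make option b) a bit more concrete, here is a rough Python sketch of
the kind of fallback I mean. It is not fast-import code, and the policy
is made up for illustration: the names normalize_ident, scan and
scaffold, the 'unknown' placeholder email, and the 'bad = guess'
mapping format are all assumptions, not anything git provides today.

```python
import re

# Roughly what fast-import accepts: "Name <email>", with no '<' or '>'
# inside either part.
WELL_FORMED = re.compile(r'^([^<>]*?)\s*<([^<>]*)>$')


def normalize_ident(raw):
    """Return (ident, was_bad) for a raw author string.

    Hypothetical policy: the name is whatever precedes the first '<',
    the email is whatever follows it up to the first '>', and a missing
    email becomes the placeholder 'unknown'.
    """
    raw = raw.strip()
    m = WELL_FORMED.match(raw)
    if m:
        return ('%s <%s>' % (m.group(1), m.group(2))).strip(), False
    name, sep, rest = raw.partition('<')
    name = re.sub(r'[<>]', '', name).strip()
    email = re.sub(r'[<\s]', '', rest.split('>', 1)[0]) if sep else ''
    return ('%s <%s>' % (name, email or 'unknown')).strip(), True


def scan(raw_idents):
    """Collect each bad ident once, paired with the guessed replacement."""
    bad = {}
    for raw in raw_idents:
        ident, was_bad = normalize_ident(raw)
        if was_bad:
            bad.setdefault(raw, ident)
    return bad


def scaffold(bad):
    """Render a mapping the user can edit before the real conversion."""
    return ''.join('%s = %s\n' % (raw, ident)
                   for raw, ident in sorted(bad.items()))
```

The point is only that one pass over the stream can fix the
'Author <email' case, warn about it, and leave behind a list of bad
authors ready to be filled out, instead of dying an hour into the
import.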
--
Felipe Contreras