From: Michael Haggerty <mhagger@alum.mit.edu>
To: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Thomas Rast <trast@inf.ethz.ch>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Antoine Pelisse <apelisse@gmail.com>,
	Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: Re: [PATCH 4/4] fast-import: only store commit objects
Date: Mon, 06 May 2013 12:28:03 +0200
Message-ID: <518785B3.3050606@alum.mit.edu>
In-Reply-To: <CAMP44s1R9hAMZ=DQoPiTVi3+40NpADjVFU7tYovZA8W-PWEhhg@mail.gmail.com>

On 05/03/2013 08:23 PM, Felipe Contreras wrote:
> On Fri, May 3, 2013 at 12:56 PM, Thomas Rast <trast@inf.ethz.ch> wrote:
>> Felipe Contreras <felipe.contreras@gmail.com> writes:
> 
>> How do we know that this doesn't break any users of fast-import?  Your
>> comment isn't very reassuring:
>>
>>> the vast majority of them will never be used again
>>
>> So what's with the minority?
> 
> Actually I don't think there's any minority. If the client program
> doesn't store blobs, the blob marks are not used anyway. So there's no
> change.

I haven't been following this conversation in detail, but your proposed
change sounds like something that would break cvs2git [1].  Let me
explain what cvs2git does and why:

CVS stores all of the revisions of a single file in a single filename,v
file in rcsfile(5) format.  The revisions are stored as deltas ordered
so that a single revision can be reconstructed from a single serial read
of the file.

cvs2git reads each of these files once, reconstructing *all* of the
revisions for a file in a single go.  It then pours them into a
git-fast-import stream as blobs and sets a mark on each blob.

Only much later in the conversion does it have enough information to
reconstruct tree-wide commits.  At that time it outputs git-fast-import
data (to a second file) defining the git commits and their ancestry.
The contents are defined by referring to the marks of blobs from the
first git-fast-import stream file.

This strategy speeds up the conversion *enormously*.
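
To make the two-stream idea concrete, here is a minimal Python sketch of it. It is not cvs2git's actual code: the helper names (write_blob, write_commit), the sample file contents, and the committer identity are all made up for illustration, and it handles only simple paths without spaces or quoting. The first pass emits every revision of every file as a marked blob; the second pass, run later, emits commits that name their file contents purely by those marks.

```python
import io


def write_blob(out, mark, content):
    """Append one 'blob' command, labelled with a mark, to a fast-import stream."""
    out.write(b"blob\n")
    out.write(b"mark :%d\n" % mark)
    out.write(b"data %d\n" % len(content))  # byte count of the payload
    out.write(content)
    out.write(b"\n")


def write_commit(out, ref, committer, message, files):
    """Append one 'commit' command whose file contents are given as blob marks."""
    msg = message.encode("utf-8")
    out.write(b"commit %s\n" % ref.encode("utf-8"))
    out.write(b"committer %s\n" % committer.encode("utf-8"))
    out.write(b"data %d\n" % len(msg))
    out.write(msg)
    out.write(b"\n")
    for path, mark in sorted(files.items()):
        # ':<mark>' refers back to a blob defined earlier in the same import.
        out.write(b"M 100644 :%d %s\n" % (mark, path.encode("utf-8")))
    out.write(b"\n")


# Pass 1: read each file once and emit *all* of its revisions as blobs.
# (Hypothetical data standing in for what would be parsed from filename,v.)
revisions = {
    "hello.c": [b"int main(void) { return 0; }\n",
                b"int main(void) { return 1; }\n"],
    "README": [b"first draft\n"],
}
blob_stream = io.BytesIO()
marks = {}  # (path, revision number) -> mark
next_mark = 1
for path, contents in revisions.items():
    for rev, content in enumerate(contents, start=1):
        write_blob(blob_stream, next_mark, content)
        marks[(path, rev)] = next_mark
        next_mark += 1

# Pass 2: much later, once tree-wide commits are known, emit them to a
# second stream that refers to the blobs only by mark.
commit_stream = io.BytesIO()
who = "A U Thor <author@example.com> 1367836083 +0200"  # made-up identity
write_commit(commit_stream, "refs/heads/master", who, "Initial import\n",
             {"hello.c": marks[("hello.c", 1)], "README": marks[("README", 1)]})
write_commit(commit_stream, "refs/heads/master", who, "Change exit status\n",
             {"hello.c": marks[("hello.c", 2)]})
```

The commit stream is only meaningful if the importer still remembers the blob marks when it reads it, for example when both streams are fed to the same git fast-import run one after the other. That dependency is exactly what I am worried about.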

So if I understand correctly that you are proposing to stop allowing
marks on blob objects to be set and/or referred to later, then I object
vociferously.

If I've misunderstood then I'll go back into my hole :-)

Michael

[1] http://cvs2svn.tigris.org/cvs2git.html

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
