git.vger.kernel.org archive mirror
From: Tomas Carnecky <tom@dbservice.com>
To: Sverre Rabbelier <srabbelier@gmail.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>,
	git@vger.kernel.org, Ramkumar Ramachandra <artagnon@gmail.com>,
	David Michael Barr <david.barr@cordelta.com>
Subject: Re: [PATCH 5/6] Introduce the git fast-import-helper
Date: Sun, 03 Oct 2010 19:39:03 +0200
Message-ID: <4CA8BFB7.2050707@dbservice.com>
In-Reply-To: <AANLkTinZ6NCvKeALDBfP4z=ewkwWVwHBk=C_LmXM7OFh@mail.gmail.com>

On 10/3/10 5:53 PM, Sverre Rabbelier wrote:
>> I only need two new things from fast-import:
>>  1) support non-numeric marks (and even this is maybe not strictly
>> required)
> 
> If this can be avoided, or worked around somehow, it would be a boon
> to performance. The current marks implementation uses a hash table
> indexed by the mark number, which is O(1), very efficient.

I also use a hash table (struct hash_table from hash.h). It's indexed by
the atom, so lookups are about as fast as in the existing implementation,
at the cost of slightly more memory. In my measurements fih
(fast-import-helper) is about 5% slower than fi (fast-import). I also
found that setting max_packfile to 32MB makes the import much faster:
importing the sources of git itself went from 10 minutes down to 3.
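
A rough sketch of a marks table keyed by name rather than by number may
make the trade-off concrete. This is not the code from the patch and it
does not use git's hash.h; it assumes an atom is simply the mark's name
as a string, and struct mark_table, mark_set() and mark_lookup() are
hypothetical names made up for the illustration. Lookups stay O(1) on
average, and the extra memory goes to the stored name strings and the
chain pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MARK_BUCKETS 1024	/* fixed bucket count, to keep the sketch short */

struct mark_entry {
	char *name;			/* mark name; need not be numeric */
	char sha1_hex[41];		/* 40 hex digits plus NUL */
	struct mark_entry *next;	/* chain of entries sharing a bucket */
};

struct mark_table {
	struct mark_entry *buckets[MARK_BUCKETS];
};

/* FNV-1a, a simple string hash (an arbitrary choice for this sketch). */
static uint32_t hash_name(const char *s)
{
	uint32_t h = 2166136261u;
	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Returns the stored hex object name, or NULL if the mark is unknown. */
static const char *mark_lookup(const struct mark_table *t, const char *name)
{
	const struct mark_entry *e = t->buckets[hash_name(name) % MARK_BUCKETS];
	for (; e; e = e->next)
		if (!strcmp(e->name, name))
			return e->sha1_hex;
	return NULL;
}

/* Inserts or overwrites a mark. Returns 0 on success, -1 if out of memory. */
static int mark_set(struct mark_table *t, const char *name, const char *sha1_hex)
{
	uint32_t i = hash_name(name) % MARK_BUCKETS;
	struct mark_entry *e;

	assert(strlen(sha1_hex) == 40);
	for (e = t->buckets[i]; e; e = e->next)
		if (!strcmp(e->name, name))
			break;
	if (!e) {
		size_t len = strlen(name) + 1;
		e = malloc(sizeof(*e));
		if (!e)
			return -1;
		e->name = malloc(len);
		if (!e->name) {
			free(e);
			return -1;
		}
		memcpy(e->name, name, len);
		e->next = t->buckets[i];	/* push onto the bucket's chain */
		t->buckets[i] = e;
	}
	memcpy(e->sha1_hex, sha1_hex, 41);	/* copies the NUL as well */
	return 0;
}
```

A zero-initialized struct mark_table is an empty table, so a caller can
start with `struct mark_table t = {0};` and call mark_set() directly.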

>>  2) dump the mark->sha1 mapping immediately after creating the object
>> (I heard there is a patch somewhere that does just that)
> 
> Why do you need that? Wouldn't "write the created object name to
> stdout" be sufficient?

I do: fprintf(stdout, "mark :%s %s\n", mark, sha1_to_hex(sha1));
One reason for not writing just the plain hash is that this is the same
syntax fih accepts in its input. This way you can do:
  $ ( cat marks; cat fast-export-stream ) | git fast-import-helper >> marks
and restart at any time. A slightly more structured output is also
easier to extend in the future.
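
To illustrate why one line format for both output and input is handy,
here is a minimal sketch of a writer and a matching reader for
"mark :<name> <sha1>" lines. It is an assumption-laden illustration, not
the parser from the patch: write_mark_line() and parse_mark_line() are
hypothetical names, and the reader assumes lowercase 40-digit hex object
names and a mark name without spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes one "mark :<name> <hex>" line, in the format of the fprintf() above. */
static int write_mark_line(FILE *out, const char *name, const char *sha1_hex)
{
	return fprintf(out, "mark :%s %s\n", name, sha1_hex) < 0 ? -1 : 0;
}

/*
 * Parses one such line. name must have room for name_size bytes and
 * sha1_hex for 41 bytes. Returns 0 on success, -1 on a malformed line.
 */
static int parse_mark_line(const char *line, char *name, size_t name_size,
			   char *sha1_hex)
{
	const char *p, *sp;
	size_t n;

	assert(name_size > 0);
	if (strncmp(line, "mark :", 6))
		return -1;
	p = line + 6;
	sp = strchr(p, ' ');		/* the name ends at the first space */
	if (!sp || sp == p)
		return -1;
	n = (size_t)(sp - p);
	if (n >= name_size)
		return -1;
	memcpy(name, p, n);
	name[n] = '\0';

	p = sp + 1;
	n = strspn(p, "0123456789abcdef");
	if (n != 40 || (p[n] != '\n' && p[n] != '\0'))
		return -1;
	memcpy(sha1_hex, p, 40);
	sha1_hex[40] = '\0';
	return 0;
}
```

Because the reader accepts exactly what the writer emits, the output of
one run can be fed back in ahead of the stream on the next run, which is
what the shell pipeline above relies on.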

tom
