From: Ramkumar Ramachandra <artagnon@gmail.com>
To: Drew Northup <drew.northup@maine.edu>
Cc: Jonathan Nieder <jrnieder@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Git List <git@vger.kernel.org>,
David Barr <david.barr@cordelta.com>,
Sverre Rabbelier <srabbelier@gmail.com>
Subject: Re: [PATCH 4/5] fast-export: Introduce --inline-blobs
Date: Sat, 22 Jan 2011 14:54:20 +0530
Message-ID: <20110122092416.GA7827@kytes>
In-Reply-To: <1295531623.4298.26.camel@drew-northup.unet.maine.edu>
Hi Drew,
Drew Northup writes:
> On Thu, 2011-01-20 at 10:20 +0530, Ramkumar Ramachandra wrote:
> > Hi,
> > Jonathan Nieder writes:
> > > Junio C Hamano wrote:
> > > > Ramkumar Ramachandra <artagnon@gmail.com> writes:
> > > > Just thinking aloud, but is it possible to write a filter that converts an
> > > > arbitrary G-F-I stream with referenced blobs into a G-F-I stream without
> > > > referenced blobs by inlining all the blobs?
> > >
> > > to avoid complexity in the svn fast-import backend itself.
> > > (Complicating detail: such a filter would presumably take responsibility
> > > for --export-marks, so it might want a way to retrieve commit marks
> > > from its downstream.)
> >
> > This filter will need to persist every blob for the entire lifetime of
> > the program. We can't possibly do it in-memory, so we have to find
> > some way to persist them on-disk and retrieve them very
> > quickly. Jonathan suggested using something like TokyoCabinet earlier;
> > I'll start working and see what I come up with.
>
> Is it worth including the extra dependency? Most systems that I'm in
> frequent contact with have some lightweight BDB implementation
> installed. I don't currently know of any with TokyoCabinet (or
> KyotoCabinet for that matter) already in place. Besides, if all you're
> doing is persisting blobs that you're likely to write out to disk
> eventually anyway, you might as well just do so once you have them and
> keep an "index" (not to be confused with the Git Index, just lacking a
> better word right now) of what you have in some standard in-memory
> format (a heap?). From there you can build each commit into the Git
> Index in the proper order once you have the required parts for
> each--perhaps even re-using the blobs you've already dumped to disk
> (mv'ing them or something).
Agreed. I wouldn't like to introduce an extra dependency either. I was
talking about using it for prototyping; if the final version includes
an extra dependency, it's unlikely to get merged into git.git :) The
final design will probably use an in-memory B+ tree, but I haven't
thought hard enough about that yet.
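
To make the idea discussed above concrete, here is a minimal sketch
(Python for brevity, not git.git code) of a filter that spills each
blob to a temporary file, keeps only a small in-memory index of
mark -> (offset, length), and rewrites "M <mode> :<mark> <path>" lines
into the inline form. The names BlobStore and inline_blobs are
hypothetical. It assumes every blob carries a mark, that all data
commands use the exact-byte-count form, and it does not deal with
--export-marks at all.

```python
import io
import tempfile


class BlobStore:
    """Spill blob bodies to one temp file; keep (offset, length) in memory."""

    def __init__(self):
        self._file = tempfile.TemporaryFile()
        self._index = {}  # mark (bytes such as b":1") -> (offset, length)

    def put(self, mark, data):
        self._file.seek(0, io.SEEK_END)
        self._index[mark] = (self._file.tell(), len(data))
        self._file.write(data)

    def get(self, mark):
        offset, length = self._index[mark]
        self._file.seek(offset)
        return self._file.read(length)


def read_data(stream, header):
    """Read the payload that follows a b'data <count>\\n' header line."""
    count = int(header.split()[1])
    return stream.read(count)


def inline_blobs(src, dst):
    """Copy a fast-import stream from src to dst, inlining marked blobs."""
    store = BlobStore()
    after_blob = False
    while True:
        line = src.readline()
        if not line:
            break
        if after_blob and line == b"\n":
            # Drop the optional LF that trailed a blob we swallowed.
            after_blob = False
            continue
        after_blob = False
        if line == b"blob\n":
            mark = src.readline().split()[1]  # assumes b"mark :N\n" follows
            store.put(mark, read_data(src, src.readline()))
            after_blob = True
        elif line.startswith(b"data "):
            # Commit/tag message: pass the payload through untouched.
            dst.write(line)
            dst.write(read_data(src, line))
        elif line.startswith(b"M ") and line.split(b" ", 3)[2].startswith(b":"):
            _, mode, mark, path = line.rstrip(b"\n").split(b" ", 3)
            blob = store.get(mark)
            dst.write(b"M %s inline %s\n" % (mode, path))
            dst.write(b"data %d\n" % len(blob))
            dst.write(blob + b"\n")
        else:
            dst.write(line)
```

Both arguments are binary file objects, so something like
inline_blobs(sys.stdin.buffer, sys.stdout.buffer) would let it sit in
a pipe after fast-export. A plain dict stands in for the index here;
whether a B+ tree or a heap is the better fit is left open.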
-- Ram