From: A Large Angry SCM <gitzilla@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Christian Couder <chriscool@tuxfamily.org>,
	git@vger.kernel.org
Subject: Re: Google Code: Support for Mercurial and Analysis of Git and Mercurial
Date: Sun, 26 Apr 2009 14:00:24 -0400
Message-ID: <49F4A138.6040808@gmail.com>
In-Reply-To: <alpine.DEB.1.00.0904261943070.10279@pacific.mpi-cbg.de>

Johannes Schindelin wrote:
> Hi,
> 
> On Sun, 26 Apr 2009, A Large Angry SCM wrote:
> 
>> Johannes Schindelin wrote:
>>
>>> On Sun, 26 Apr 2009, A Large Angry SCM wrote:
>>>
>>>> Another important criterion was which of Git and Hg (both or neither) 
>>>> would actually work and perform well on top of Google Code's 
>>>> underlying storage system. Except to mention that they would be using 
>>>> Bigtable, the report did not discuss this. Git on top of Bigtable 
>>>> will not perform well.
>>> Actually, did we not arrive at the conclusion that it could perform 
>>> well at least with the filesystem layer on top of Bigtable, but even 
>>> better if Bigtable stored certain chunks (not really all that 
>>> different from the chunks needed for mirror-sync!)?
>>>
>>> Back when I discussed this with a Googler, it was all too obvious that 
>>> they are not interested (and in the meantime I understand why, see my 
>>> other mail).
>> I don't remember the mirror-sync discussion. But I do remember that when 
>> the discussion turned to implementing a filesystem on top of Bigtable 
>> that would not cause performance problems for Git, my response was that 
>> you'd still be much better off going to GFS directly, without all of the 
>> Bigtable limitations, instead of faking a filesystem on top of Bigtable.
> 
> Umm, GFS is built on top of Bigtable, no?

Other way around: Bigtable is built on top of GFS.

>> Bigtable _is_ appealing to implement the Git object store on. It's too 
>> bad the latency in Bigtable would make it horribly slow.
> 
> If you store one object per Bigtable, yes.  If you store a few undelta'd 
> objects there, and then use the pack run to optimize those tables, I think 
> it would not be horribly slow.  Of course, you'd need to do exactly the 
> same optimizations necessary for mirror-sync, but I might have mentioned 
> that already ;-)

But now you have to find where you stored those "few undelta'd objects" 
and then go get the object you're interested in, which is two lookups 
instead of one. The only way you can win with that scheme is if you can 
find groups of objects that are (almost) always accessed together, and 
can do so for all objects (and still not get tripped up by the other 
limitations of Bigtable).
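
To make the lookup cost concrete, here is a rough sketch of the 
"find the group, then fetch the object" pattern. Nothing here is the 
Bigtable API: CountingTable is a made-up in-memory stand-in where every 
get() counts as one round trip, and the index/groups split is only my 
assumption about how such a scheme might be laid out.

```python
class CountingTable:
    """Hypothetical stand-in for a remote table; each get() is one round trip."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        self.reads += 1
        return self.rows[key]


def build_tables(grouping):
    """grouping maps a group key to {object_id: object_bytes}."""
    index, groups = CountingTable(), CountingTable()
    for group_key, members in grouping.items():
        groups.put(group_key, members)
        for object_id in members:
            index.put(object_id, group_key)
    return index, groups


def read_objects(index, groups, object_ids):
    """Read objects via the index, reusing groups already fetched in this call."""
    fetched = {}  # group key -> group contents, local to this call
    result = {}
    for object_id in object_ids:
        group_key = index.get(object_id)  # lookup 1: which group holds it?
        if group_key not in fetched:
            fetched[group_key] = groups.get(group_key)  # lookup 2: fetch the group
        result[object_id] = fetched[group_key][object_id]
    return result


if __name__ == "__main__":
    grouping = {"g1": {"a": b"1", "b": b"2"}, "g2": {"c": b"3"}}

    index, groups = build_tables(grouping)
    read_objects(index, groups, ["a", "b"])  # both objects live in g1
    print("same group:     ", index.reads + groups.reads, "round trips")

    index, groups = build_tables(grouping)
    read_objects(index, groups, ["a", "c"])  # objects live in different groups
    print("different groups:", index.reads + groups.reads, "round trips")
```

The group fetch is only amortized when the objects you want happen to 
share a group; otherwise every object costs the full two round trips.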

One method would be to group all of the commit objects into one Bigtable 
entry and then create a Bigtable entry for each commit that contains all 
of that commit's trees and blobs. This may be fast enough for some 
operations, but because most trees and blobs are shared between commits, 
copying them into every commit's entry would cause the storage 
requirements to explode.
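
A back-of-the-envelope sketch of that blow-up, under simplifying 
assumptions that are mine and not measurements of anything: a made-up 
history where each commit rewrites exactly one file, blobs only (trees 
are left out), and object IDs that are a plain SHA-1 of the content 
rather than Git's real object format.

```python
import hashlib


def oid(data: bytes) -> str:
    """Simplified content address (no Git type/length header)."""
    return hashlib.sha1(data).hexdigest()


def build_history(num_commits: int, num_files: int):
    """Hypothetical history: commit c rewrites one file, the rest carry over."""
    files = {f"file{i}.txt": f"initial {i}".encode() for i in range(num_files)}
    history = []
    for c in range(num_commits):
        files = dict(files)  # new snapshot, sharing unchanged contents
        files[f"file{c % num_files}.txt"] = f"edit in commit {c}".encode()
        history.append(files)
    return history


def shared_store_size(history) -> int:
    """Blobs stored once each, keyed by content hash."""
    unique = {}
    for snapshot in history:
        for data in snapshot.values():
            unique[oid(data)] = data
    return len(unique)


def per_commit_entries_size(history) -> int:
    """One entry per commit, each holding a copy of every blob it reaches."""
    return sum(len(snapshot) for snapshot in history)


if __name__ == "__main__":
    history = build_history(num_commits=100, num_files=50)
    print("unique blobs:          ", shared_store_size(history))
    print("blobs in per-commit BT:", per_commit_entries_size(history))
```

With 100 commits over 50 files this toy history has 149 unique blobs but 
5000 stored copies under the per-commit layout, and the gap grows with 
both the number of commits and the size of the tree.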
