git.vger.kernel.org archive mirror
* Advice on choosing git
@ 2010-05-12  6:31 Noah Silverman
  2010-05-12  9:04 ` Dmitry Potapov
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Noah Silverman @ 2010-05-12  6:31 UTC (permalink / raw)
  To: git

Hi,

I'm looking for both a version control system and backup system.


Up for consideration are Git, Bazaar, and generic Rsync.

In the past, I've just used Rsync to sync up the directories I care
about.  I just sync all the machines with the remote server.  Often,
I'll start working on a file or two at the office, rsync my work to the
server, then rsync them back down to my home machine to keep working at
night.  This works, but doesn't give me any nice VCS features, history,
collaboration, etc.  So clearly it is time to upgrade the system.

I work on a laptop, an office machine, and a home machine.

1) I'd like to keep my documents directory synced between the office and
home machines.
2) I'd like to keep two or three sub directories of this synced with my
laptop
3) We have a server in "the cloud" where I like to keep backups of my
documents.  Just in case.
3) I have a few projects where I am the only developer, but want a VCS to
manage my changes.
4) I have 3-4 projects where there is a team of 3 of us and I want to
use a VCS.

In general, I might work on a given project/file on any of my machines
in a given day.  Not everything is a full "branch", but just some
ongoing work.  I've always followed the practice of backing up any
changed files remotely, just in case.  So with a VCS, I don't want a new
version number every time I change a file.  As I do incremental work
across three machines, it could quickly turn into a versioning nightmare.

I guess that I just need to keep some files backed up (and/or synced) as
they're not "working projects".  I will add new documents and
occasionally edit others, but there is no real need for versioning.  Other
files are working projects (possibly with collaboration) and need an active VCS.

I've heard amazing things about Git, but have a few concerns.  Hopefully
someone here can offer some suggestions.

1) Size.  THIS IS MY MAIN CONCERN - If I want to sync my home, office,
and server Document directories.  From what I have read, I will
effectively have multiple copies of each item on my hard drive, thus
eating up a lot of space (one "working file" and several in the
.git directory).  If I have multiple changes to a file, then I have
several full versions of it on my machine.  This could be a problem for
a directory with 100GB or more, especially on a laptop with limited hard
drive space.  I know Subversion is a dirty word around here, but it
seemed to only annotate and send the changes.

2) Sub-directory selection.  On my laptop, I only want a few
sub-directories to be synced up.  I don't need my whole document tree,
but just a few directories of things I work on.

Bazaar also looks like a possible option, but I'm not sure it handles
drive usage better.  Their website has a lengthy manifesto about how
they're better than Git, but I don't have enough experience with either
to make an informed decision.

Any and all suggestions are welcome and appreciated.

Thank You,

--
Noah


* Re: Advice on choosing git
  2010-05-12  6:31 Advice on choosing git Noah Silverman
@ 2010-05-12  9:04 ` Dmitry Potapov
  2010-05-12  9:15 ` Ramkumar Ramachandra
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Dmitry Potapov @ 2010-05-12  9:04 UTC (permalink / raw)
  To: Noah Silverman; +Cc: git

On Tue, May 11, 2010 at 11:31:34PM -0700, Noah Silverman wrote:
> 
> 1) Size.  THIS IS MY MAIN CONCERN - If I want to sync my home, office,
> and server Document directories.  From what I have read, I will
> effectively have multiple copies of each item on my hard drive, thus
> eating up a lot of space (One of the "working file"and several in the
> .git directory.)

Usually, Git is more efficient in disk space than other DVCSs, because
it uses packs to store files. Each pack contains deltified and then
zlib-compressed data, and this deltification is done not only relative
to the direct ancestor but potentially against any suitable candidate
(there is a heuristic to find the best one). But when you add a new file
to the repository, it is stored just zlib-compressed inside .git/objects.
Such files are often referred to as "loose" in the Git documentation. When
you have a lot of loose objects, the garbage collector is activated and
packs them together. You can also run "git gc" manually, or configure
the threshold for what counts as too many loose objects (the gc.auto
setting).
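To make the "loose object" idea concrete, here is a minimal Python sketch of how the content of one file could end up under .git/objects. It assumes Git's default SHA-1 object format; the function name loose_object is made up for this illustration, and the code only computes what would be stored, without touching any repository.

```python
import hashlib
import zlib


def loose_object(content):
    """Return (object id, path below .git, zlib-compressed bytes) for a blob.

    Assumes the default SHA-1 object format; the helper name is illustrative.
    """
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by
    # the raw content, not the content alone.
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()
    # The first two hex digits name a directory, the remaining 38 the file.
    path = "objects/" + object_id[:2] + "/" + object_id[2:]
    return object_id, path, zlib.compress(store)


if __name__ == "__main__":
    object_id, path, compressed = loose_object(b"hello\n")
    print(object_id)
    print(path, len(compressed), "bytes on disk")
```

Because the name is derived from the content, two identical files map to the same object and are stored only once.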

Even files that are stored as loose objects are never transferred
separately over the network. When you pull or push, all required objects
are packed together into a single pack, and this pack is sent to the
other side. So, on the other side they will never be stored as separate
files. But each push/pull can create a new pack; if you have too many
small packs, git-gc will combine them into a single pack.

However, if you have huge multimedia files, I am not sure how good Git
is at handling them. There were some improvements to Git recently,
and there is a clone of git that specifically focuses on this problem:
http://caca.zoy.org/wiki/git-bigfile
but I don't know much about it.

> several full versions of it on my machine.  This could be a problem for
> a directory with 100GB or more, especially on a laptop with limited hard
> drive space.  I know Subversion is a dirty word around here, but it
> seemed to only annotate and send the changes

Actually, Subversion is very inefficient in space usage (at least,
it was when I last used it). I had a repository where the Subversion
checkout took much more space than the Git working tree and the whole
repository with all its history combined! Obviously, a centralized VCS
does not have to store the whole history on each client, which saves
space, but having the whole history with you is very handy, and it also
avoids having a single point of failure.

BTW, Git lets you make a shallow clone ("git clone --depth <n>") to save
space by not storing the whole history (only the specified number of
revisions), but I have never used this feature, and it has some limitations.

> 
> 2) Sub-directory selection.  On my laptop, I only want a few
> sub-directories to be synced up.  I don't need my whole document tree,
> but just a few directories of things I work on.

Synchronization works on what you have committed to your repository. At
this level, directories are completely irrelevant. You probably
want to have a separate repository for each sub-directory that you
want to synchronize separately, and then you can bundle them together
using the git-submodule mechanism or a trivial shell script that
synchronizes all of them.
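As a rough illustration of the "trivial script" option, here is a Python sketch that builds one pull and one push per repository. The directory names are hypothetical, each is assumed to be its own repository with an upstream already configured, and the "-C" option assumes a reasonably recent Git. The helper names sync_commands and run_all are invented for this example.

```python
import subprocess


def sync_commands(repo_dirs):
    """Build the git commands that would sync each repository in turn."""
    commands = []
    for repo in repo_dirs:
        # Fast-forward to the remote state first, so that the push which
        # follows is less likely to be rejected.
        commands.append(["git", "-C", repo, "pull", "--ff-only"])
        commands.append(["git", "-C", repo, "push"])
    return commands


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical directories; this only prints what would be run.
    for command in sync_commands(["Documents/projects", "Documents/notes"]):
        print(" ".join(command))
```

Keeping the command list separate from run_all makes it easy to do a dry run first; a laptop would simply pass a shorter list of directories.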

In fact, the basic concept of Git is to treat a single repository
as a whole. So, if you have some pieces that are unrelated to each other,
it is better to store them in separate repositories. This will improve
speed and possibly disk usage, because deltification will have an easier
time finding related files, so compression will be better.

> 
> Bazaar also looks like a possible option, but I'm not sure it handles
> drive usage better.  Their website has a lengthy manifesto about how
> they're better than Git, but I don't have enough experience with either
> to make an informed decision.

Well, this manifesto sounds like it was written by a marketing guy, and it
compares Bazaar to a rather old version of Git... So I am not going to
comment on it.

In fact, any meaningful comparison has to consider your workflow. Git
targets a fully distributed workflow, which may even have a hierarchy of
repositories, while Bazaar focuses on a more centralized solution,
close to what you have with Subversion. So, people who are used to a
centralized VCS may find Bazaar easier at the beginning, but IMHO
Git is more flexible, and once you learn the basic principles everything
feels very natural.

In any case, your main concern was the size of the repository, and
even this marketing piece from Bazaar admits that Git is better at
saving disk space.

Here you can see some comparison of a repository size for Git,
Mercurial, Bazaar:
http://vcscompare.blogspot.com/2008/06/git-mercurial-bazaar-repository-size.html



Dmitry


* Re: Advice on choosing git
  2010-05-12  6:31 Advice on choosing git Noah Silverman
  2010-05-12  9:04 ` Dmitry Potapov
@ 2010-05-12  9:15 ` Ramkumar Ramachandra
  2010-05-12  9:24 ` Jonathan Nieder
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Ramkumar Ramachandra @ 2010-05-12  9:15 UTC (permalink / raw)
  To: Noah Silverman; +Cc: git, Dmitry Potapov

Hi,

On Wed, May 12, 2010 at 8:31 AM, Noah Silverman <noah@smartmediacorp.com> wrote:
> I'm looking for both a version control system and backup system.

I recommend git for versioning and bup [1] for backup.

> Bazaar also looks like a possible option, but I'm not sure it handles
> drive usage better.  Their website has a lengthy manifesto about how
> they're better than Git, but I don't have enough experience with either
> to make an informed decision.

Scott Chacon maintains this page [2] that compares Git with other
versioning systems. From my personal experience, I find Bazaar to be
quite horrible and slow. Mercurial is user-friendly, but nowhere near
Git in terms of speed, power, and size.

-- Ram

[1] http://github.com/apenwarr/bup
[2] http://whygitisbetterthanx.com/


* Re: Advice on choosing git
  2010-05-12  6:31 Advice on choosing git Noah Silverman
  2010-05-12  9:04 ` Dmitry Potapov
  2010-05-12  9:15 ` Ramkumar Ramachandra
@ 2010-05-12  9:24 ` Jonathan Nieder
  2010-05-13  0:18 ` Joe Brenner
  2010-05-13 18:20 ` Martin Langhoff
  4 siblings, 0 replies; 13+ messages in thread
From: Jonathan Nieder @ 2010-05-12  9:24 UTC (permalink / raw)
  To: Noah Silverman; +Cc: git, Avery Pennarun

Hi,

Noah Silverman wrote:

> I'm looking for both a version control system and backup system.

I am fond of this question. :)

> I guess, that I need just keep some files backed up (and/or synced) as
> they're not "working projects".  I will add new documents and
> occasionally edit others, but no real need for versioning.

I suggest rsync or unison[1], and to use btrfs locally if you want
snapshots.  I don’t know a good tool for shared snapshots, but that is
probably my ignorance.

In my humble opinion, tools designed for tracking source code, like
git and bzr, are not appropriate for this task.  To illustrate this, I
have put some thoughts about how to cheat git into doing an okay job
in a footnote[4].

> Other files
> are working projects (possible with collaboration) and need active VCS. 

In very small projects, I believe any free DVCS will do.

What tools are you and your collaborators already comfortable with?
I hear it can be hard to unlearn habits from using Subversion when
getting started with Git.  Some other version control systems cater to
that transition better.

As projects scale in size, the speed differences between version
control systems start to matter.  I find myself making larger commits,
looking through history less, and checking email more often when using
certain systems.

> From what I have read, I will
> effectively have multiple copies of each item on my hard drive, thus
> eating up a lot of space (One of the "working file"and several in the
> .git directory.) If I have multiple changes to a file, then I have
> several full versions of it on my machine.

If your files are relatively compressible (or at least rsyncable) and
you pack the repository occasionally, this should not be a
problem.  The relevant page[2] of the Pro Git book probably tells you
more than you wanted to know about this.

Short summary: each file is initially stored in the .git directory as
a compressed file named after its content.  When asked to pack with
the "git gc"[3] command (or automatically if there are too many
unpacked objects around), git puts the data into a larger "pack file",
this time as a delta against some suitable similar blob.
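To give a feel for why storing a new version "as a delta against some suitable similar blob" is cheap, here is a small Python sketch. It does not use Git's actual pack delta format; as a stand-in, it uses zlib's preset-dictionary feature so that the new version can refer back to bytes of the old one. The file contents and the helper name compressed_size are made up.

```python
import hashlib
import zlib


def compressed_size(data, base=None):
    """Size of data after zlib compression, optionally relative to base."""
    if base is None:
        compressor = zlib.compressobj()
    else:
        # A preset dictionary lets zlib refer back to bytes of the older
        # version instead of encoding them again.
        compressor = zlib.compressobj(zdict=base)
    return len(compressor.compress(data) + compressor.flush())


# Two versions of a made-up 200-line file that differ in a single line.
lines = [hashlib.sha1(str(i).encode()).hexdigest() for i in range(200)]
old = "\n".join(lines).encode()
lines[100] = "this line was edited"
new = "\n".join(lines).encode()

if __name__ == "__main__":
    print("new version alone:      ", compressed_size(new))
    print("new version against old:", compressed_size(new, base=old))
```

The second number should come out far smaller than the first, which is the effect packing relies on when many revisions of the same file are stored together.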

For source code (which is already rather compressible), this tends to
work well.  My local git/.git object repository is about 2½ times the
size of the working copy.

> This could be a problem for
> a directory with 100GB or more, especially on a laptop with limited hard
> drive space.

Yes.  Actually, this point is why I replied.  Using a source code
management system as a backup system generally implies this weird
assumption that even the oldest revisions are always worth keeping.

With big, machine-generated files, that doesn’t make sense to me ---
it is better to be able to throw away some snapshots when you are
running low on space.

> 2) Sub-directory selection.  On my laptop, I only want a few
> sub-directories to be synced up.  I don't need my whole document tree,
> but just a few directories of things I work on.

It requires foresight, but you could use a separate filesystem for
this (possibly loop-mounted) if you want to keep snapshots.  With
some symlinks, this would not require changing the directory
structure.

> Any and all suggestions are welcome and appreciated.

Thanks for the food for thought.
Jonathan

[1] http://www.cis.upenn.edu/~bcpierce/unison/
[2] http://progit.org/book/ch9-4.html
[3] http://www.kernel.org/pub/software/scm/git/docs/git-gc.html
[4]
So, you want to use git as a general backup tool?

 . Files should be compressible.  Set appropriate attributes.  Use
   clean and smudge filters[5] to replace the weird working-copy
   representation with a simpler tracked form.  Use !delta[6] where
   appropriate so git knows not to waste its time.

 . Files should be conducive to de-duplication.  Cut large files
   into slices using rsync’s rolling checksum algorithm[7].

 . Backups should be fault-tolerant.  Use par2[8] or zfec[9] to
   protect pack files, maybe.

 . Sometimes metadata (file owners and modes) is important.  Track a
   "restore" script that sets the appropriate metadata, and update it
   before each commit[10].

 . Files should not change as git reads them (or it will error
   out).  Wait for a quiescent state to backup, or make a
   snapshot some other way and ask git to back up that.

 . Old revisions are not precious.  It would be nice to be able to
   decide when each backed-up tree can expire.  My best suggestion is
   to rely on reflogs[11] instead of the revision graph to represent
   your history so old versions can expire, but getting this to work
   nicely would take some work: there is no built-in mechanism to
   transfer reflogs and associated objects to another repository, for
   example.

[5] http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html#_tt_filter_tt
[6] http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html#_tt_delta_tt
[7] http://github.com/apenwarr/bup
[8] http://parchive.sourceforge.net/
[9] http://allmydata.org/trac/zfec
[10] http://kitenet.net/~joey/code/etckeeper/
[11] http://www.kernel.org/pub/software/scm/git/docs/git-reflog.html
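The slicing idea from the footnote (cutting large files at points chosen by a rolling checksum, so that identical regions of two files produce identical slices) can be sketched in Python as follows. The checksum is in the style of rsync's weak rolling checksum; the window size, the mask, and the name split_chunks are illustrative assumptions, not the parameters of any particular tool.

```python
import hashlib


def split_chunks(data, window=64, mask=0x1FFF):
    """Cut data wherever a rolling checksum of the last `window` bytes
    has all the bits in `mask` set, so cut points follow the content."""
    chunks = []
    start = 0
    a = 0  # plain sum of the bytes in the window (mod 2**16)
    b = 0  # position-weighted sum of the same bytes (mod 2**16)
    for i, incoming in enumerate(data):
        # Byte that slides out of the window (nothing while it is filling).
        outgoing = data[i - window] if i >= window else 0
        a = (a - outgoing + incoming) & 0xFFFF
        b = (b - window * outgoing + a) & 0xFFFF
        if (b & mask) == mask:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # whatever is left after the last cut
    return chunks


if __name__ == "__main__":
    # Deterministic pseudo-random bytes standing in for a large file.
    data = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(8192))
    original = split_chunks(data)
    shifted = split_chunks(b"a few inserted bytes" + data)
    shared = set(original) & set(shifted)
    print(len(original), "chunks,", len(shared), "unchanged after the insertion")
```

Because each cut depends only on the preceding window of bytes, inserting data near the start of a file changes the first slice or two and leaves the rest identical, which is what makes de-duplication possible. A real tool would also cap the maximum slice size.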


* Re: Advice on choosing git
  2010-05-12  6:31 Advice on choosing git Noah Silverman
                   ` (2 preceding siblings ...)
  2010-05-12  9:24 ` Jonathan Nieder
@ 2010-05-13  0:18 ` Joe Brenner
  2010-05-13  0:31   ` Avery Pennarun
  2010-05-13 11:42   ` Matthieu Moy
  2010-05-13 18:20 ` Martin Langhoff
  4 siblings, 2 replies; 13+ messages in thread
From: Joe Brenner @ 2010-05-13  0:18 UTC (permalink / raw)
  To: Noah Silverman; +Cc: git


Noah Silverman <noah@smartmediacorp.com> wrote:

> I'm looking for both a version control system and backup system.

I had a similar thought some time ago.  I thought that putting my life
inside of a distributed version control system (my first thought back
then was Monotone) would also be a convenient way to handle the
laptop-workstation sync problem.

But:

> 1) Size.  THIS IS MY MAIN CONCERN - If I want to sync my home, office,
> and server Document directories.  From what I have read, I will
> effectively have multiple copies of each item on my hard drive, thus
> eating up a lot of space

Pretty much any version control system is going to have this problem,
and it gets really bad if you've got any files that aren't straight text.

You won't get any benefit out of things like "git diff" either.  The
diffs we have (these days at least) don't work well on anything but plain
text.

I suggest you stick to using git down on the project level, where a
project should be limited to things like code development (or writing
projects where you stick to text formats), and give up on any ideas like
putting your entire home directory into a single repository.

As far as mirroring machines goes, rsync-based solutions actually aren't
that bad, though in addition to the annoying syntax gotchas, I've had
problems with an unreliable laptop clock.  Lately I've been using the
"--size-only" option of rsync, which ignores timestamps and only
transfers files whose sizes differ.
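For reference, the rsync manual describes "--size-only" as skipping files that match in size, whereas the default quick check compares both size and modification time. A minimal Python sketch of that decision rule follows; the function name and arguments are made up for this illustration, and real rsync has further options (such as --checksum) that it ignores.

```python
def needs_transfer(src_size, src_mtime, dst_size, dst_mtime, size_only=False):
    """Decide whether a file that exists on both sides would be re-sent."""
    if src_size != dst_size:
        return True  # a size difference triggers a transfer in both modes
    if size_only:
        return False  # same size: skipped, even if the timestamps disagree
    return src_mtime != dst_mtime  # the default quick check also compares mtimes


if __name__ == "__main__":
    # Same size, different timestamps (for example, a drifting laptop clock).
    print(needs_transfer(1000, 1273700000, 1000, 1273690000))
    print(needs_transfer(1000, 1273700000, 1000, 1273690000, size_only=True))
```

The flip side is that an edit which happens to leave the file size unchanged will be missed when only sizes are compared.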

I tend to use a perl script something like this, which copies newer
stuff from a given directory to an analogous directory on a remote
machine:

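  use strict;
  use warnings;
  use File::Basename qw( dirname );

  # Copy a local directory to the same parent path on a remote machine.
  my $this  = shift;   # e.g. '/home/doom/dev/code'
  my $there = shift;   # e.g. 'doom@192.168.1.3'
  my $this_loc = dirname( $this );
  # The list form of system() avoids shell quoting problems in the paths.
  my @cmd = ( 'rsync', '-avz', '--size-only', '-e', 'ssh', $this, "${there}:${this_loc}" );
  system( @cmd ) == 0 or die "rsync failed: $?\n";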


* Re: Advice on choosing git
  2010-05-13  0:18 ` Joe Brenner
@ 2010-05-13  0:31   ` Avery Pennarun
  2010-05-13 11:48     ` Matthieu Moy
  2010-05-19  0:37     ` Anthony W. Youngman
  2010-05-13 11:42   ` Matthieu Moy
  1 sibling, 2 replies; 13+ messages in thread
From: Avery Pennarun @ 2010-05-13  0:31 UTC (permalink / raw)
  To: Joe Brenner; +Cc: Noah Silverman, git

On Wed, May 12, 2010 at 8:18 PM, Joe Brenner <doom@kzsu.stanford.edu> wrote:
> Noah Silverman <noah@smartmediacorp.com> wrote:
>> 1) Size.  THIS IS MY MAIN CONCERN - If I want to sync my home, office,
>> and server Document directories.  From what I have read, I will
>> effectively have multiple copies of each item on my hard drive, thus
>> eating up a lot of space
>
> Pretty much any version control system is going to have this problem,
> and it gets really bad if you've got any files that aren't straight text.

Note that most people probably don't need to worry about this
nowadays.  Disk $/gigabyte just keeps dropping and is now at
absolutely abysmally small levels.  You can only fill up your disks if
you download tons of movies and/or create tons of VMs.

If you're struggling with a laptop drive that's too small, just buy a
new one for $100 and solve all your problems.

So you're fine with storing multiple copies.  Just make sure your
backup/syncing software has an expiration algorithm so you don't end
up storing *all* the historical copies.
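To illustrate what an expiration algorithm might look like, here is a Python sketch of one possible thinning policy: keep every snapshot from the last few days, then only the newest snapshot of each week for a while, and let everything older expire. The policy, its default numbers, and the name snapshots_to_keep are all assumptions made up for this example; they do not describe any existing tool.

```python
from datetime import date, timedelta


def snapshots_to_keep(snapshot_days, today, daily_days=7, weekly_weeks=8):
    """Pick which snapshot dates survive under a made-up thinning policy."""
    keep = set()
    weeks_seen = set()
    for day in sorted(snapshot_days, reverse=True):  # newest first
        age = (today - day).days
        if age < daily_days:
            keep.add(day)  # recent: keep every snapshot
        elif age < daily_days + 7 * weekly_weeks:
            week = tuple(day.isocalendar()[:2])  # (ISO year, ISO week number)
            if week not in weeks_seen:
                weeks_seen.add(week)
                keep.add(day)  # older: keep only the newest one of each week
        # anything older than that is allowed to expire
    return sorted(keep)


if __name__ == "__main__":
    today = date(2010, 5, 13)
    snapshots = [today - timedelta(days=n) for n in range(30)]
    kept = snapshots_to_keep(snapshots, today)
    print(len(snapshots), "snapshots,", len(kept), "kept")
```

With a rule like this the number of stored copies grows with the length of the retention window rather than with the number of syncs.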

I'd like to adapt bup to support this usage model eventually.
However, I haven't yet written the expiration algorithm and it doesn't
yet support two-way syncing.  The fundamental design allows for this,
though, so it's just a matter of having some free time.  Meanwhile,
you might want to take a look at something like rdiff-backup.

Have fun,

Avery


* Re: Advice on choosing git
  2010-05-13  0:18 ` Joe Brenner
  2010-05-13  0:31   ` Avery Pennarun
@ 2010-05-13 11:42   ` Matthieu Moy
  2010-05-13 11:51     ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2010-05-13 11:42 UTC (permalink / raw)
  To: Joe Brenner; +Cc: Noah Silverman, git

Joe Brenner <doom@kzsu.stanford.edu> writes:

> You won't get any benefit out of things like "git diff" either.  The
> diffs we have (these days at least) don't work well on anything but plain
> text.

Not totally true. The textconv filter is just great when working with
word-processor files (with the filter being odt2txt or antiword).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: Advice on choosing git
  2010-05-13  0:31   ` Avery Pennarun
@ 2010-05-13 11:48     ` Matthieu Moy
  2010-05-13 17:31       ` Avery Pennarun
  2010-05-19  0:37     ` Anthony W. Youngman
  1 sibling, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2010-05-13 11:48 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Joe Brenner, Noah Silverman, git

Avery Pennarun <apenwarr@gmail.com> writes:

> You can only fill up your disks if
> you download tons of movies and/or create tons of VMs.

Right, but if you do so, managing your movies and VMs with Git would
be a really bad idea. Typically, you don't want your backup system to
try to diff each movie against every other one to save space.

> Just make sure your backup/syncing software has an expiration
> algorithm so you don't end up storing *all* the historical copies.

And this is where Git will be really bad. Removing past revisions
means editing history, and while Git knows how to edit history,
syncing after doing that will be terrible.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: Advice on choosing git
  2010-05-13 11:42   ` Matthieu Moy
@ 2010-05-13 11:51     ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2010-05-13 11:51 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Joe Brenner, Noah Silverman, git

On Thu, May 13, 2010 at 01:42:58PM +0200, Matthieu Moy wrote:

> Joe Brenner <doom@kzsu.stanford.edu> writes:
> 
> > You won't get any benefit out of things like "git diff" either.  The
> > diffs we have (these days at least) don't work well on anything but plain
> > text.
> 
> Not totally true. textconv filter is just great when working when
> word-processors (with the filter being odt2txt or antiword).

I agree. Whoever wrote the textconv code was a genius. ;)

But I did want to note that textconv is just _one_ way of seeing the
data. You can also have git invoke custom diff and merge handlers. I
haven't tried it, but I suspect you may be able to drive the interactive
graphical versioning found in many word processors. I thought somebody
had done some work on this, but I can't seem to dig up a link.

-Peff


* Re: Advice on choosing git
  2010-05-13 11:48     ` Matthieu Moy
@ 2010-05-13 17:31       ` Avery Pennarun
  0 siblings, 0 replies; 13+ messages in thread
From: Avery Pennarun @ 2010-05-13 17:31 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Joe Brenner, Noah Silverman, git

On Thu, May 13, 2010 at 7:48 AM, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> Avery Pennarun <apenwarr@gmail.com> writes:
>> You can only fill up your disks if
>> you download tons of movies and/or create tons of VMs.
>
> Right, but if you do so, managing your movies and VMs with Git would
> be really bad idea. Typically, you don't want your backup system to
> try to diff each movie with each other to save space.

This problem is supposedly solved by the git-bigfiles project.  bup
does things a bit differently, but works well when deduplicating
things like VMs and movies, even though it uses the git repository
format.

>> Just make sure your backup/syncing software has an expiration
>> algorithm so you don't end up storing *all* the historical copies.
>
> And this is where Git will be really bad. Removing past revisions
> means editing history, and while Git knows how to edit history,
> syncing after doing that will be terrible.

Yeah, obviously an SCM doesn't really need history expiration
features, and git's transport protocols are optimized with the
assumption that expiration will never happen.  bup uses a different
protocol so it won't have this problem (but bup doesn't have any
expiration features at all, right now).  rdiff-backup, which I also
mentioned, is efficient in the face of expiration.

Have fun,

Avery


* Re: Advice on choosing git
  2010-05-12  6:31 Advice on choosing git Noah Silverman
                   ` (3 preceding siblings ...)
  2010-05-13  0:18 ` Joe Brenner
@ 2010-05-13 18:20 ` Martin Langhoff
  4 siblings, 0 replies; 13+ messages in thread
From: Martin Langhoff @ 2010-05-13 18:20 UTC (permalink / raw)
  To: Noah Silverman; +Cc: git

On Wed, May 12, 2010 at 2:31 AM, Noah Silverman <noah@smartmediacorp.com> wrote:
> 1) I'd like to keep my documents directory synced between the office and
> home machines.
> 2) I'd like to keep two or three sub directories of this synced with my
> laptop
> 3) We have a server in "the cloud" where I like to keep backups of my
> documents.  Just in case.

Unison is a perfect fit for items 1 & 2. Works well over large files,
does not keep history.

The DSCMs tend to fall over with very large (often binary) files. A
few videos included in your presentation, high-res TIFF images,
horrendously fat PDFs coming from a third party, even an OpenOffice
presentation with many images, all make the DSCMs choke, because
internally the DSCM wants to load each file into RAM to store deltas.

The DSCM assumption is that a source file will fit in RAM with ample
space to spare, to be delta'd (for storage) and diff'd.

> 3) I have a few project where I am the only developer, but want a VCS to
> manage my changes.
> 4) I have 3-4 projects where there are a team of 3 of us and I want to
> use a VCS.

git is a perfect fit for 3 & 4. Mercurial is a close competitor.

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


* Re: Advice on choosing git
  2010-05-13  0:31   ` Avery Pennarun
  2010-05-13 11:48     ` Matthieu Moy
@ 2010-05-19  0:37     ` Anthony W. Youngman
  2010-05-19  1:12       ` Avery Pennarun
  1 sibling, 1 reply; 13+ messages in thread
From: Anthony W. Youngman @ 2010-05-19  0:37 UTC (permalink / raw)
  To: git

In message 
<AANLkTikc6_jZoMzF1VhfJBSk1DRHCNNP3puPT0Z2Usk5@mail.gmail.com>, Avery 
Pennarun <apenwarr@gmail.com> writes
>On Wed, May 12, 2010 at 8:18 PM, Joe Brenner <doom@kzsu.stanford.edu> wrote:
>> Noah Silverman <noah@smartmediacorp.com> wrote:
>>> 1) Size.  THIS IS MY MAIN CONCERN - If I want to sync my home, office,
>>> and server Document directories.  From what I have read, I will
>>> effectively have multiple copies of each item on my hard drive, thus
>>> eating up a lot of space
>>
>> Pretty much any version control system is going to have this problem,
>> and it gets really bad if you've got any files that aren't straight text.
>
>Note that most people probably don't need to worry about this
>nowadays.  Disk $/gigabyte just keeps dropping and is now at
>absolutely abysmally small levels.  You can only fill up your disks if
>you download tons of movies and/or create tons of VMs.
>
>If you're struggling with a laptop drive that's too small, just buy a
>new one for $100 and solve all your problems.

And create a bunch of new ones. I think you mean "buy yourself a new 
laptop"!

Just because YOUR computer is modern and is happy being fed a new bigger 
hard drive doesn't mean they all are. This computer here has 3/4gig ram. 
Tiny by modern standards but I can't put any more in - it only has three 
slots at 256Mb maximum each. And it's got a 250Gb drive but it can only 
use the first 128Gb (I'm being economical with the truth here, but 
hey...)

Anyways. Why should hundreds of people have to throw out thousands of 
serviceable machines just because a few programmers can't be assed to at 
least TRY to be economical with their usage of resources?

Cheers,
Wol
-- 
Anthony W. Youngman - anthony@thewolery.demon.co.uk


* Re: Advice on choosing git
  2010-05-19  0:37     ` Anthony W. Youngman
@ 2010-05-19  1:12       ` Avery Pennarun
  0 siblings, 0 replies; 13+ messages in thread
From: Avery Pennarun @ 2010-05-19  1:12 UTC (permalink / raw)
  To: Anthony W. Youngman; +Cc: git

On Tue, May 18, 2010 at 8:37 PM, Anthony W. Youngman
<wol@thewolery.demon.co.uk> wrote:
> Just because YOUR computer is modern and is happy being fed a new bigger
> hard drive doesn't mean they all are. This computer here has 3/4gig ram.
> Tiny by modern standards but I can't put any more in - it only has three
> slots at 256Mb maximum each. And it's got a 250Gb drive but it can only use
> the first 128Gb (I'm being economical with the truth here, but hey...)
>
> Anyways. Why should hundreds of people have to throw out thousands of
> serviceable machines just because a few programmers can't be assed to at
> least TRY to be economical with their usage of resources?

It's a tradeoff.  There are a bunch of programs that can sync files
back and forth *without* keeping a history - and those tools are
mostly not used.  IMHO that's because they're too complicated and
dangerous; if something goes wrong with your sync, the
mistakenly-deleted-or-modified files are gone for good.  If I care
enough about my files to want to replicate them for safety, then I
care too much about them to trust them to an unpredictable sync
algorithm.

A version control system like git, on the other hand, makes a
different tradeoff: you can be reasonably sure that it'll *never*
permanently lose data, but to get that assurance, you're going to pay
for it in disk space.

If you want to use yesterday's computers, you're probably going to
have to be satisfied with yesterday's solutions.  AFAIK, home
directory replication has never been adequately solved.  Of course,
someone could still come along and invent an elegant, fast, reliable,
space-efficient, trustworthy solution to this problem.  But I don't
think that person has been along yet.

Have fun,

Avery


