git.vger.kernel.org archive mirror
* Git vs Monotone
@ 2008-07-31 18:13 Sverre Rabbelier
  2008-07-31 18:33 ` Stephen R. van den Berg
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Sverre Rabbelier @ 2008-07-31 18:13 UTC (permalink / raw)
  To: Git Mailinglist

Heya,

I just read this blog post [0] in which one of the Pidgin devs sheds
his light on their 'tool choice'. In the post he mentions the
following figures:

"I don't mind the database, myself. I have 11 working copies
(checkouts) from my single pidgin database (8 distinct branches, plus
duplicates of the last three branches I worked on or tested with).
Each clean checkout (that is, a checkout prior to running autogen.sh
and building) is approximately 61 MB. If this were SVN, each working
copy would be approximately 122 MB due to svn keeping a pristine copy
of every file to facilitate 'svn diff' and 'svn revert' without
needing to contact the server the working copy was pulled from. Now,
let's add that up. For SVN, I would have 11 times 122 MB, or 1342 MB,
just in working copies. For monotone, I have 11 times 61 MB for the
working copies (671 MB), plus 229 MB for the database, for a grand
total of 900 MB. For me, this is an excellent bargain, as I save 442
MB of disk space thanks to the monotone model. For another compelling
comparison that's sure to ruffle a few feathers, let's compare to git.
If I clone the git mirror of our monotone repository, I find a
checkout size of 148 MB after git-repack--running git-gc also
increased the size by 2 MB, but I'll stick with the initial checkout
size for fairness. If I multiply this by my 11 checkouts, I will have
1628 MB. This is even more compelling for me, as I now save 728 MB of
disk space with monotone."

I'm in the process of cloning the repo myself, and will check if doing
a more aggressive (high --window and --depth values) repack will get
us below that 148, but I'm thinking it's just that big a repo. Anyway,
it seems git is getting screwed over in this post because he is not
taking advantage of git's object-database-sharing capabilities. Am I
right in thinking that with git-new-workdir we would end up at
61*11+148 = 819MB? (Which would actually put us below monotone by
81MB.) Not that I care much whether monotone or git uses less disk
space; I'm just curious whether we indeed offer this capability.
Perhaps someone with more knowledge of git-new-workdir could shed some
light?
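
A minimal sketch of the arithmetic above, using only the figures quoted
from the blog post (all in MB). The variable names are made up for
illustration, and the "shared" case assumes one 148 MB clone plus 11
plain working trees, as in the question:

```shell
#!/bin/sh
# Figures quoted from the blog post, in MB.
checkouts=11
tree=61        # one clean working tree
svn_tree=122   # SVN working copy, including pristine copies
mtn_db=229     # monotone database
git_clone=148  # one full git clone: working tree plus .git

svn_total=$((checkouts * svn_tree))
mtn_total=$((checkouts * tree + mtn_db))
git_naive=$((checkouts * git_clone))
# Assumption from the question above: one shared copy plus 11 trees.
git_shared=$((checkouts * tree + git_clone))

echo "svn=$svn_total monotone=$mtn_total git-full-clones=$git_naive git-shared=$git_shared"
```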

[0] http://theflamingbanker.blogspot.com/2008/07/holy-war-of-tool-choice.html

--
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
@ 2008-07-31 18:33 ` Stephen R. van den Berg
  2008-07-31 18:52   ` Petr Baudis
  2008-07-31 19:02 ` Jeff King
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Stephen R. van den Berg @ 2008-07-31 18:33 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git Mailinglist

Sverre Rabbelier wrote:
>If I clone the git mirror of our monotone repository, I find a
>checkout size of 148 MB after git-repack--running git-gc also
>increased the size by 2 MB, but I'll stick with the initial checkout
>size for fairness. If I multiply this by my 11 checkouts, I will have
>1628 MB. This is even more compelling for me, as I now save 728 MB of
>disk space with monotone."

You have at least two options to reduce diskspace:
a. Clone once from remote, then clone from that clone.  It should
   hardlink the larger packfiles to the initial clone and therefore not
   cost you a lot.
b. Clone once from remote, and create 11 branches inside the new cloned
   repo.  Switch branches while doing development.

Most git users pick b.  It's easier to work with.  Having 11 unpacked
repos means that all the object files in those trees are almost up to
date, but it adds to the complexity of comparing changes and merging
changes between branches.  The compilation speed can be increased with
ccache if need be.
-- 
Sincerely,
           Stephen R. van den Berg.
"There are three types of people in this world: those who make things happen,
 those who watch things happen and those who wonder what happened."


* Re: Git vs Monotone
  2008-07-31 18:33 ` Stephen R. van den Berg
@ 2008-07-31 18:52   ` Petr Baudis
  0 siblings, 0 replies; 33+ messages in thread
From: Petr Baudis @ 2008-07-31 18:52 UTC (permalink / raw)
  To: Stephen R. van den Berg; +Cc: Sverre Rabbelier, Git Mailinglist

On Thu, Jul 31, 2008 at 08:33:17PM +0200, Stephen R. van den Berg wrote:
> Sverre Rabbelier wrote:
> >If I clone the git mirror of our monotone repository, I find a
> >checkout size of 148 MB after git-repack--running git-gc also
> >increased the size by 2 MB, but I'll stick with the initial checkout
> >size for fairness. If I multiply this by my 11 checkouts, I will have
> >1628 MB. This is even more compelling for me, as I now save 728 MB of
> >disk space with monotone."
> 
> You have at least two options to reduce diskspace:
> a. Clone once from remote, then clone from that clone, it should
>    hardlink the larger packfiles to the initial clone and therefore not
>    cost you a lot.
> b. Clone once from remote, and create 11 branches inside the new cloned
>    repo.  Switch branches while doing development.
> 
> Most git users pick b.  It's easier to work with.  Having 11 unpacked
> repos means that all the object files in those trees are almost up to
> date, but it adds to the complexity of comparing changes and merging
> changes between branches.  The compilation speed can be increased with
> ccache if need be.

c. Still clone from the remote, but set up alternates to a single
local "reference repository". Then all common objects will be stored
only once in this reference repository. The advantage over (a) is that
your remotes are actually set up sensibly.
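
Option (c) might look roughly like the sketch below. It uses
"git clone --reference", which sets up the alternates for you. The
directory names are hypothetical, and a throwaway local repository
stands in for the remote so that the example runs without network
access:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository (a real setup would use its URL).
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# One local reference repository that holds the common objects.
git clone -q --bare upstream reference.git

# Each working clone keeps the "remote" as its origin, but records the
# reference repository in .git/objects/info/alternates.
git clone -q --reference "$work/reference.git" upstream tree-1
git clone -q --reference "$work/reference.git" upstream tree-2

cat tree-1/.git/objects/info/alternates
git -C tree-1 config remote.origin.url
```

The clones depend on the reference repository staying in place; if it
is deleted or pruned, they can lose objects.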

(Note that the blog post talks about .git + checkout sizes, in case
someone got confused like I did, counting only .git. :-)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie


* Re: Git vs Monotone
  2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
  2008-07-31 18:33 ` Stephen R. van den Berg
@ 2008-07-31 19:02 ` Jeff King
  2008-07-31 19:11   ` Craig L. Ching
  2008-07-31 19:19   ` Sverre Rabbelier
  2008-07-31 19:17 ` Linus Torvalds
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: Jeff King @ 2008-07-31 19:02 UTC (permalink / raw)
  To: sverre; +Cc: Git Mailinglist

On Thu, Jul 31, 2008 at 08:13:59PM +0200, Sverre Rabbelier wrote:

> If I clone the git mirror of our monotone repository, I find a
> checkout size of 148 MB after git-repack--running git-gc also
> increased the size by 2 MB, but I'll stick with the initial checkout
> size for fairness. If I multiply this by my 11 checkouts, I will have
> 1628 MB. This is even more compelling for me, as I now save 728 MB of
> disk space with monotone."

Yikes. This is not even remotely a fair comparison to monotone, which is
keeping a central db.

> I'm in the process of cloning the repo myself, and will check if doing
> a more aggressive (high --window and --depth values) repack will get
> us below that 148, but I'm thinking it's just that big a repo. Anyway,

It's much better than that. I just cloned

  git://github.com/felipec/pidgin-clone.git

and the _whole thing_ is 148M, including the working tree. His object db
is only 88M. So he can do his 11 trees in 61 * 11 + 88 = 759M, saving
141M over monotone.
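
One rough way to get that split between object database and working
tree for any clone is to compare "du" on the whole directory with "du"
on its .git directory. The helper name repo_sizes is made up for this
sketch, and the numbers are in KB as reported by "du -sk":

```shell
# Hypothetical helper: print the size of .git and of the working tree
# for the non-bare repository at path $1.
repo_sizes() {
    total_kb=$(du -sk "$1" | cut -f1)
    gitdir_kb=$(du -sk "$1/.git" | cut -f1)
    tree_kb=$((total_kb - gitdir_kb))
    echo "object db + metadata: ${gitdir_kb} KB, working tree: ${tree_kb} KB"
}

# Usage (path is a placeholder): repo_sizes /path/to/pidgin-clone
```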

And I am repacking with insane depth and window right now to see if we
can get it smaller (though really, it is not that big a deal, since the
size is dominated by his 11 working trees).

-Peff


* RE: Git vs Monotone
  2008-07-31 19:02 ` Jeff King
@ 2008-07-31 19:11   ` Craig L. Ching
  2008-07-31 19:19   ` Sverre Rabbelier
  1 sibling, 0 replies; 33+ messages in thread
From: Craig L. Ching @ 2008-07-31 19:11 UTC (permalink / raw)
  To: Jeff King, sverre; +Cc: Git Mailinglist

 

> -----Original Message-----
> From: git-owner@vger.kernel.org 
> [mailto:git-owner@vger.kernel.org] On Behalf Of Jeff King
> Sent: Thursday, July 31, 2008 2:02 PM
> To: sverre@rabbelier.nl
> Cc: Git Mailinglist
> Subject: Re: Git vs Monotone
> 
> On Thu, Jul 31, 2008 at 08:13:59PM +0200, Sverre Rabbelier wrote:
> 
> > If I clone the git mirror of our monotone repository, I find a 
> > checkout size of 148 MB after git-repack--running git-gc also 
> > increased the size by 2 MB, but I'll stick with the initial 
> checkout 
> > size for fairness. If I multiply this by my 11 checkouts, I 
> will have
> > 1628 MB. This is even more compelling for me, as I now save 
> 728 MB of 
> > disk space with monotone."
> 
> Yikes. This is not even remotely a fair comparison to 
> monotone, which is keeping a central db.
> 
I think it is a fair comparison, but as you point out, the author is
doing the comparison wrong.  Monotone's "central db" (as you call it) is
really equivalent to git's object database.

> > I'm in the process of cloning the repo myself, and will 
> check if doing 
> > a more aggressive (high --window and --depth values) repack 
> will get 
> > us below that 148, but I'm thinking it's just that big a 
> repo. Anyway,
> 
> It's much better than that. I just cloned
> 
>   git://github.com/felipec/pidgin-clone.git
> 
> and the _whole thing_ is 148M, including the working tree. 
> His object db is only 88M. So he can do his 11 trees in 61 * 
> 11 + 88 = 759M, saving 141M over monotone.
> 
Right, that's been my experience too, that git is smaller than monotone.
The author just needs to compare equivalent concepts ;-)

> -Peff
> --

Cheers,
Craig


* Re: Git vs Monotone
  2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
  2008-07-31 18:33 ` Stephen R. van den Berg
  2008-07-31 19:02 ` Jeff King
@ 2008-07-31 19:17 ` Linus Torvalds
  2008-07-31 19:28   ` Craig L. Ching
  2008-07-31 19:48   ` Monotone workflow compared to Git workflow ( was RE: Git vs Monotone) Craig L. Ching
  2008-07-31 19:24 ` Git vs Monotone Theodore Tso
  2008-08-01  7:23 ` Sverre Rabbelier
  4 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 19:17 UTC (permalink / raw)
  To: sverre; +Cc: Git Mailinglist



On Thu, 31 Jul 2008, Sverre Rabbelier wrote:
> 
> I just read this blog post [0] in which one of the Pidgin devs sheds
> his light on their 'tool choice'. In the post he mentions the
> following figures:

Don't even bother. The guy is apparently not even trying to work with his 
tools, he just has an agenda to push.

Quite frankly, anybody who wants to stay with monotone, we should 
_encourage_ them. They add nothing to any possible project, because they 
are clearly not very intelligent.

The guy is apparently happy using a single database for monotone (which 
apparently has a database that is two times the size of the git one), but 
then doesn't want to use a single database for git, but wants to force a 
full clone for each. Not to mention that in git, you'd normally not do 11 
clones to begin with, you'd just do 11 branches in one repo.

So there is no point discussing things with people like that. If he wants 
to skew things in monotone's favor, he can do it. Let him. 

			Linus


* Re: Git vs Monotone
  2008-07-31 19:02 ` Jeff King
  2008-07-31 19:11   ` Craig L. Ching
@ 2008-07-31 19:19   ` Sverre Rabbelier
  2008-07-31 20:32     ` Jeff King
  1 sibling, 1 reply; 33+ messages in thread
From: Sverre Rabbelier @ 2008-07-31 19:19 UTC (permalink / raw)
  To: Jeff King, Craig L. Ching, Petr Baudis, Stephen R. van den Berg
  Cc: Git Mailinglist

On Thu, Jul 31, 2008 at 21:02, Jeff King <peff@peff.net> wrote:
> and the _whole thing_ is 148M, including the working tree. His object db
> is only 88M. So he can do his 11 trees in 61 * 11 + 88 = 759M, saving
> 141M over monotone.

Yeah, that's rather unfair indeed, counting that way he'd have to add
the 229MB for the Monotone db too ;).

> And I am repacking with insane depth and window right now to see if we
> can get it smaller (though really, it is not that big a deal, since the
> size is dominated by his 11 working trees).

I repacked with --depth=100 and --window=100; I tried 500 at first,
but it was just insanely slow (on a VM with one 2.4GHz core
available). This resulted in a .git dir of 76MB. With that dir I did
the following:
$ mkdir pidgins
$ git clone --no-hardlinks --bare pidgin pidgin-bare
$ mv pidgin-bare pidgins
$ cd pidgins
$ for i in 1 2 3 4 5 6 7 8 9 10 11; do git clone pidgin-bare pidgin$i; done
$ du -sh .
742M    .

So... monotone, eat your heart out ;).
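
The exact repack invocation is not given above; a plausible one, shown
here as an assumption, is "git repack -a -d -f" with the depth and
window values mentioned. The sketch runs it on a throwaway repository
(the name "demo" is hypothetical) so it can be tried without cloning
anything:

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
git -C demo -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# -a: put everything into a single pack
# -d: delete packs made redundant by the new one
# -f: recompute deltas instead of reusing existing ones
git -C demo repack -a -d -f -q --depth=100 --window=100

# Show how many packs there are and how much space they take.
git -C demo count-objects -v
```

Larger --window and --depth values trade repack time for a smaller
pack.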

-- 
Cheers,

Sverre Rabbelier


* Re: Git vs Monotone
  2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
                   ` (2 preceding siblings ...)
  2008-07-31 19:17 ` Linus Torvalds
@ 2008-07-31 19:24 ` Theodore Tso
  2008-08-01  7:23 ` Sverre Rabbelier
  4 siblings, 0 replies; 33+ messages in thread
From: Theodore Tso @ 2008-07-31 19:24 UTC (permalink / raw)
  To: sverre; +Cc: Git Mailinglist

On Thu, Jul 31, 2008 at 08:13:59PM +0200, Sverre Rabbelier wrote:
> 
> I just read this blog post [0] in which one of the Pidgin devs sheds
> his light on their 'tool choice'. In the post he mentions the
> following figures:

The main thing this proves was that the Pidgin devs were most familiar
with Monotone, and weren't sufficiently familiar with git; hence, they
didn't know how to do a fair comparison.  First of all, sure, if they
are willing to use a single working directory and want to switch
between branches using "git checkout", that works well.  But suppose
they really want separate working directories.  The simplest and
easiest way is to use "git clone -s".

So if they do:

git clone git://github.com/felipec/pidgin-clone.git pidgin
git clone -s pidgin clone-1
git clone -s pidgin clone-2
git clone -s pidgin clone-3
git clone -s pidgin clone-4
git clone -s pidgin clone-5
git clone -s pidgin clone-6
git clone -s pidgin clone-7
git clone -s pidgin clone-8
git clone -s pidgin clone-9
git clone -s pidgin clone-10

The net disk usage is 746 megabytes, as compared to the 900 megabytes
claimed in the blog post.  The main difference is that the git database
only takes 87 megabytes, compared to the 229 megabytes for the
Monotone database.  The main issue is the pidgin developers simply
didn't know how to use the -s flag so they didn't need to duplicate
the git database for every single clone.
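
The same setup can be written as a loop. In this sketch a throwaway
local repository stands in for the initial clone from the network (that
substitution is an assumption so the example runs offline); the last
command shows where a shared clone borrows its objects from:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the initial "git clone <url> pidgin".
git init -q pidgin
git -C pidgin -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# Ten shared clones, clone-1 through clone-10.
i=1
while [ "$i" -le 10 ]; do
    git clone -q -s pidgin "clone-$i"
    i=$((i + 1))
done

# Each shared clone records the source object directory here.
cat clone-1/.git/objects/info/alternates
```

Because the clones borrow objects from "pidgin", objects should not be
pruned from that repository while the clones still need them.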

Shrug; whatever, I've always said the biggest issue for any tool is
what the developers are familiar with.  It may be that monotone was
the right choice for the pidgin core developers, if they weren't
familiar enough with git.

						- Ted


* RE: Git vs Monotone
  2008-07-31 19:17 ` Linus Torvalds
@ 2008-07-31 19:28   ` Craig L. Ching
  2008-07-31 19:52     ` Linus Torvalds
  2008-07-31 19:48   ` Monotone workflow compared to Git workflow ( was RE: Git vs Monotone) Craig L. Ching
  1 sibling, 1 reply; 33+ messages in thread
From: Craig L. Ching @ 2008-07-31 19:28 UTC (permalink / raw)
  To: Linus Torvalds, sverre; +Cc: Git Mailinglist

 

> [mailto:git-owner@vger.kernel.org] On Behalf Of Linus Torvalds
> Sent: Thursday, July 31, 2008 2:18 PM
> Subject: Re: Git vs Monotone
> 
> On Thu, 31 Jul 2008, Sverre Rabbelier wrote:
> > 
> The guy is apparently happy using a single database for 
> monotone (which apparently has a database that is two times 
> the size of the git one), but then doesn't want to use a 
> single database for git, but wants to force a full clone for 
> each. Not to mention that in git, you'd normally not do 11 
> clones to begin with, you'd just do 11 branches in one repo.
> 

Having come from monotone to git recently, I have to say that it isn't
immediately obvious how you get the single database for git a la
monotone (with remotes that point to the right place, etc.).  At first,
I also thought that you didn't share the object database on clones and I
had to discover that myself.  It's possible that I'm just an idiot too
;-)

> So there is no point discussing things with people like that. 
> If he wants to skew things in monotone's favor, he can do it. 
> Let him. 
> 

It's possible he's doing that, but it's also possible he just isn't that
familiar with git.

> 			Linus
> --

Cheers,
Craig


* Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 19:17 ` Linus Torvalds
  2008-07-31 19:28   ` Craig L. Ching
@ 2008-07-31 19:48   ` Craig L. Ching
  2008-07-31 20:09     ` Linus Torvalds
  2008-07-31 20:57     ` Sean Estabrooks
  1 sibling, 2 replies; 33+ messages in thread
From: Craig L. Ching @ 2008-07-31 19:48 UTC (permalink / raw)
  To: Linus Torvalds, sverre; +Cc: Git Mailinglist

 

> -----Original Message-----
> From: git-owner@vger.kernel.org 
> [mailto:git-owner@vger.kernel.org] On Behalf Of Linus Torvalds
> Sent: Thursday, July 31, 2008 2:18 PM

> single database for git, but wants to force a full clone for 
> each. Not to mention that in git, you'd normally not do 11 
> clones to begin with, you'd just do 11 branches in one repo.
> 

I have a question about this.  I asked this a while back and didn't
really get any satisfactory answers except to use git-new-workdir, which
makes git behave a lot like monotone.  In our workflow, we do create
branches for nearly everything, but we do find that we have a need to
keep the build artifacts of those branches isolated from each other
because rebuilding is expensive.  IOW, we have this sort of workflow:

git checkout A
[work on A, build, test, do some commits]
git checkout B
[work on B, build, test, do some commits]
git checkout A
[work on A, re-build, test, do some commits]

We find ourselves constantly having to shift gears and work on other
things in the middle of whatever it is we're currently working on.  For
instance, in the scenario above, A might be a branch that contains a
feature going into our next release.  B might be a bugfix and takes
priority over A, so you have to leave A as-is and start work on B.  When
I come back to work on A, I have to rebuild A to continue working, and
that's just too expensive for us.  So we use the monotone-like
new-workdir which allows us to save those build artifacts.
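
For comparison, a one-directory-per-branch layout can also be sketched
with plain "git clone -s" rather than new-workdir. The repository and
directory names below are hypothetical; branches A and B are the ones
from the workflow above:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the main local repository with branches A and B.
git init -q main-repo
git -C main-repo -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"
git -C main-repo branch A
git -C main-repo branch B

# One shared clone per branch; build artifacts stay in their own directory.
for b in A B; do
    git clone -q -s -b "$b" main-repo "work-$b"
done

git -C work-A rev-parse --abbrev-ref HEAD
git -C work-B rev-parse --abbrev-ref HEAD
```

Unlike new-workdir, each directory here is its own repository, so
commits have to be pushed or fetched between them.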

So, that said, I ask again, am I missing something?  Is there a better
way to do this?  How do the kernel developers do this, surely they're
switching branches back and forth having to build in-between?

> 			Linus

Cheers,
Craig


* RE: Git vs Monotone
  2008-07-31 19:28   ` Craig L. Ching
@ 2008-07-31 19:52     ` Linus Torvalds
  2008-07-31 20:24       ` Junio C Hamano
                         ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 19:52 UTC (permalink / raw)
  To: Craig L. Ching; +Cc: sverre, Git Mailinglist



On Thu, 31 Jul 2008, Craig L. Ching wrote:
> 
> It's possible he's doing that, but it's also possible he just isn't that
> familiar with git.

Possible. But it really sounded like he didn't even try. Because quite 
frankly, if he had even bothered to _try_, he wouldn't have gotten the 
numbers he got.

The fact is, even without "-s", a local clone will do hardlinks for the 
database. And since the original pack-file is marked as a 'keep' file, 
that original pack-file won't even be broken apart.

So literally, if he had just bothered to even _try_ the git setup, he'd 
have noticed that git actually uses less disk than monotone would do. But 
it sounds like he didn't even try it.

So completely ignoring the fact that you could do a single database with 
git, and completely ignoring the fact that with git you'd probably use 
branches for at least some of those 11 repos anyway, he'd _still_ have had 
less disk space used by git unless he would do something intentionally odd 
(like clone all the repositories over the network separately).

			Linus


* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 19:48   ` Monotone workflow compared to Git workflow ( was RE: Git vs Monotone) Craig L. Ching
@ 2008-07-31 20:09     ` Linus Torvalds
  2008-07-31 20:18       ` Shawn O. Pearce
                         ` (3 more replies)
  2008-07-31 20:57     ` Sean Estabrooks
  1 sibling, 4 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 20:09 UTC (permalink / raw)
  To: Craig L. Ching; +Cc: sverre, Git Mailinglist



On Thu, 31 Jul 2008, Craig L. Ching wrote:
> 
> We find ourselves constantly having to shift gears and work on other
> things in the middle of whatever it is we're currently working on.  For
> instance, in the scenario above, A might be branch that contains a
> feature going into our next release.  B might be a bugfix and takes
> priority over A, so you have to leave A as-is and start work on B.  When
> I come back to work on A, I have to rebuild A to continue working, and
> that's just too expensive for us.  So we use the monotone-like
> new-workdir which allows us to save those build artifacts.
> 
> So, that said, I ask again, am I missing something?  Is there a better
> way to do this?  How do the kernel developers do this, surely they're
> switching branches back and forth having to build in-between?

Sure, if you want to keep the build tree around, you would probably not 
use branches. 

But yes, then you'd likely do "git clone -s" with some single "common 
point" or use "git-new-workdir". And even if you don't use "-s", you should 
_still_ effectively share at least all the old history (which tends to be 
the bulk), because even a default "git clone" will just hardlink the 
pack-files.

So literally, if you do

	git clone <central-repo-over-network> <local>

and then do

	git clone <local> <otherlocal>
	git clone <local> <thirdlocal>

then all of those will all share the initial pack-file on-disk. Try it.

(You may then want to edit the "origin" branch info in the .git/config to 
point to the network one etc, of course).
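
Instead of editing .git/config by hand, the origin URL can be changed
with "git config". A small sketch, with a throwaway local repository
standing in for the central one (the directory names and the stand-in
are assumptions for illustration):

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the central repository normally reached over the network.
git init -q central
git -C central -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

git clone -q central local
git clone -q local otherlocal

# otherlocal's origin currently points at "local"; repoint it at the
# central repository (a real setup would put the network URL here).
git -C otherlocal config remote.origin.url "$work/central"
git -C otherlocal config remote.origin.url
```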

Oh, and to make sure I'm not lying I actually did test this, but I also 
noticed that "git clone" no longer marks the initial pack-file with 
"keep", so it looks like "git gc" will then break the link. That's sad. I 
wonder when that changed, or maybe I'm just confused and it never did.

Junio?

		Linus


* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:09     ` Linus Torvalds
@ 2008-07-31 20:18       ` Shawn O. Pearce
  2008-07-31 20:37       ` Craig L. Ching
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Shawn O. Pearce @ 2008-07-31 20:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Craig L. Ching, sverre, Git Mailinglist

Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> Oh, and to make sure I'm not lying I actually did test this, but I also 
> noticed that "git clone" no longer marks the initial pack-file with 
> "keep", so it looks like "git gc" will then break the link. That's sad. I 
> wonder when that changed, or maybe I'm just confused and it never did.

It was a bug in git-clone that we were recording the .keep file on
initial clone.  We left the lock file in place after the fetch pack
call was done, but didn't remove it after the refs were updated.

If we want to go back to .keep'ing the original pack created
during clone it probably should be threshold based.  For many
smaller projects with only a 25M pack (or less) there is no point
in .keep'ing that first pack.  For larger projects where the pack
is over a few hundred megabytes, then yea, maybe there is value
in .keep'ing it during clone.
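
Until something like that exists in git-clone itself, a threshold can
be approximated by hand: create a .keep file next to any pack above a
chosen size. This is only a sketch of the idea, not an existing git
feature; the helper name keep_large_packs and the byte threshold are
made up for illustration:

```shell
set -e
# Hypothetical helper: mark packs of at least $2 bytes in the non-bare
# repository $1 as "keep", so later repacks leave them alone.
keep_large_packs() {
    for pack in "$1"/.git/objects/pack/*.pack; do
        [ -f "$pack" ] || continue
        size=$(($(wc -c < "$pack")))
        if [ "$size" -ge "$2" ]; then
            : > "${pack%.pack}.keep"
        fi
    done
}

work=$(mktemp -d)
cd "$work"
git init -q demo
git -C demo -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "first"
git -C demo repack -a -d -q

# Threshold of 0 bytes so that the tiny demo pack qualifies; a real
# threshold would be far larger (the mail above mentions 25M and up).
keep_large_packs demo 0
kept=$(ls demo/.git/objects/pack/*.pack)

git -C demo -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "second"
git -C demo repack -a -d -q
ls demo/.git/objects/pack
```

After the second repack the kept pack is still present, and only the
newer objects go into a new pack.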

-- 
Shawn.


* Re: Git vs Monotone
  2008-07-31 19:52     ` Linus Torvalds
@ 2008-07-31 20:24       ` Junio C Hamano
  2008-07-31 20:30         ` Linus Torvalds
  2008-08-23 19:23         ` Felipe Contreras
  2008-07-31 20:42       ` Blum, Robert
  2008-08-01  9:57       ` David Kastrup
  2 siblings, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2008-07-31 20:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Craig L. Ching, sverre, Git Mailinglist

Linus Torvalds <torvalds@linux-foundation.org> writes:

> ... And since the original pack-file is marked as a 'keep' file,
> that original pack-file won't even be broken apart.

Oops, isn't that something we fixed recently as a "bug"?

> So completely ignoring the fact that you could do a single database with 
> git, and completely ignoring the fact that with git you'd probably use 
> branches for at least some of those 11 repos anyway, he'd _still_ have had 
> less disk space used by git unless he would do something intentionally odd 
> (like clone all the repositories over the network separately).

Well, people are not perfect and they are free to express their opinions
based on faulty understanding of reality on their blogs.  The right things
to do are (1) ignore them on the list and not waste many people's time,
and/or (2) educate them, but in private or in a circle where many other
similar ignorants benefit from such education.  That is not here but
perhaps on #monotone channel?


* Re: Git vs Monotone
  2008-07-31 20:24       ` Junio C Hamano
@ 2008-07-31 20:30         ` Linus Torvalds
  2008-08-23 19:23         ` Felipe Contreras
  1 sibling, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 20:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Craig L. Ching, sverre, Git Mailinglist



On Thu, 31 Jul 2008, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > ... And since the original pack-file is marked as a 'keep' file,
> > that original pack-file won't even be broken apart.
> 
> Oops, isn't that something we fixed recently as a "bug"?

Ehh, apparently. I had thought it was a feature (not that it was me who 
implemented it), and didn't realize that others thought it was a bug. 
Oops.

The default *.keep file was _wonderful_ for cloning a large tree onto a 
small machine. It did exactly the right thing (never mind any shared 
repositories - it just made repacking much more reasonable).

So maybe it was unintentional (a "bug"), but I had always seen it as being 
something good.

			Linus


* Re: Git vs Monotone
  2008-07-31 19:19   ` Sverre Rabbelier
@ 2008-07-31 20:32     ` Jeff King
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2008-07-31 20:32 UTC (permalink / raw)
  To: sverre
  Cc: Craig L. Ching, Petr Baudis, Stephen R. van den Berg,
	Git Mailinglist

On Thu, Jul 31, 2008 at 09:19:41PM +0200, Sverre Rabbelier wrote:

> I repacked with --depth=100 and --window=100, I tried out 500 at first
> but it was just insanely slow (on a VM with one 2.4Ghz Core
> available). This resulted in a .git dir of 76MB. With that dir I did
> the following:

I tried 200/200 and got a 74M packfile. So I think we're getting into
diminishing returns.

> $ du -sh .
> 742M    .
> 
> So... monotone, eat your heart out ;).

:)

-Peff


* RE: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:09     ` Linus Torvalds
  2008-07-31 20:18       ` Shawn O. Pearce
@ 2008-07-31 20:37       ` Craig L. Ching
  2008-07-31 20:54       ` Björn Steinbrink
  2008-07-31 21:40       ` Linus Torvalds
  3 siblings, 0 replies; 33+ messages in thread
From: Craig L. Ching @ 2008-07-31 20:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailinglist


> From: Linus Torvalds [mailto:torvalds@linux-foundation.org] 
> Sent: Thursday, July 31, 2008 3:09 PM
> 
> Sure, if you want to keep the build tree around, you would 
> probably not use branches. 
> 

I think we'd still use branches, but we just need to isolate their
workdirs from each other.

> But yes, then you'd likely do "git clone -s" with some single 
> "common point" or use "git-new-workdir". And even if you don't 
> use "-s", you should _still_ effectively share at least all 
> the old history (which tends to be the bulk), because even a 
> default "git clone" will just hardlink the pack-files.
> 
> So literally, if you do
> 
> 	git clone <central-repo-over-network> <local>
> 
> and then do
> 
> 	git clone <local> <otherlocal>
> 	git clone <local> <thirdlocal>
> 
> then all of those will all share the initial pack-file 
> on-disk. Try it.
> 
> (You may then want to edit the "origin" branch info in the 
> .git/config to point to the network one etc, of course).
> 

Yes, thank you for the explanation.  Having used git a fair amount now,
that makes perfect sense to me, in fact, it sounds a lot like
git-new-workdir, but I think I'll change our use of git-new-workdir to
something more "core" git.  It seems to me that maybe this is something
that could be documented more prominently?  Or maybe it is and I've just
missed it.  This would have saved me a lot of time originally to be
sure.

> Oh, and to make sure I'm not lying I actually did test this, 
> but I also noticed that "git clone" no longer marks the 
> initial pack-file with "keep", so it looks like "git gc" will 
> then break the link. That's sad. I wonder when that changed, 
> or maybe I'm just confused and it never did.
> 

What's the consequence of that then?  Because of that, would you say
"don't gc your master local repo until all derived repos are merged?"
If that link is broken is it just a loss of space? Or is it more?

> 		Linus
> 

Thanks again!

Cheers,
Craig


* RE: Git vs Monotone
  2008-07-31 19:52     ` Linus Torvalds
  2008-07-31 20:24       ` Junio C Hamano
@ 2008-07-31 20:42       ` Blum, Robert
  2008-08-10 22:15         ` Robin Rosenberg
  2008-08-01  9:57       ` David Kastrup
  2 siblings, 1 reply; 33+ messages in thread
From: Blum, Robert @ 2008-07-31 20:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git@vger.kernel.org


>The fact is, even without "-s", a local clone will do hardlinks for the
>database. And since the original pack-file is marked as a 'keep' file,
>that original pack-file won't even be broken apart.

Then again, Pidgin is, among other things, a Windows project. I.e. hardlinks are not exactly trivial. There's a good chance nobody jumped through the hoops of junction points for git on win32... (Somebody correct me if I'm wrong)

>So literally, if he had just bothered to even _try_ the git setup, he'd
>have noticed that git actually uses less disk than monotone would do. But
>it sounds like he didn't even try it.

Well, he *did* try it, for *one* repository. He just didn't know that there's a better way than having 11 clones. And I lay the blame for that squarely at the git documentation ;)

Yes, I know, why don't I make it better...?

Because I'm fairly new to git and would feel like an idiot 'documenting' something that I feel I've only scratched the surface of. I do expect to write a few uninformed rants on my blog, though. And maybe at some point, I can contribute to actual docs :)

Either way, it's another interesting data point for all of us still comparing DVCSs. I just wish he had comments on his blog so somebody could inform him that he's mistaken...


 - Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:09     ` Linus Torvalds
  2008-07-31 20:18       ` Shawn O. Pearce
  2008-07-31 20:37       ` Craig L. Ching
@ 2008-07-31 20:54       ` Björn Steinbrink
  2008-07-31 21:10         ` Avery Pennarun
                           ` (2 more replies)
  2008-07-31 21:40       ` Linus Torvalds
  3 siblings, 3 replies; 33+ messages in thread
From: Björn Steinbrink @ 2008-07-31 20:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Craig L. Ching, sverre, Git Mailinglist

On 2008.07.31 13:09:09 -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 31 Jul 2008, Craig L. Ching wrote:
> > 
> > We find ourselves constantly having to shift gears and work on other
> > things in the middle of whatever it is we're currently working on.  For
> > instance, in the scenario above, A might be a branch that contains a
> > feature going into our next release.  B might be a bugfix and takes
> > priority over A, so you have to leave A as-is and start work on B.  When
> > I come back to work on A, I have to rebuild A to continue working, and
> > that's just too expensive for us.  So we use the monotone-like
> > new-workdir which allows us to save those build artifacts.
> > 
> > So, that said, I ask again, am I missing something?  Is there a better
> > way to do this?  How do the kernel developers do this, surely they're
> > switching branches back and forth having to build in-between?
> 
> Sure, if you want to keep the build tree around, you would probably not 
> use branches. 
> 
> But yes, then you'd likely do "git clone -s" with some single "common 
> point". And even if you don't use "-s", you should _still_ effectively 
> share at least all the old history (which tends to be the bulk), since 
> even a default "git clone" will just hardlink the pack-files.
> 
> So literally, if you do
> 
> 	git clone <central-repo-over-network> <local>

Hum, I guess I'm just missing something and prepare to get flamed, but
wouldn't you want that one to be bare? Otherwise, the other clones won't
see all of the original repo's branches, right?

Maybe even better:

mkdir local-mirror
cd local-mirror
git --bare init
git remote add -f --mirror origin <central-repo-over-network>

A cronjob (or whatever) could keep the local mirror up-to-date and the
other repos can fetch from there. Pushing would need to go to a
different remote then though.. Humm... Maybe not worth the trouble for a
bit of additional object sharing.
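
A minimal sketch of the local-mirror idea, using "git clone --mirror" as a shorthand rather than the exact commands quoted above. The directory names "central" and "local-mirror.git" are hypothetical, and a throwaway local repository stands in for the central repo that would normally be reached over the network.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the central repository (hypothetical; normally a network URL).
git init -q central
git -C central -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Bare local mirror: every ref of the central repo is copied one-to-one.
git clone -q --mirror "$work/central" local-mirror.git

# New work appears upstream...
git -C central -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second commit"

# ...and this is the command a cronjob could run to refresh the mirror.
git -C local-mirror.git fetch -q --prune origin

# The mirror now points at the same commit as the central repo.
git -C local-mirror.git rev-parse HEAD
```

The other local repos would then fetch from local-mirror.git; as noted above, pushes would still have to go to a different remote.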

Björn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 19:48   ` Monotone workflow compared to Git workflow ( was RE: Git vs Monotone) Craig L. Ching
  2008-07-31 20:09     ` Linus Torvalds
@ 2008-07-31 20:57     ` Sean Estabrooks
  2008-07-31 21:22       ` Theodore Tso
  1 sibling, 1 reply; 33+ messages in thread
From: Sean Estabrooks @ 2008-07-31 20:57 UTC (permalink / raw)
  To: Craig L. Ching; +Cc: Linus Torvalds, sverre, Git Mailinglist

On Thu, 31 Jul 2008 14:48:21 -0500
"Craig L. Ching" <cching@mqsoftware.com> wrote:

> I have a question about this.  I asked this awhile back and didn't
> really get any satisfactory answers except to use git-new-workdir, which
> makes git behave a lot like monotone.  In our workflow, we do create
> branches for nearly everything, but we do find that we have a need to
> keep the build artifacts of those branches isolated from each other
> because rebuilding is expensive.  IOW, we have this sort of workflow:

Is there a problem using git-new-workdir?  It sounds like it does
exactly what you want.

> We find ourselves constantly having to shift gears and work on other
> things in the middle of whatever it is we're currently working on.  For
> instance, in the scenario above, A might be a branch that contains a
> feature going into our next release.  B might be a bugfix and takes
> priority over A, so you have to leave A as-is and start work on B.  When
> I come back to work on A, I have to rebuild A to continue working, and
> that's just too expensive for us.  So we use the monotone-like
> new-workdir which allows us to save those build artifacts.
> 
> So, that said, I ask again, am I missing something?  Is there a better
> way to do this?  How do the kernel developers do this, surely they're
> switching branches back and forth having to build in-between?

A decent build system will only compile the source files that actually
changed when switching branches.  Couple that with a compiler cache
(such as ccache) and switching between branches in the kernel or git
project usually isn't prohibitively time consuming.

Sean

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:54       ` Björn Steinbrink
@ 2008-07-31 21:10         ` Avery Pennarun
  2008-07-31 21:13         ` Linus Torvalds
  2008-07-31 21:43         ` Martin Langhoff
  2 siblings, 0 replies; 33+ messages in thread
From: Avery Pennarun @ 2008-07-31 21:10 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Linus Torvalds, Craig L. Ching, sverre, Git Mailinglist

On 7/31/08, Björn Steinbrink <B.Steinbrink@gmx.de> wrote:
>  Maybe even better:
>
>  mkdir local-mirror
>  cd local-mirror
>  git --bare init
>  git remote add -f --mirror origin <central-repo-over-network>
>
>  A cronjob (or whatever) could keep the local mirror up-to-date and the
>  other repos can fetch from there. Pushing would need to go to a
>  different remote then though.. Humm... Maybe not worth the trouble for a
>  bit of additional object sharing.

What would be *really* great is if we could find a way for multiple
local clones to share the same objects, refs, and configuration - ie.
without pushing and pulling between them at all.  Then they could
*all* point at the remote upstream repo through "origin", and
pushing/pulling with that repo would update the objects and refs for
all the local repos.

I'm not sure of the best way to do this, though.  In particular, it
seems like having multiple work trees checked out on the same ref
could be problematic.

Is that just what git-new-workdir is for?  (It seems to be
undocumented so it's hard to tell.)  And what about this
.gitlink/.gitfile stuff I've heard about?  Could I use that to have
multiple work trees share the same .git folder?

Thanks,

Avery

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:54       ` Björn Steinbrink
  2008-07-31 21:10         ` Avery Pennarun
@ 2008-07-31 21:13         ` Linus Torvalds
  2008-07-31 21:43         ` Martin Langhoff
  2 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 21:13 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: Craig L. Ching, sverre, Git Mailinglist



On Thu, 31 Jul 2008, Björn Steinbrink wrote:
> > 
> > So literally, if you do
> > 
> > 	git clone <central-repo-over-network> <local>
> 
> Hum, I guess I'm just missing something and prepare to get flamed, but
> wouldn't you want that one to be bare? Otherwise, the other clones won't
> see all of the original repo's branches, right?

Making it bare might be a good idea for other reasons too (it makes it 
much more obvious that it's a "local clone" and is somehow special). But 
it's really a matter of taste - and the project - exactly how you do it. 

For example, the kernel only has a single master branch in the top repo, 
so there it really doesn't matter, and yes, I'm more kernel-oriented than 
anything else, of course.

But I don't think it's exactly wrong to have the initial clone be a real 
repository that you do work in. Quite often the history really is the 
_bulk_ of the database by far (at least with projects that have big enough 
repositories for this to even matter in the first place!), and as long as 
you just download that once and share that thing, you're already ahead of 
the game and the rest is really just details.

> Maybe even better:
> 
> mkdir local-mirror
> cd local-mirror
> git --bare init
> git remote add -f --mirror origin <central-repo-over-network>
> 
> A cronjob (or whatever) could keep the local mirror up-to-date and the
> other repos can fetch from there.

Heh. You can certainly do it many ways. I suspect the _easiest_ model is 
actually to do one single local repo that is special (and perhaps bare), 
and then you can clone all the other ones with

	git clone --reference <local-reference> <remote> <new-local>

because that will automatically set up the new local repo to have the 
local reference as an alternates thing, and will avoid downloading 
unnecessary stuff.

So my point about the eleven repos was not that it's the best way to do 
one remote clone and then eleven local ones - my point was that even if 
you do that _stupid_ thing, you'd have seen sharing without even knowing 
what you really did.

If you want to explicitly share, I think the local bare reference and 
using "git clone --reference" is the best way. It sets up a special 
link-file (it's just a text-file that git knows about, so it should work 
fine under Windows too - no need for filesystem support) in 
.git/objects/info/alternates.

IOW, git-clone --reference works like "git clone -s", but does so with one 
special local database, while allowing you to clone from anywhere. Very 
convenient.
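
The "--reference" setup described above can be sketched roughly as follows. The names "central", "reference.git" and "work-a" are hypothetical, and a throwaway local repository stands in for the remote; only the alternates file location is taken from the text.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the central repository (hypothetical; normally a network URL).
git init -q central
git -C central -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# The one special local reference repository, kept bare.
git clone -q --bare "$work/central" reference.git

# A working clone that talks to "central" but borrows objects locally.
git clone -q --reference "$work/reference.git" "$work/central" work-a

# The link is a plain text file naming the reference object directory.
cat work-a/.git/objects/info/alternates
```

Because work-a depends on that file, deleting or pruning reference.git afterwards can leave work-a with missing objects, which is the danger discussed further down.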

And no, I don't think we document all these "tricks" very well. Partly 
because people are _already_ complaining about how git can do so many 
things ;) But partly because if you don't know what you're doing, the 
"tricks" are often things you really need to understand, and can be a bit 
dangerous otherwise.

For example, the "git clone -s" (or --reference) thing is *very* useful, 
but one result of other repositories then sharing a database with the 
reference one is that suddenly the reference repo is very special. You 
must not remove it (obviously!), but you also must not rebase it and prune 
it etc.

So all the normal git workflows are at least designed to be _safe_ even in 
the absence of people knowing what they are doing. The duplication may 
be using harddisk space, but

 - quite often the checkout is actually an even bigger issue, and the git 
   repo is small enough that lots of people don't really worry.

 - duplicating the repo also means that you cannot _possibly_ screw up 
   other people/repos and does give you a kind of backup (even if 
   same-disk backups are obviously of dubious use: they shouldn't be your 
   _primary_ backup, but having multiple copies on a single disk still 
   protects against a _lot_ of problems)

so... It's a trade-off.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:57     ` Sean Estabrooks
@ 2008-07-31 21:22       ` Theodore Tso
  0 siblings, 0 replies; 33+ messages in thread
From: Theodore Tso @ 2008-07-31 21:22 UTC (permalink / raw)
  To: Sean Estabrooks; +Cc: Craig L. Ching, Linus Torvalds, sverre, Git Mailinglist

On Thu, Jul 31, 2008 at 04:57:24PM -0400, Sean Estabrooks wrote:
> > We find ourselves constantly having to shift gears and work on other
> > things in the middle of whatever it is we're currently working on.  For
> > instance, in the scenario above, A might be a branch that contains a
> > feature going into our next release.  B might be a bugfix and takes
> > priority over A, so you have to leave A as-is and start work on B.  When
> > I come back to work on A, I have to rebuild A to continue working, and
> > that's just too expensive for us.  So we use the monotone-like
> > new-workdir which allows us to save those build artifacts.
>
> A decent build system will only compile the source files that actually
> changed when switching branches.  Couple that with a compiler cache
> (such as ccache) and switching between branches in the kernel or git
> project usually isn't prohibitively time consuming.

That being said, if the bugfix is on a "maint" branch, and one of the
things that has changed is a header file that forces most of the
project to be recompiled, a separate work directory may be more
convenient.  Of course, a separate work directory (whether created
using "git clone -s" or "git-new-workdir") means more disk space and it
means greater use of the page cache or a slowdown while the different
sets of sources get paged in and out.  Of course, you could hack
git-new-workdir to use cp -rl to initially copy the working directory
using hard links, and then when the new branch is checked out, if most
of the files haven't changed, the files in the working directory could
be shared too.  A lot depends on how much you want to squeeze the last
bit of hard drive and speed optimization, and how big your project is.

       	    	      	    		      - Ted

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:09     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2008-07-31 20:54       ` Björn Steinbrink
@ 2008-07-31 21:40       ` Linus Torvalds
  2008-08-01  2:50         ` Dmitry Torokhov
  3 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-07-31 21:40 UTC (permalink / raw)
  To: Craig L. Ching; +Cc: sverre, Git Mailinglist



On Thu, 31 Jul 2008, Linus Torvalds wrote:
> 
> Sure, if you want to keep the build tree around, you would probably not 
> use branches. 

Side note: it's often faster to recompile, if your project has a good 
build system.

For example, for the kernel, I can literally rebuild my whole kernel 
(which is just what I use on _that_ machine) in about 16 seconds. This is 
_not_ using ccache or anything else - it's rebuilding the whole tree with 
-j16.

It turns out that using multiple build trees would actually slow things 
down, because then the source code wouldn't fit in memory any more. If I 
have to actually read the source code from the disk, my nice 16-second 
compile goes up to a minute or more.

Now, the thing you should take away from this is:

 - kernel people have cool toys, and CPUs that are faster than what you 
   have. Nyaah, nyaah.

 - disk is slow. REALLY slow. If you can share most of a single source 
   tree and thus keep it in memory, you're ahead.

 - even large projects can have a fast build cycle if your build chain 
   doesn't suck. The kernel is larger than most, but a _lot_ of build 
   systems don't parallelize or use horribly inefficient tools, so they 
   take much longer to build. 

The last part is the thing that people often stumble on. For example, I 
can literally compile the kernel a hell of a lot faster than I can do 
"make doc" on the git tree! Even just trying a "make -j16" when building 
the git documentation is really really really painful. I suspect I'd need 
a ton more memory for that horror.

So if your workflow involves xml (I think the doc build for git is all 
xsltproc - along with asciidoc written in python or something), you're 
screwed. But in the kernel we've actually cared pretty deeply about build 
times, and as a result it's actually very pleasant to switch branches and 
just rebuild. Even if some core header file has changed, it's _still_ ok 
if you've got enough CPU.

(I just tested - I can do a "make doc" for git in just under a minute from 
a clean tree. Ouch. That really is three times longer than my kernel 
build - as long as I brought the kernel and compiler into memory first ;)

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 20:54       ` Björn Steinbrink
  2008-07-31 21:10         ` Avery Pennarun
  2008-07-31 21:13         ` Linus Torvalds
@ 2008-07-31 21:43         ` Martin Langhoff
  2 siblings, 0 replies; 33+ messages in thread
From: Martin Langhoff @ 2008-07-31 21:43 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Linus Torvalds, Craig L. Ching, sverre, Git Mailinglist

On Fri, Aug 1, 2008 at 8:54 AM, Björn Steinbrink <B.Steinbrink@gmx.de> wrote:
>> So literally, if you do
>>
>>       git clone <central-repo-over-network> <local>
>
> Hum, I guess I'm just missing something and prepare to get flamed, but
> wouldn't you want that one to be bare? Otherwise, the other clones won't
> see all of the original repo's branches, right?

Yes, that's why

   git clone --reference /path/to/fat/checkout/.git/  <central-repo>

is far better. Each "thin" checkout sees the central repo normally,
but they borrow the object store from the referenced local "fat"
checkout.

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-07-31 21:40       ` Linus Torvalds
@ 2008-08-01  2:50         ` Dmitry Torokhov
  2008-08-01  3:02           ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Dmitry Torokhov @ 2008-08-01  2:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Craig L. Ching, sverre, Git Mailinglist

On Thu, Jul 31, 2008 at 02:40:55PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 31 Jul 2008, Linus Torvalds wrote:
> > 
> > Sure, if you want to keep the build tree around, you would probably not 
> > use branches. 
> 
> Side note: it's often faster to recompile, if your project has a good 
> build system.
> 
> For example, for the kernel, I can literally rebuild my whole kernel 
> (which is just what I use on _that_ machine) in about 16 seconds. This is 
> _not_ using ccache or anything else - it's rebuilding the whole tree with 
> -j16.
> 

Is it after make mrproper (wow)? Or is it when your branches are
"recent"? Because for me (and well, I don't have boxes as beefy as
yours) switching between "for-linus" and "next", which are based off a
revision in the vicinity of 2.6.xx-rc1, and "work", which tracks the tip
of your tree, takes time to rebuild.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-08-01  2:50         ` Dmitry Torokhov
@ 2008-08-01  3:02           ` Linus Torvalds
  2008-08-01  3:59             ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2008-08-01  3:02 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: Craig L. Ching, sverre, Git Mailinglist



On Thu, 31 Jul 2008, Dmitry Torokhov wrote:

> > For example, for the kernel, I can literally rebuild my whole kernel 
> > (which is just what I use on _that_ machine) in about 16 seconds. This is 
> > _not_ using ccache or anything else - it's rebuilding the whole tree with 
> > -j16.
> 
> Is it after make mrproper (wow)?

Yeah. It's after doing

	git clean -dqfx
	make oldconfig

where I tend to use "git clean -dqfx" instead of "make mrproper" these 
days. 
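
As a rough illustration of what those flags do ("-d" also removes untracked directories, "-q" is quiet, "-f" forces the removal, "-x" also removes ignored files), here is a sketch in a scratch repository. The file names are made up for the example and have nothing to do with the kernel tree.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# One tracked source file, plus an ignore rule for object files.
echo '*.o' > .gitignore
echo 'int main(void) { return 0; }' > main.c
git add .gitignore main.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Simulate build output: an ignored object file and an untracked directory.
touch main.o
mkdir -p build
touch build/output.bin

# Remove everything that is not tracked, including ignored files.
git clean -dqfx

# Only .git, .gitignore and main.c should be left.
ls -A
```

Since "-x" throws away ignored files too, any local configuration kept in an ignored file goes with it, so this is only a substitute for a "make clean" style target when nothing precious lives in the tree.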

Note that my "oldconfig" really only does the things I need, so this is 
_not_ an "allmodconfig" or anything like that. That would take much longer. 
It only has the drivers I use, and the stuff I actually need (it's not an 
embedded kernel in any way, but it's definitely a pared-down config, exactly 
because I like being able to rebuild my kernels without wasting time on 
thousands of drivers that I can't use anyway).

Other people can do the "does it compile?" testing. Not worth my time, I 
feel ;)

> Because for me (and well, I don't have boxes as beefy as yours) 
> switching between "for-linus" and "next", which are based off a revision 
> in the vicinity of 2.6.xx-rc1, and "work", which tracks the tip of your 
> tree, takes time to rebuild.

Well, the difference really is the beefy box. And the fact that I hate 
modules, and I hate building stuff that I don't actually need. 

I literally turn off CONFIG_MODULES entirely. 

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Monotone workflow compared to Git workflow ( was RE: Git vs Monotone)
  2008-08-01  3:02           ` Linus Torvalds
@ 2008-08-01  3:59             ` Linus Torvalds
  0 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2008-08-01  3:59 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: Craig L. Ching, sverre, Git Mailinglist



On Thu, 31 Jul 2008, Linus Torvalds wrote:
> 
> Well, the difference really is the beefy box.

Btw, the fact that I have a beefy box really wasn't the point. The fact 
that I can build the kernel three times quicker than I can build the git 
documentation _was_ kind of the point. A lot of projects have horrible 
build rules - makefiles that don't parallelize well or just tools that 
suck dead baby donkeys through a straw.

I often get the feeling that I can compile the kernel faster than I can 
run "./configure" on most of the other projects I ever compile.

So I'd heartily encourage projects to try to make their build lean and 
mean. It actually then allows you to be more efficient, and gives the 
option of using more efficient development models, where "use multiple 
branches in the same tree" is just one example of that.

Of course, I have to admit that git itself isn't exactly a stellar 
example. I can compile git itself in basically zero time, but those docs 
really take a loooong time.

Just one more reason for me to stay away from documentation.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
                   ` (3 preceding siblings ...)
  2008-07-31 19:24 ` Git vs Monotone Theodore Tso
@ 2008-08-01  7:23 ` Sverre Rabbelier
  2008-08-01 18:00   ` Daniel Barkalow
  4 siblings, 1 reply; 33+ messages in thread
From: Sverre Rabbelier @ 2008-08-01  7:23 UTC (permalink / raw)
  To: Git Mailinglist

On Thu, Jul 31, 2008 at 20:13, Sverre Rabbelier <alturin@gmail.com> wrote:
> I just read this blog post [0] in which one of the Pidgin devs sheds
> his light on their 'tool choice'. In the post he mentions the
> following figures:

> [0] http://theflamingbanker.blogspot.com/2008/07/holy-war-of-tool-choice.html

I have poked him on #pidgin, and he has added the following:

"Note: It's come to my attention that I had missed the ability to
share a git database across multiple working copies. In that scenario,
the total size of the database and 11 working copies is slightly under
750 MB, and thus a space savings in the neighborhood of 150 MB over
monotone. It had been my understanding that I needed a copy of the
database per working copy. I stand corrected. I don't use git on a
daily basis, as the projects I work with currently use CVS, SVN, or
monotone, so I am bound to miss finer details of git here and there.
There are other reasons I prefer to stick with monotone, but I won't
get into them here, as they're not important to the point of this
post."

So I'm happy ;).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-07-31 19:52     ` Linus Torvalds
  2008-07-31 20:24       ` Junio C Hamano
  2008-07-31 20:42       ` Blum, Robert
@ 2008-08-01  9:57       ` David Kastrup
  2 siblings, 0 replies; 33+ messages in thread
From: David Kastrup @ 2008-08-01  9:57 UTC (permalink / raw)
  To: git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 31 Jul 2008, Craig L. Ching wrote:
>> 
>> It's possible he's doing that, but it's also possible he just isn't that
>> familiar with git.
>
> Possible. But it really sounded like he didn't even try. Because quite 
> frankly, if he had even bothered to _try_, he wouldn't have gotten the 
> numbers he got.
>
> The fact is, even without "-s", a local clone will do hardlinks for the 
> database.

That means that git takes up less disk space.  It does not mean that it
looks like it.

If you do a df before and afterwards, you'll notice (but that does not
seem reliable as other changes might happen in the file system).  If you
do "du" into the individual clones, you won't notice it.

It is quite plausible that he might have tried it, but misinterpreted
the results.

It is a similar situation with size estimates when sparse files are
involved: they may take up less space than what it looks like.
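
The du effect is easy to reproduce without git at all. The sketch below uses two hypothetical directories "a" and "b" sharing one hardlinked file as a stand-in for two clones sharing a pack-file.

```shell
set -eu
dir=$(mktemp -d)
cd "$dir"
mkdir a b

# A 1 MiB file standing in for a pack-file, hardlinked into a second "clone".
dd if=/dev/zero of=a/pack bs=1024 count=1024 2>/dev/null
ln a/pack b/pack

# Measured separately, each directory appears to hold the full file.
du -sk a
du -sk b

# Measured in one invocation, du counts the shared file only once,
# so the second directory shows up as nearly empty.
du -sk a b
```

So adding up per-clone du figures overstates the real usage, while a single du over all clones (or df before and after) shows what is actually occupied.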

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-08-01  7:23 ` Sverre Rabbelier
@ 2008-08-01 18:00   ` Daniel Barkalow
  0 siblings, 0 replies; 33+ messages in thread
From: Daniel Barkalow @ 2008-08-01 18:00 UTC (permalink / raw)
  To: sverre; +Cc: Git Mailinglist

On Fri, 1 Aug 2008, Sverre Rabbelier wrote:

> On Thu, Jul 31, 2008 at 20:13, Sverre Rabbelier <alturin@gmail.com> wrote:
> > I just read this blog post [0] in which one of the Pidgin devs sheds
> > his light on their 'tool choice'. In the post he mentions the
> > following figures:
> 
> > [0] http://theflamingbanker.blogspot.com/2008/07/holy-war-of-tool-choice.html
> 
> I have poked him on #pidgin, and he has added the following:
> 
> "Note: It's come to my attention that I had missed the ability to
> share a git database across multiple working copies. In that scenario,
> the total size of the database and 11 working copies is slightly under
> 750 MB, and thus a space savings in the neighborhood of 150 MB over
> monotone. It had been my understanding that I needed a copy of the
> database per working copy. I stand corrected. I don't use git on a
> daily basis, as the projects I work with currently use CVS, SVN, or
> monotone, so I am bound to miss finer details of git here and there.
> There are other reasons I prefer to stick with monotone, but I won't
> get into them here, as they're not important to the point of this
> post."

Did he retry the size calculation? I think someone on the list tried it 
and found that the clone, including the checkout, was (for them) the size 
that he thought was just the database; if you're used to having the clone 
equivalent be effectively --bare by default, it's an easy mistake, 
especially if you don't think it's possible for the entire project history 
to be smaller than a checkout.

Not that it actually matters to the comparison of monotone and SVN that 
was the actual point, but still, git is often more space-efficient than 
SVN even just on the client, even without any sharing between branches, 
just because uncompressed source is (relatively) huge. Which does, in a 
way, contribute to the point that SVN has a vast quantity of per-branch
overhead.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-07-31 20:42       ` Blum, Robert
@ 2008-08-10 22:15         ` Robin Rosenberg
  0 siblings, 0 replies; 33+ messages in thread
From: Robin Rosenberg @ 2008-08-10 22:15 UTC (permalink / raw)
  To: Blum, Robert; +Cc: Linus Torvalds, git@vger.kernel.org

On Thursday 31 July 2008 22:42:08, Blum, Robert wrote:
> 
> >The fact is, even without "-s", a local clone will do hardlinks for the
> >database. And since the original pack-file is marked as a 'keep' file,
> >that original pack-file won't even be broken apart.
> 
> Then again, Pidgin is, among other things, a Windows project. I.e. hardlinks are not exactly trivial. There's a good chance nobody jumped through the hoops of junction points for git on win32... (Somebody correct me if I'm wrong)

Windows does hardlinks for files since NT 3.51 on NTFS. Cygwin supports it too.  Symbolic links are another story. 

-- robin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Git vs Monotone
  2008-07-31 20:24       ` Junio C Hamano
  2008-07-31 20:30         ` Linus Torvalds
@ 2008-08-23 19:23         ` Felipe Contreras
  1 sibling, 0 replies; 33+ messages in thread
From: Felipe Contreras @ 2008-08-23 19:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Craig L. Ching, sverre, Git Mailinglist

On Thu, Jul 31, 2008 at 11:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> ... And since the original pack-file is marked as a 'keep' file,
>> that original pack-file won't even be broken apart.
>
> Oops, isn't that something we fixed recently as a "bug"?
>
>> So completely ignoring the fact that you could do a single database with
>> git, and completely ignoring the fact that with git you'd probably use
>> branches for at least some of those 11 repos anyway, he'd _still_ have had
>> less disk space used by git unless he would do something intentionally odd
>> (like clone all the repositories over the network separately).
>
> Well, people are not perfect and they are free to express their opinions
> based on faulty understanding of reality on their blogs.  The right things
> to do are (1) ignore them on the list and not waste many people's time,
> and/or (2) educate them, but in private or in a circle where many other
> similar ignorants benefit from such education.  That is not here but
> perhaps on #monotone channel?

Hm, joined late to the discussion.

I had a lengthy discussion on pidgin's mailing list regarding my
analysis of monotone [1]. It didn't go very well. I don't think they
want to be educated about git.

It turns out they evaluated git as an option in the 1.0 days and they
disregarded it mainly because of the size of the repo; they didn't run
'git gc'. I fail to understand why they didn't drop in #git or ask
on the mailing list. That should tell you enough about their informed
decisions.

Anyway, that blog post was probably a way to justify their choice
after the discussion. With the added note now there's nothing that
makes git a bad choice for them, but surely they will find another
equally flawed reason.

Best regards.

[1] http://pidgin.im/pipermail/devel/2008-July/006308.html

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2008-08-23 19:24 UTC | newest]

Thread overview: 33+ messages
2008-07-31 18:13 Git vs Monotone Sverre Rabbelier
2008-07-31 18:33 ` Stephen R. van den Berg
2008-07-31 18:52   ` Petr Baudis
2008-07-31 19:02 ` Jeff King
2008-07-31 19:11   ` Craig L. Ching
2008-07-31 19:19   ` Sverre Rabbelier
2008-07-31 20:32     ` Jeff King
2008-07-31 19:17 ` Linus Torvalds
2008-07-31 19:28   ` Craig L. Ching
2008-07-31 19:52     ` Linus Torvalds
2008-07-31 20:24       ` Junio C Hamano
2008-07-31 20:30         ` Linus Torvalds
2008-08-23 19:23         ` Felipe Contreras
2008-07-31 20:42       ` Blum, Robert
2008-08-10 22:15         ` Robin Rosenberg
2008-08-01  9:57       ` David Kastrup
2008-07-31 19:48   ` Monotone workflow compared to Git workflow ( was RE: Git vs Monotone) Craig L. Ching
2008-07-31 20:09     ` Linus Torvalds
2008-07-31 20:18       ` Shawn O. Pearce
2008-07-31 20:37       ` Craig L. Ching
2008-07-31 20:54       ` Björn Steinbrink
2008-07-31 21:10         ` Avery Pennarun
2008-07-31 21:13         ` Linus Torvalds
2008-07-31 21:43         ` Martin Langhoff
2008-07-31 21:40       ` Linus Torvalds
2008-08-01  2:50         ` Dmitry Torokhov
2008-08-01  3:02           ` Linus Torvalds
2008-08-01  3:59             ` Linus Torvalds
2008-07-31 20:57     ` Sean Estabrooks
2008-07-31 21:22       ` Theodore Tso
2008-07-31 19:24 ` Git vs Monotone Theodore Tso
2008-08-01  7:23 ` Sverre Rabbelier
2008-08-01 18:00   ` Daniel Barkalow
