Git development

* Re: Git and GCC
From: Nicolas Pitre @ 2007-12-06 19:08 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Linus Torvalds, Jeff King, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com>

On Thu, 6 Dec 2007, Jon Smirl wrote:

> On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> >
> > On Thu, 6 Dec 2007, Jeff King wrote:
> > >
> > > What is really disappointing is that we saved only about 20% of the
> > > time. I didn't sit around watching the stages, but my guess is that we
> > > spent a long time in the single threaded "writing objects" stage with a
> > > thrashing delta cache.
> >
> > I don't think you spent all that much time writing the objects. That part
> > isn't very intensive, it's mostly about the IO.
> >
> > I suspect you may simply be dominated by memory-throughput issues. The
> > delta matching doesn't cache all that well, and using two or more cores
> > isn't going to help all that much if they are largely waiting for memory
> > (and quite possibly also perhaps fighting each other for a shared cache?
> > Is this a Core 2 with the shared L2?)
> 
> When I last looked at the code, the problem was in evenly dividing
> the work. I was using a four core machine and most of the time one
> core would end up with 3-5x the work of the lightest loaded core.
> Setting pack.threads up to 20 fixed the problem. With a high number of
> threads I was able to get a 4hr pack to finish in something like
> 1:15.

But as far as I know you didn't try my latest incarnation which has been
available in Git's master branch for a few months already.


Nicolas

* Re: [PATCH] gc --aggressive: make it really aggressive
From: J.C. Pizarro @ 2007-12-06 19:07 UTC (permalink / raw)
  To: David Kastrup, Johannes Schindelin
  Cc: Pierre Habouzit, Linus Torvalds, Daniel Berlin, David Miller,
	ismail, gcc, git, gitster

On 2007/12/06, David Kastrup <dak@gnu.org> wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > However, I think that --aggressive should be aggressive, and if you
> > decide to run it on a machine which lacks the muscle to be aggressive,
> > well, you should have known better.
>
> That's a rather cheap shot.  "you should have known better" than
> expecting to be able to use a documented command and option because the
> git developers happened to have a nicer machine...
>
> _How_ is one supposed to have known better?
>
> --
> David Kastrup, Kriemhildstr. 15, 44793 Bochum

In GIT, the --aggressive option doesn't make it aggressive.
In GCC, the -Wall option doesn't enable all warnings.

                                                           #
It's a "Tie one to one" with the similar reputations.   #######
                             To have a rest in peace.      #
                                                           #
   J.C.Pizarro                                             #

* Re: git guidance
From: Al Boldi @ 2007-12-07 18:55 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Phillip Susi, Linus Torvalds, Jing Xue, linux-kernel, git
In-Reply-To: <47583E57.9050208@op5.se>

Andreas Ericsson wrote:
> Al Boldi wrote:
> > Phillip Susi wrote:
> >> Al Boldi wrote:
> >>> IOW, git currently only implements the server-side use-case, but fails
> >>> to deliver on the client-side.  By introducing a git-client manager
> >>> that handles the transparency needs of a single user, it should be
> >>> possible to clearly isolate update semantics for both the client and
> >>> the server, each handling their specific use-case.
> >>
> >> Any talk of client or server makes no sense since git does not use a
> >> client/server model.
> >
> > Whether git uses the client/server model or not does not matter; what
> > matters is that there are two distinct use-cases at work here:  one on
> > the server/repository, and the other on the client.
>
> Git is distributed. The repository is everywhere. No server is actually
> needed. Many use one anyway since it can be convenient. It's not, however,
> necessary.

When you read server, don't read it as localized; a server can be 
distributed.  What distinguishes a server from an engine is that it has to 
handle a multi-user use-case.  How that is implemented, locally or remotely 
or distributed, is another issue.

> >> If you wish to use a centralized repository, then
> >> git can be set up to transparently push/pull to/from said repository if
> >> you wish via hooks or cron jobs.
> >
> > Again, this only handles the interface to/from the server/repository,
> > but once you pulled the sources, it leaves you without Version Control
> > on the client.
>
> No, that's CVS, SVN and other centralized scm's. With git you have perfect
> version control on each peer. That's the entire idea behind "fully
> distributed".

As explained before in this thread, replicating the git tree on the client 
still doesn't provide the required transparency.

> > By pulling the sources into a git-client manager mounted on some dir, it
> > should be possible to let the developer work naturally/transparently in
> > a readable/writeable manner, and only require his input when reverting
> > locally or committing to the server/repository.
>
> How is that different from what every SCM, including git, is doing today?
> The user needs to tell the scm when it's time to take a snapshot of the
> current state. Git is distributed though, so committing is usually not the
> same as publishing. Is that lack of a single command to commit and publish
> what's nagging you? If it's not, I completely fail to see what you're
> getting at, unless you've only ever looked at repositories without a
> worktree attached, or you think that git should work like an editor's
> "undo" functionality, which would be quite insane.

You need to re-read the thread.


Thanks!

--
Al


* Re: Git and GCC
From: Jon Smirl @ 2007-12-06 18:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, Nicolas Pitre, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <alpine.LFD.0.9999.0712061030560.13796@woody.linux-foundation.org>

On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 6 Dec 2007, Jeff King wrote:
> >
> > What is really disappointing is that we saved only about 20% of the
> > time. I didn't sit around watching the stages, but my guess is that we
> > spent a long time in the single threaded "writing objects" stage with a
> > thrashing delta cache.
>
> I don't think you spent all that much time writing the objects. That part
> isn't very intensive, it's mostly about the IO.
>
> I suspect you may simply be dominated by memory-throughput issues. The
> delta matching doesn't cache all that well, and using two or more cores
> isn't going to help all that much if they are largely waiting for memory
> (and quite possibly also perhaps fighting each other for a shared cache?
> Is this a Core 2 with the shared L2?)

When I last looked at the code, the problem was in evenly dividing
the work. I was using a four core machine and most of the time one
core would end up with 3-5x the work of the lightest loaded core.
Setting pack.threads up to 20 fixed the problem. With a high number of
threads I was able to get a 4hr pack to finish in something like
1:15.

A scheme where each core could work a minute without communicating to
the other cores would be best. It would also be more efficient if the
cores could avoid having sync points between them.
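
As a rough check of the figures above, the speedup can be worked out with
plain shell arithmetic. This is only a sketch: it assumes "1:15" means one
hour and fifteen minutes, the variable names are illustrative, and the
git config line is shown as a comment and is not run.

```sh
# Elapsed times quoted above, converted to minutes.
before=$((4 * 60))         # "a 4hr pack"
after=$((1 * 60 + 15))     # "1:15", assumed to mean 1h15m

# Integer-only arithmetic: compute the speedup in tenths, then format it.
tenths=$((before * 10 / after))
speedup="$((tenths / 10)).$((tenths % 10))"
echo "speedup: ${speedup}x on a four-core machine"

# The setting described above would be applied with (not run here):
#   git config pack.threads 20
```

Under that reading the gain is about 3.2x, against an ideal of 4x for
four cores.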

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Git and GCC
From: Linus Torvalds @ 2007-12-06 18:45 UTC (permalink / raw)
  To: NightStrike; +Cc: Daniel Berlin, David Miller, ismail, gcc, git
In-Reply-To: <b609cb3b0712061024rc48022bhc3fbfba02061dd94@mail.gmail.com>

On Thu, 6 Dec 2007, NightStrike wrote:
> 
> No disrespect is meant by this reply.  I am just curious (and I am
> probably misunderstanding something)..  Why remove all of the
> documentation entirely?  Wouldn't it be better to just document it
> more thoroughly?

Well, part of it is that I don't think "--aggressive" as it is implemented 
right now is really almost *ever* the right answer. We could change the 
implementation, of course, but generally the right thing to do is to not 
use it (tweaking the "--window" and "--depth" manually for the repacking 
is likely the more natural thing to do).

The other part of the answer is that, when you *do* want to do what that 
"--aggressive" tries to achieve, it's such a special case event that while 
it should probably be documented, I don't think it should necessarily be 
documented where it is now (as part of "git gc"), but as part of a much 
more technical manual for "deep and subtle tricks you can play".
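
A minimal sketch of the manual tweaking of "--window" and "--depth"
mentioned above. The helper name repack_cmd is hypothetical; it only prints
the command line so the values can be reviewed before anything is run. The
defaults of 250 are the values used elsewhere in this thread, not a general
recommendation.

```sh
# repack_cmd [WINDOW] [DEPTH]: print (do not run) a full repack command
# with an explicit delta window and delta chain depth.
repack_cmd() {
    window=${1:-250}
    depth=${2:-250}
    printf 'git repack -a -d --depth=%s --window=%s\n' "$depth" "$window"
}

repack_cmd 250 250
```

Larger values make the repack slower, so a smaller pair such as
"repack_cmd 50 50" may be a more practical starting point on a modest
machine.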

> I thought you did a fine job in this post in explaining its purpose, 
> when to use it, when not to, etc.  Removing the documentation seems 
> counter-intuitive when you've already gone to the trouble of creating 
> good documentation here in this post.

I'm so used to writing emails, and I *like* trying to explain what is 
going on, so I have no problems at all doing that kind of thing. However, 
trying to write a manual or man-page or other technical documentation is 
something rather different.

IOW, I like explaining git within the _context_ of a discussion or a 
particular problem/issue. But documentation should work regardless of 
context (or at least set it up), and that's the part I am not so good at.

In other words, if somebody (hint hint) thinks my explanation was good and 
readable, I'd love for them to try to turn it into real documentation by 
editing it up and creating enough context for it! But I'm not personally 
very likely to do that. I'd just send Junio the patch to remove a 
misleading part of the documentation we have.

		Linus

* [PATCH] Let git-help prefer man-pages installed with this version of git
From: Sergei Organov @ 2007-12-06 18:33 UTC (permalink / raw)
  To: git; +Cc: gitster

Prepend $(prefix)/share/man to the MANPATH environment variable
before invoking 'man' from help.c:show_man_page().

Signed-off-by: Sergei Organov <osv@javad.com>
---
 Makefile |    5 ++++-
 help.c   |   21 +++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 999391e..3030d31 100644
--- a/Makefile
+++ b/Makefile
@@ -154,6 +154,7 @@ STRIP ?= strip
 
 prefix = $(HOME)
 bindir = $(prefix)/bin
+mandir = $(prefix)/share/man
 gitexecdir = $(bindir)
 sharedir = $(prefix)/share
 template_dir = $(sharedir)/git-core/templates
@@ -744,6 +745,7 @@ ETC_GITCONFIG_SQ = $(subst ','\'',$(ETC_GITCONFIG))
 
 DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
 bindir_SQ = $(subst ','\'',$(bindir))
+mandir_SQ = $(subst ','\'',$(mandir))
 gitexecdir_SQ = $(subst ','\'',$(gitexecdir))
 template_dir_SQ = $(subst ','\'',$(template_dir))
 prefix_SQ = $(subst ','\'',$(prefix))
@@ -790,7 +792,8 @@ git$X: git.o $(BUILTIN_OBJS) $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ git.o \
 		$(BUILTIN_OBJS) $(ALL_LDFLAGS) $(LIBS)
 
-help.o: common-cmds.h
+help.o: help.c common-cmds.h GIT-CFLAGS
+	$(QUIET_CC)$(CC) -o $*.o -c $(ALL_CFLAGS) '-DGIT_MAN_PATH="$(mandir_SQ)"' $<
 
 git-merge-subtree$X: git-merge-recursive$X
 	$(QUIET_BUILT_IN)$(RM) $@ && ln git-merge-recursive$X $@
diff --git a/help.c b/help.c
index 37a9c25..9f843c9 100644
--- a/help.c
+++ b/help.c
@@ -8,6 +8,8 @@
 #include "exec_cmd.h"
 #include "common-cmds.h"
 
+static const char *builtin_man_path = GIT_MAN_PATH;
+
 /* most GUI terminals set COLUMNS (although some don't export it) */
 static int term_columns(void)
 {
@@ -239,6 +241,24 @@ void list_common_cmds_help(void)
 	}
 }
 
+static void setup_man_path(void)
+{
+	const char *old_path = getenv("MANPATH");
+	struct strbuf new_path;
+
+	strbuf_init(&new_path, 0);
+
+	strbuf_addstr(&new_path, builtin_man_path);
+	if (old_path) {
+		strbuf_addch(&new_path, ':');
+		strbuf_addstr(&new_path, old_path);
+	}
+
+	setenv("MANPATH", new_path.buf, 1);
+
+	strbuf_release(&new_path);
+}
+
 static void show_man_page(const char *git_cmd)
 {
 	const char *page;
@@ -254,6 +274,7 @@ static void show_man_page(const char *git_cmd)
 		page = p;
 	}
 
+	setup_man_path();
 	execlp("man", "man", page, NULL);
 }
 
-- 
1.5.3.4

* Re: git help error
From: Sergei Organov @ 2007-12-06 18:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Aneesh Kumar, Git Mailing List
In-Reply-To: <7vr6i245b4.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Sergei Organov <osv@javad.com> writes:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>> "Aneesh Kumar" <aneesh.kumar@gmail.com> writes:
>>>
>>>> git help gives me the below error.
>>>>
>>>> [master@git]$ git help add
>>>> No manual entry for git-add
>>>> See 'man 7 undocumented' for help when manual pages are not available.
>>>> [master@git]$
>>>>
>>>> I have the git binaries installed via --prefix
>>>>
>>>> ./configure --prefix=/home/kvaneesh/bin-local/git/
>>>> and to see the man page i have to say
>>>>
>>>> man -M /home/kvaneesh/bin-local/git/share/man/
>>> ...
>>> When you run "man" from the command line, can you say
>>>
>>>      $ man git-add
>>>
>>> and make it work?  If it fails the same way, then what you are missing
>>> is MANPATH environment variable, isn't it?
>>
>> I think what the OP asked for makes sense. git-help should better find
>> corresponding version of manual pages automatically. This way, if one
>> invokes different versions of git-help, he will get corresponding
>> version of help text.
>
> I do not necessarily agree.  Read what Aneesh wrote originally again,
> and read what he _didn't_ write.
>
> Not only he needs to run his "man" with -M (and my point was that it is
> not the only way, by the way), he needs to futz with his $PATH to
> include $HOME/bin-local/git for _his_ installation to work.

My point is that he doesn't need to tweak his $PATH, because he can
simply say:

$ ~/bin-local/git/bin/git help add

Then, I'd expect that the manual page that is installed along with the
version that

$ ~/bin-local/git/bin/git --version

reports is displayed, not any random version of git-add manual page
found using default 'man' rules.

> I think my suggestion to use $MANPATH is in line with what he is already
> doing.  If you install things in non-standard places, you can use
> environments to adjust to what you did, and that's the reason PATH and
> MANPATH environments are supported by your tools.

Yes, but provided you have more than one version of git installed, it's
inconvenient to tweak both PATH and MANPATH to use one or another. It
would be more convenient and consistent if

$ ~/git.old/bin/git help add
$ ~/git.new/bin/git help add

rendered different versions of the git-add manual page, each
corresponding to the right version of git.
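
Until git-help does this itself, the effect can be approximated by hand.
A minimal shell sketch, assuming each installation keeps its manual pages
under <prefix>/share/man (the layout from the original report);
prepend_manpath is a hypothetical helper, not part of git:

```sh
# prepend_manpath DIR: print DIR followed by the current MANPATH, if one
# is set. When MANPATH is unset, DIR is printed on its own.
prepend_manpath() {
    if [ -n "${MANPATH+set}" ]; then
        printf '%s:%s\n' "$1" "$MANPATH"
    else
        printf '%s\n' "$1"
    fi
}

# Intended use (not run here):
#   MANPATH=$(prepend_manpath "$HOME/git.new/share/man") man git-add
```

Because the directory is put first, the page installed with that version
is found before any other git-add page on the search path.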

> Having said that, I do not mind accepting a patch that prepends the
> nonlocal path to MANPATH in help.c::show_man_page().

OK, patch will follow shortly.

-- 
Sergei.

* Re: Git and GCC
From: Linus Torvalds @ 2007-12-06 18:35 UTC (permalink / raw)
  To: Jeff King
  Cc: Nicolas Pitre, Jon Smirl, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <20071206173946.GA10845@sigill.intra.peff.net>

On Thu, 6 Dec 2007, Jeff King wrote:
> 
> What is really disappointing is that we saved only about 20% of the 
> time. I didn't sit around watching the stages, but my guess is that we 
> spent a long time in the single threaded "writing objects" stage with a 
> thrashing delta cache.

I don't think you spent all that much time writing the objects. That part 
isn't very intensive, it's mostly about the IO.

I suspect you may simply be dominated by memory-throughput issues. The 
delta matching doesn't cache all that well, and using two or more cores 
isn't going to help all that much if they are largely waiting for memory 
(and quite possibly also perhaps fighting each other for a shared cache? 
Is this a Core 2 with the shared L2?)

			Linus

* Re: Git and GCC
From: Linus Torvalds @ 2007-12-06 18:29 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: David Miller, ismail, gcc, git
In-Reply-To: <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com>

On Thu, 6 Dec 2007, Daniel Berlin wrote:
>
> I worked on Monotone and other systems that use object stores for a 
> little while :) In particular, I believe GIT's original object store was 
> based on Monotone, IIRC.

Yes and no. 

Monotone does what git does for the blobs. But there is a big difference 
in how git then does it for everything else too, ie trees and history. 
Trees being in that object store in particular are very important, and one 
of the biggest deals for deltas (actually, for two reasons: most of the 
time they don't change AT ALL if some subdirectory gets no changes and you 
don't need any delta, and even when they do change, it's usually going to 
delta very well, since it's usually just a small part that changes).

> > And then it's going to take forever and a day (ie a "do it overnight"
> > thing). But the end result is that everybody downstream from that
> > repository will get much better packs, without having to spend any effort
> > on it themselves.
> 
> If your forever and a day is spent figuring out which deltas to use,
> you can reduce this significantly.

It's almost all about figuring out the delta. Which is why *not* using 
"-f" (or "--aggressive") is such a big deal for normal operation, because 
then you just skip it all.

		Linus

* Re: Git and GCC
From: NightStrike @ 2007-12-06 18:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Berlin, David Miller, ismail, gcc, git
In-Reply-To: <alpine.LFD.0.9999.0712052132450.13796@woody.linux-foundation.org>

On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 6 Dec 2007, Daniel Berlin wrote:
> >
> > Actually, it turns out that git-gc --aggressive does this dumb thing
> > to pack files sometimes regardless of whether you converted from an
> > SVN repo or not.
> I'll send a patch to Junio to just remove the "git gc --aggressive"
> documentation. It can be useful, but it generally is useful only when you
> really understand at a very deep level what it's doing, and that
> documentation doesn't help you do that.

No disrespect is meant by this reply.  I am just curious (and I am
probably misunderstanding something)..  Why remove all of the
documentation entirely?  Wouldn't it be better to just document it
more thoroughly?  I thought you did a fine job in this post in
explaining its purpose, when to use it, when not to, etc.  Removing
the documentation seems counter-intuitive when you've already gone to
the trouble of creating good documentation here in this post.

* Re: git guidance
From: Andreas Ericsson @ 2007-12-06 18:24 UTC (permalink / raw)
  To: Al Boldi; +Cc: Phillip Susi, Linus Torvalds, Jing Xue, linux-kernel, git
In-Reply-To: <200712072035.47359.a1426z@gawab.com>

Al Boldi wrote:
> Phillip Susi wrote:
>> Al Boldi wrote:
>>> IOW, git currently only implements the server-side use-case, but fails
>>> to deliver on the client-side.  By introducing a git-client manager that
>>> handles the transparency needs of a single user, it should be possible
>>> to clearly isolate update semantics for both the client and the server,
>>> each handling their specific use-case.
>> Any talk of client or server makes no sense since git does not use a
>> client/server model.
> 
> Whether git uses the client/server model or not does not matter; what matters 
> is that there are two distinct use-cases at work here:  one on the 
> server/repository, and the other on the client.  
> 

Git is distributed. The repository is everywhere. No server is actually needed.
Many use one anyway since it can be convenient. It's not, however, necessary.

>> If you wish to use a centralized repository, then
>> git can be set up to transparently push/pull to/from said repository if
>> you wish via hooks or cron jobs.
> 
> Again, this only handles the interface to/from the server/repository, but 
> once you pulled the sources, it leaves you without Version Control on the 
> client.
> 

No, that's CVS, SVN and other centralized scm's. With git you have perfect
version control on each peer. That's the entire idea behind "fully
distributed".

> By pulling the sources into a git-client manager mounted on some dir, it 
> should be possible to let the developer work naturally/transparently in a 
> readable/writeable manner, and only require his input when reverting locally 
> or committing to the server/repository.
> 

How is that different from what every SCM, including git, is doing today? The
user needs to tell the scm when it's time to take a snapshot of the current
state. Git is distributed though, so committing is usually not the same as
publishing. Is that lack of a single command to commit and publish what's
nagging you? If it's not, I completely fail to see what you're getting at,
unless you've only ever looked at repositories without a worktree attached,
or you think that git should work like an editor's "undo" functionality,
which would be quite insane.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* git-clean and empty pathspec
From: Nguyen Thai Ngoc Duy @ 2007-12-06 18:21 UTC (permalink / raw)
  To: Git Mailing List

"git clean -n" would not remove directories while "git clean -n -- ''"
(two single quotes) would. Is there anything wrong with it?
-- 
Duy

* Re: Git and GCC
From: Daniel Berlin @ 2007-12-06 18:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Miller, ismail, gcc, git
In-Reply-To: <alpine.LFD.0.9999.0712052132450.13796@woody.linux-foundation.org>

On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 6 Dec 2007, Daniel Berlin wrote:
> >
> > Actually, it turns out that git-gc --aggressive does this dumb thing
> > to pack files sometimes regardless of whether you converted from an
> > SVN repo or not.
>
> Absolutely. git gc --aggressive is mostly dumb. It's really only useful for
> the case of "I know I have a *really* bad pack, and I want to throw away
> all the bad packing decisions I have done".
>
> To explain this, it's worth explaining (you are probably aware of it, but
> let me go through the basics anyway) how git delta-chains work, and how
> they are so different from most other systems.
>
I worked on Monotone and other systems that use object stores for a
little while :)
In particular, I believe GIT's original object store was based on
Monotone, IIRC.

> In other SCM's, a delta-chain is generally fixed. It might be "forwards"
> or "backwards", and it might evolve a bit as you work with the repository,
> but generally it's a chain of changes to a single file represented as some
> kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot
> of other systems do rather similar things.

>
> Git also does delta-chains, but it does them a lot more "loosely". There
> is no fixed entity. Deltas are generated against any random other version
> that git deems to be a good delta candidate (with various fairly
> successful heuristics), and there are absolutely no hard grouping rules.

Sure. SVN actually supports this (surprisingly), it just never happens
to choose delta bases that aren't related by ancestry.  (IE it would
have absolutely no problem with you using random other parts of the
repository as delta bases, and I've played with it before).

I actually advocated we move towards an object store model, as
ancestry can be a crappy way of approximating similarity when you
have a lot of branches.

> So the equivalent of "git gc --aggressive" - but done *properly* - is to
> do (overnight) something like
>
>         git repack -a -d --depth=250 --window=250
>
I gave this a try overnight, and it definitely helps a lot.
Thanks!

> And then it's going to take forever and a day (ie a "do it overnight"
> thing). But the end result is that everybody downstream from that
> repository will get much better packs, without having to spend any effort
> on it themselves.
>

If your forever and a day is spent figuring out which deltas to use,
you can reduce this significantly.
If it is spent writing out the data, it's much harder. :)

* Re: Git and GCC
From: Nicolas Pitre @ 2007-12-06 18:02 UTC (permalink / raw)
  To: Jeff King
  Cc: Jon Smirl, Daniel Berlin, Harvey Harrison, David Miller, ismail,
	gcc, git
In-Reply-To: <20071206173946.GA10845@sigill.intra.peff.net>

On Thu, 6 Dec 2007, Jeff King wrote:

> On Thu, Dec 06, 2007 at 09:18:39AM -0500, Nicolas Pitre wrote:
> 
> > > The downside is that the threading partitions the object space, so the
> > > resulting size is not necessarily as small (but I don't know that
> > > anybody has done testing on large repos to find out how large the
> > > difference is).
> > 
> > Quick guesstimate is in the 1% ballpark.
> 
> Fortunately, we now have numbers. Harvey Harrison reported repacking the
> gcc repo and getting these results:
> 
> > /usr/bin/time git repack -a -d -f --window=250 --depth=250
> >
> > 23266.37user 581.04system 7:41:25elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (419835major+123275804minor)pagefaults 0swaps
> >
> > -r--r--r-- 1 hharrison hharrison  29091872 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.idx
> > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack
> 
> I tried the threaded repack with pack.threads = 3 on a dual-processor
> machine, and got:
> 
>   time git repack -a -d -f --window=250 --depth=250
> 
>   real    309m59.849s
>   user    377m43.948s
>   sys     8m23.319s
> 
>   -r--r--r-- 1 peff peff  28570088 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.idx
>   -r--r--r-- 1 peff peff 339922573 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.pack
> 
> So it is about 5% bigger.

Right.  I should probably revisit that idea of finding deltas across 
partition boundaries to mitigate that loss.  And those partitions could 
be made coarser as well to reduce the number of such partition gaps 
(just increase the value of chunk_size on line 1648 in 
builtin-pack-objects.c).

> What is really disappointing is that we saved
> only about 20% of the time. I didn't sit around watching the stages, but
> my guess is that we spent a long time in the single threaded "writing
> objects" stage with a thrashing delta cache.

Maybe you should run the non-threaded repack on the same machine to have 
a good comparison.  And if you have only 2 CPUs, you will get better 
performance with pack.threads = 2, otherwise there'll be wasteful task 
switching going on.

And of course, if the delta cache is being thrashed, that might be due to 
the way the existing pack was previously packed.  Hence the current pack 
might impact object _access_ when repacking them.  So for a really 
really fair performance comparison, you'd have to preserve the original 
pack and swap it back before each repack attempt.
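
The procedure suggested above could be sketched as follows. This is only an
outline under assumptions: restore_objects and the objects.orig directory
are hypothetical names, the commands are run from the top of a non-bare
repository, and nothing else touches the repository during the test. The
whole object directory is saved rather than just the pack, so that any
loose objects are restored as well.

```sh
# restore_objects: put back the object store saved in ./objects.orig.
# Refuses to delete anything if the saved copy is missing.
restore_objects() {
    [ -d objects.orig ] || { echo "objects.orig not found" >&2; return 1; }
    rm -rf .git/objects
    cp -Rp objects.orig .git/objects
}

# Intended use (not run here):
#   cp -Rp .git/objects objects.orig          # save the original state once
#   git config pack.threads 1
#   time git repack -a -d -f --window=250 --depth=250
#   restore_objects
#   git config pack.threads 2
#   time git repack -a -d -f --window=250 --depth=250
```

Each timed run then starts from the same on-disk layout, so differences in
elapsed time are less likely to come from how the previous run packed the
objects.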

Nicolas

* Re: Git and GCC
From: Jeff King @ 2007-12-06 17:39 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Jon Smirl, Daniel Berlin, Harvey Harrison, David Miller, ismail,
	gcc, git
In-Reply-To: <alpine.LFD.0.99999.0712060915590.555@xanadu.home>

On Thu, Dec 06, 2007 at 09:18:39AM -0500, Nicolas Pitre wrote:

> > The downside is that the threading partitions the object space, so the
> > resulting size is not necessarily as small (but I don't know that
> > anybody has done testing on large repos to find out how large the
> > difference is).
> 
> Quick guesstimate is in the 1% ballpark.

Fortunately, we now have numbers. Harvey Harrison reported repacking the
gcc repo and getting these results:

> /usr/bin/time git repack -a -d -f --window=250 --depth=250
>
> 23266.37user 581.04system 7:41:25elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (419835major+123275804minor)pagefaults 0swaps
>
> -r--r--r-- 1 hharrison hharrison  29091872 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.idx
> -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack

I tried the threaded repack with pack.threads = 3 on a dual-processor
machine, and got:

  time git repack -a -d -f --window=250 --depth=250

  real    309m59.849s
  user    377m43.948s
  sys     8m23.319s

  -r--r--r-- 1 peff peff  28570088 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.idx
  -r--r--r-- 1 peff peff 339922573 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.pack

So it is about 5% bigger. What is really disappointing is that we saved
only about 20% of the time. I didn't sit around watching the stages, but
my guess is that we spent a long time in the single threaded "writing
objects" stage with a thrashing delta cache.
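
The size comparison can be checked from the two .pack sizes listed above.
A small shell sketch; the variable names are illustrative, and the
intermediate product assumes the shell does 64-bit arithmetic:

```sh
single=324094684     # bytes, pack from the first repack (Harvey Harrison)
threaded=339922573   # bytes, pack from the run with pack.threads = 3

# Growth in hundredths of a percent, using integer arithmetic only.
growth=$(( (threaded - single) * 10000 / single ))
printf 'threaded pack is %d.%02d%% bigger\n' $((growth / 100)) $((growth % 100))
```

This prints 4.88%, consistent with the "about 5%" figure.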

-Peff

* Re: git guidance
From: Al Boldi @ 2007-12-07 17:35 UTC (permalink / raw)
  To: Phillip Susi; +Cc: Linus Torvalds, Jing Xue, linux-kernel, git
In-Reply-To: <4755D2E8.5050402@cfl.rr.com>

Phillip Susi wrote:
> Al Boldi wrote:
> > IOW, git currently only implements the server-side use-case, but fails
> > to deliver on the client-side.  By introducing a git-client manager that
> > handles the transparency needs of a single user, it should be possible
> > to clearly isolate update semantics for both the client and the server,
> > each handling their specific use-case.
>
> Any talk of client or server makes no sense since git does not use a
> client/server model.

Whether git uses the client/server model or not does not matter; what matters 
is that there are two distinct use-cases at work here:  one on the 
server/repository, and the other on the client.  

> If you wish to use a centralized repository, then
> git can be set up to transparently push/pull to/from said repository if
> you wish via hooks or cron jobs.

Again, this only handles the interface to/from the server/repository, but 
once you pulled the sources, it leaves you without Version Control on the 
client.

By pulling the sources into a git-client manager mounted on some dir, it 
should be possible to let the developer work naturally/transparently in a 
readable/writeable manner, and only require his input when reverting locally 
or committing to the server/repository.

Thanks!

--
Al

* Re: [BUG/RFC git-gui] password for push/pull in case of git+ssh://repo
From: Thomas Harning @ 2007-12-06 17:11 UTC (permalink / raw)
  To: Ivo Alxneit; +Cc: git
In-Reply-To: <1196951517.3294.24.camel@localhost.localdomain>

Ivo Alxneit wrote:
> when i use git-gui (0.9.0) to push/pull to/from a git+ssh://repo i have
> to supply my password to ssh. i get the password prompt from ssh on the
> controlling shell. as i often use several shells and git-gui might run
> in the background it is rather bothering to find the correct shell where
> ssh expects the password. could this be changed (in a safe way) in
> git-gui e.g. like pinentry pops up a window when gpg is used to sign
> emails?
>
> p.s. please cc me. i have not subscribed to the list
>
> thanks
>   
I know this doesn't answer the problem exactly, but if you use ssh keys 
and some sort of key management utility (such as Keychain or maybe Gnome 
keyring?), you can dodge the password entry problem and never have to 
enter a password (provided you register your ssh key with the server, e.g. 
with ssh-copy-id servername).


Another option that might answer your problem (and could be potentially 
integrated into git-gui) is the usage of the SSH_ASKPASS environment 
variable.

SSH_ASKPASS names a program that ssh executes to obtain the passphrase; 
ssh reads the passphrase from that program's standard output.
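
To make that a bit more concrete, here is a rough sketch of the mechanism. 
The helper script, its location and the fixed string it prints are all 
made up for illustration; a real helper would pop up a dialog instead. As 
far as I know ssh only consults SSH_ASKPASS when it has no controlling 
terminal and DISPLAY is set, so how git-gui would have to be launched is 
an assumption on my part too.

```shell
# Hypothetical helper; the directory and the echoed value are placeholders.
askpass_dir=$(mktemp -d)
cat >"$askpass_dir/askpass.sh" <<'EOF'
#!/bin/sh
# ssh passes the prompt text as $1 and reads the answer from our stdout.
# A real helper would open a dialog here instead of echoing a fixed string.
echo "example-passphrase"
EOF
chmod +x "$askpass_dir/askpass.sh"

# Point ssh at the helper; ssh inherits this from whatever starts it.
export SSH_ASKPASS="$askpass_dir/askpass.sh"

# Then start git-gui detached from the terminal, for example:
#   setsid git gui
```

Packaged helpers such as ssh-askpass follow the same contract, so pointing 
SSH_ASKPASS at one of those is probably the more sensible choice.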

^ permalink raw reply

* Re: [PATCH] gc --aggressive: make it really aggressive
From: David Kastrup @ 2007-12-06 17:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Pierre Habouzit, Linus Torvalds, Daniel Berlin, David Miller,
	ismail, gcc, git, gitster
In-Reply-To: <Pine.LNX.4.64.0712061552550.27959@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> However, I think that --aggressive should be aggressive, and if you
> decide to run it on a machine which lacks the muscle to be aggressive,
> well, you should have known better.

That's a rather cheap shot.  "you should have known better" than
expecting to be able to use a documented command and option because the
git developers happened to have a nicer machine...

_How_ is one supposed to have known better?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply

* Re: [PATCH 2/3] git-help: add -w|--web option to display html man page in a browser.
From: Junio C Hamano @ 2007-12-06 17:05 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Theodore Tso, Jakub Narebski, Alex Riesen, Andreas Ericsson,
	Matthieu Moy, Eric Wong
In-Reply-To: <20071202060755.4d6d5ec8.chriscool@tuxfamily.org>

Christian Couder <chriscool@tuxfamily.org> writes:

> diff --git a/help.c b/help.c
> index 0f1cb71..ecc8c66 100644
> --- a/help.c
> +++ b/help.c
> @@ -297,7 +303,14 @@ int cmd_help(int argc, const char **argv, const char *prefix)
>  	const char *help_cmd = argc > 1 ? argv[1] : NULL;
>  	check_help_cmd(help_cmd);
>  
> -	if (!strcmp(help_cmd, "--info") || !strcmp(help_cmd, "-i")) {
> +	if (!strcmp(help_cmd, "--web") || !strcmp(help_cmd, "-w")) {
> +		help_cmd = argc > 2 ? argv[2] : NULL;
> +		check_help_cmd(help_cmd);
> +
> +		show_html_page(help_cmd);
> +	}
> +
> +	else if (!strcmp(help_cmd, "--info") || !strcmp(help_cmd, "-i")) {
>  		help_cmd = argc > 2 ? argv[2] : NULL;
>  		check_help_cmd(help_cmd);

Isn't this "check-help-cmd" duplication ugly, by the way?

^ permalink raw reply

* Re: [PATCH 2/3] git-help: add -w|--web option to display html man page in a browser.
From: Junio C Hamano @ 2007-12-06 17:03 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Theodore Tso, Jakub Narebski, Alex Riesen, Andreas Ericsson,
	Matthieu Moy, Eric Wong
In-Reply-To: <20071202060755.4d6d5ec8.chriscool@tuxfamily.org>

Christian Couder <chriscool@tuxfamily.org> writes:

> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index d886641..3e01718 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -29,6 +29,7 @@ DOC_MAN7=$(patsubst %.txt,%.7,$(MAN7_TXT))
>  
>  prefix?=$(HOME)
>  bindir?=$(prefix)/bin
> +htmldir?=$(prefix)/share/doc/git-doc
>  mandir?=$(prefix)/share/man
>  man1dir=$(mandir)/man1
>  man5dir=$(mandir)/man5

Doing this and then ...

> diff --git a/Makefile b/Makefile
> index a5a40ce..9204bfe 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -807,6 +808,7 @@ $(patsubst %.sh,%,$(SCRIPT_SH)) : % : %.sh
>  	    -e 's|@@PERL@@|$(PERL_PATH_SQ)|g' \
>  	    -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
>  	    -e 's/@@NO_CURL@@/$(NO_CURL)/g' \
> +	    -e 's|@@PREFIX@@|$(prefix_SQ)|g' \
>  	    $@.sh >$@+ && \
>  	chmod +x $@+ && \
>  	mv $@+ $@
> ...
> diff --git a/git-browse-help.sh b/git-browse-help.sh
> new file mode 100755
> index 0000000..11f8bfa
> --- /dev/null
> +++ b/git-browse-help.sh
> @@ -0,0 +1,154 @@
> +#!/bin/sh
> ...
> +USAGE='[--browser=browser|--tool=browser] [cmd to display] ...'
> +SUBDIRECTORY_OK=Yes
> +OPTIONS_SPEC=
> +. git-sh-setup
> +
> +PREFIX="@@PREFIX@@"
> +GIT_VERSION="@@GIT_VERSION@@"
> +
> +# Directories that may contain html documentation:
> +install_html_dir="$PREFIX/share/doc/git-doc"
> +rpm_dir="$PREFIX/share/doc/git-core-$GIT_VERSION"

... doing this is wrong. People can set htmldir to somewhere other than
$(prefix)/share/doc/git-doc while building and installing, but you are
not telling the munged script where it is.

> +init_browser_path() {
> +	browser_path=`git config browser.$1.path`
> +	test -z "$browser_path" && browser_path=$1
> +}

Please do not contaminate the config file with something the user can
easily configure to his taste in a much more standard way (iow $PATH).

I'd suggest dropping this bit.

^ permalink raw reply

* Re: Difference in how "git status" and "git diff --name-only" lists filenames
From: Gustaf Hendeby @ 2007-12-06 16:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vodd9x7fu.fsf@gitster.siamese.dyndns.org>

On Dec 2, 2007 7:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> "Gustaf Hendeby" <hendeby@gmail.com> writes:
> > A while ago 'git status' was patched to report relative pathnames.  (I
> > like that change it makes cut'n'paste easier.)  However, 'git diff
> > --name-only' and 'git diff --name-status' (other commands as well),
> > which give in a sense similar output, have not been changed the same
> > way.  Is this intentional, or just because no one has stepped up and
> > provided a patch?  If the difference is to stay, maybe this should be
> > reflected in the help texts to avoid any confusion.
>
> The commands output from diff always talks about paths relative to the
> tree root, and scripts rely on it.  The recent change made exceptions to
> the status command.  I agree an additional documentation to git-status
> would be beneficial.
>
> Having said that, a switch --relative-name might be an option.  It could
> be argued that doing it the other way around (like --full-name option to
> ls-files does), defaulting to relative to cwd, would have been a better
> approach if we were doing git from scratch, though.  We may still want
> to do so in the longer run, but that would be a huge interface change
> that would impact a lot of people's scripts.
>
>
> diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
> index 8fd0fc6..b0cb6bc 100644
> --- a/Documentation/git-status.txt
> +++ b/Documentation/git-status.txt
> @@ -40,6 +40,10 @@ OUTPUT
>  The output from this command is designed to be used as a commit
>  template comments, and all the output lines are prefixed with '#'.
>
> +The paths mentioned in the output, unlike many other git commands, are
> +made relative to the current directory, if you are working in a
> +subdirectory (this is on purpose, to help cutting and pasting).
> +
>
>  CONFIGURATION
>  -------------
>

Thank you for your timely answer and the good explanation.  Sorry for
my late response!  I think that the addition to the documentation that
you suggest sounds good, and would be useful.  Do you want me to do
anything else about this?

/Gustaf

^ permalink raw reply

* Re: [PATCH] gc --aggressive: make it really aggressive
From: Linus Torvalds @ 2007-12-06 16:19 UTC (permalink / raw)
  To: Harvey Harrison
  Cc: Johannes Schindelin, Daniel Berlin, David Miller, ismail, gcc,
	Git Mailing List, Junio C Hamano
In-Reply-To: <1196955059.13633.3.camel@brick>

On Thu, 6 Dec 2007, Harvey Harrison wrote:
> 
> 7:41:25elapsed 86%CPU

Heh. And this is why you want to do it exactly *once*, and then just 
export the end result for others ;)

> -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack

But yeah, especially if you allow longer delta chains, the end result can 
be much smaller (and what makes the one-time repack more expensive is the 
window size, not the delta chain - you could make the delta chains longer 
with no cost overhead at packing time)

HOWEVER. 

The longer delta chains do make it potentially much more expensive to then 
use old history. So there's a trade-off. And quite frankly, a delta depth 
of 250 is likely going to cause overflows in the delta cache (which is 
only 256 entries in size *and* it's a hash, so it's going to start having 
hash conflicts long before hitting the 250 depth limit).

So when I said "--depth=250 --window=250", I chose those numbers more as 
an example of extremely aggressive packing, and I'm not at all sure that 
the end result is necessarily wonderfully usable. It's going to save disk 
space (and network bandwidth - the deltas will be re-used for the network 
protocol too!), but there are definitely downsides too, and using long 
delta chains may simply not be worth it in practice.

(And some of it might just want to have git tuning, ie if people think 
that long deltas are worth it, we could easily just expand on the delta 
hash, at the cost of some more memory used!)

That said, the good news is that working with *new* history will not be 
affected negatively, and if you want to be _really_ sneaky, there are ways 
to say "create a pack that contains the history up to a version one year 
ago, and be very aggressive about those old versions that we still want to 
have around, but do a separate pack for newer stuff using less aggressive 
parameters"

So this is something that can be tweaked, although we don't really have 
any really nice interfaces for stuff like that (ie the git delta cache 
size is hardcoded in the sources and cannot be set in the config file, and 
the "pack old history more aggressively" involves some manual scripting 
and knowing how "git pack-objects" works rather than any nice simple 
command line switch).
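
To give an idea of what that manual scripting could look like, here is a 
minimal sketch, not a tested recipe. The function name and the default 
cutoff are made up; the window/depth values are just the example numbers 
from above. It packs everything reachable from commits older than the 
cutoff into one aggressively compressed pack and marks it with a .keep 
file, so that a later ordinary "git repack -a -d" should leave it alone 
and only repack the newer stuff with the default parameters.

```shell
# Sketch only: pack old history aggressively into its own kept pack.
# pack_old_history and its default cutoff are illustrative names/values.
pack_old_history() {
	cutoff=${1:-"1 year ago"}
	packdir=$(git rev-parse --git-dir)/objects/pack || return 1

	# List the old commits plus the trees and blobs they reference and
	# feed that list to pack-objects, which prints the new pack's name.
	name=$(git rev-list --objects --all --before="$cutoff" |
		git pack-objects --window=250 --depth=250 "$packdir/pack") || return 1

	# A .keep file tells later repacks to leave this pack untouched.
	: >"$packdir/pack-$name.keep"
	echo "$name"
}
```

Running "pack_old_history '1 year ago'" inside a repository and then a 
plain "git repack -a -d" would be the intended usage. Whether the 
resulting long delta chains are pleasant to work with is exactly the 
trade-off described above.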

So the thing to take away from this is:
 - git is certainly flexible as hell
 - .. but to get the full power you may need to tweak things
 - .. happily you really only need to have one person to do the tweaking, 
   and the tweaked end results will be available to others that do not 
   need to know/care.

And whether the difference between 320MB and 500MB is worth any really 
involved tweaking (considering the potential downsides), I really don't 
know. Only testing will tell.

			Linus

^ permalink raw reply

* Re: builtin command's prefix question
From: Junio C Hamano @ 2007-12-06 16:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nguyen Thai Ngoc Duy, Git Mailing List, madduck
In-Reply-To: <Pine.LNX.4.64.0712061547070.27959@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Thu, 6 Dec 2007, Nguyen Thai Ngoc Duy wrote:
>
>> On Dec 6, 2007 6:22 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> > However, if we define setup() to behave this way when GIT_DIR is not 
>> > defined and GIT_WORK_TREE is:
>> >
>> >  (1) internally pretend as if GIT_DIR was specified to be the
>> >      directory where the command was started from (iow, do getcwd()
>> >      once upon startup);
>> >
>> >  (2) chdir to GIT_WORK_TREE (which means "callers of setup() always
>> >      run from the top of the work tree");
>> >
>> >  (3) set prefix to NULL;
>> 
>> (1) is fine by me, even if it goes up to find a gitdir. But (3), no, 
>> prefix should be set as relative path from worktree top directory to 
>> user current directory, not NULL.
>
> If you expect "git <command> <filespec>" to work correctly from GIT_DIR, 
> you will _have_ to set the prefix to NULL.

That depends on the definition of "correctly".

As I said, I think the above "rule-looking things" implement an insane
behaviour where you are in one directory, and use that <filespec> to
name things relative to some other directory whose location is
completely unrelated to the directory you are in.  IOW, not a good set
of rules, and I do not necessarily agree with the statement that says
such a behaviour is working "correctly from GIT_DIR".

^ permalink raw reply

* Re: [PATCH] gc --aggressive: make it really aggressive
From: Johannes Schindelin @ 2007-12-06 15:56 UTC (permalink / raw)
  To: Harvey Harrison
  Cc: Linus Torvalds, Daniel Berlin, David Miller, ismail, gcc, git,
	gitster
In-Reply-To: <1196955059.13633.3.camel@brick>

Hi,

On Thu, 6 Dec 2007, Harvey Harrison wrote:

> -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26
> pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack

Wow.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] gc --aggressive: make it really aggressive
From: Johannes Schindelin @ 2007-12-06 15:55 UTC (permalink / raw)
  To: Pierre Habouzit
  Cc: Linus Torvalds, Daniel Berlin, David Miller, ismail, gcc, git,
	gitster
In-Reply-To: <20071206142254.GD5959@artemis.madism.org>

Hi,

On Thu, 6 Dec 2007, Pierre Habouzit wrote:

> On Thu, Dec 06, 2007 at 12:03:38PM +0000, Johannes Schindelin wrote:
> > 
> > The default was not to change the window or depth at all.  As 
> > suggested by Jon Smirl, Linus Torvalds and others, default to
> > 
> > 	--window=250 --depth=250
> 
>   well, this will explode on many quite reasonably sized systems. This 
> should also use a memory-limit that could be auto-guessed from the 
> system total physical memory (50% of the actual memory could be a good 
> idea e.g.).
> 
>   Using that on very large repositories, e.g. the linux kernel, swaps 
> like hell on a machine with 1GB of RAM and almost nothing running on it 
> (less than 200MB of RAM actually used)

Yes.

However, I think that --aggressive should be aggressive, and if you decide 
to run it on a machine which lacks the muscle to be aggressive, well, you 
should have known better.

The upside: if you run this on a strong machine and clone it to a weak 
machine, you'll still have the benefit of a small pack (and you should 
mark it as .keep, too, to keep the benefit...)

Ciao,
Dscho

^ permalink raw reply
