Git development
 help / color / mirror / Atom feed
* Re: Git and GCC
From: Jon Smirl @ 2007-12-06 22:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linus Torvalds, Jeff King, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <alpine.LFD.0.99999.0712061645120.555@xanadu.home>

On 12/6/07, Nicolas Pitre <nico@cam.org> wrote:
> On Thu, 6 Dec 2007, Jon Smirl wrote:
>
> > On 12/6/07, Nicolas Pitre <nico@cam.org> wrote:
> > > > When I lasted looked at the code, the problem was in evenly dividing
> > > > the work. I was using a four core machine and most of the time one
> > > > core would end up with 3-5x the work of the lightest loaded core.
> > > > Setting pack.threads up to 20 fixed the problem. With a high number of
> > > > threads I was able to get a 4hr pack to finished in something like
> > > > 1:15.
> > >
> > > But as far as I know you didn't try my latest incarnation which has been
> > > available in Git's master branch for a few months already.
> >
> > I've deleted all my giant packs. Using the kernel pack:
> > 4GB Q6600
> >
> > Using the current thread pack code I get these results.
> >
> > The interesting case is the last one. I set it to 15 threads and
> > monitored with 'top'.
> > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
> > 74-100% was 100% CPU. It never used all for cores. The only other
> > things running were top and my desktop. This is the same load
> > balancing problem I observed earlier.
>
> Well, that's possible with a window 25 times larger than the default.
>
> The load balancing is solved with a master thread serving relatively
> small object list segments to any work thread that finished with its
> previous segment.  But the size for those segments is currently fixed to
> window * 1000 which is way too large when window == 250.
>
> I have to find a way to auto-tune that segment size somehow.

That would be nice. Threading is most important on the giant
pack/window combinations. The normal case is fast enough that I don't
real notice it. These giant pack/window combos can run 8-10 hours.

>
> But with the default window size there should not be any such noticeable
> load balancing problem.

I only spend 30 seconds in the compression phase without making the
window larger. It's not long enough to really see what is going on.

>
> Note that threading only happens in the compression phase.  The count
> and write phase are hardly paralleled.
>
>
> Nicolas
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: XML parsing error from gitweb at freedesktop.org
From: Lars Hjemli @ 2007-12-06 22:11 UTC (permalink / raw)
  To: Adam Mercer; +Cc: git
In-Reply-To: <799406d60712061334q33d2dba5r5496ba21069a4547@mail.gmail.com>

On Dec 6, 2007 10:34 PM, Adam Mercer <ramercer@gmail.com> wrote:
> as cgit displays the commit OK
>
> <http://cgit.freedesktop.org/xorg/xserver/commit/?h=xorg-server-1.2-apple&id=48e6a75fbdd0fee86e364f02ace83f20b312a2b2>
>
> leads me to think that the problem lies in gitweb.  Could this be a
> problem with gitweb?
>

Actually, it's a problem in both cgit and gitweb, but you need to look
a bit harder to find the error in cgit. It seems that gitweb on
freedesktop.org doesn't detect a file rename (due to diff.renamelimit
maybe?), so it shows the full source of the offending files, while in
cgit you'll need to go from the diff to either the old or the new
sourcefile to get the same error:

http://cgit.freedesktop.org/xorg/xserver/tree/hw/darwin/quartz/applewmExt.h?h=xorg-server-1.2-apple&id=141f69dc3d8d6e7d8ff65607f43700ac11243041
http://cgit.freedesktop.org/xorg/xserver/diff/hw/xquartz/applewm.c?h=xorg-server-1.2-apple&id=48e6a75fbdd0fee86e364f02ace83f20b312a2b2

The problem is a number of unencoded ascii char 12.

--
larsh

^ permalink raw reply

* Re: Git and GCC
From: Nicolas Pitre @ 2007-12-06 22:08 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Linus Torvalds, Jeff King, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com>

On Thu, 6 Dec 2007, Jon Smirl wrote:

> On 12/6/07, Nicolas Pitre <nico@cam.org> wrote:
> > > When I lasted looked at the code, the problem was in evenly dividing
> > > the work. I was using a four core machine and most of the time one
> > > core would end up with 3-5x the work of the lightest loaded core.
> > > Setting pack.threads up to 20 fixed the problem. With a high number of
> > > threads I was able to get a 4hr pack to finished in something like
> > > 1:15.
> >
> > But as far as I know you didn't try my latest incarnation which has been
> > available in Git's master branch for a few months already.
> 
> I've deleted all my giant packs. Using the kernel pack:
> 4GB Q6600
> 
> Using the current thread pack code I get these results.
> 
> The interesting case is the last one. I set it to 15 threads and
> monitored with 'top'.
> For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
> 74-100% was 100% CPU. It never used all for cores. The only other
> things running were top and my desktop. This is the same load
> balancing problem I observed earlier.

Well, that's possible with a window 25 times larger than the default.

The load balancing is solved with a master thread serving relatively 
small object list segments to any work thread that finished with its 
previous segment.  But the size for those segments is currently fixed to 
window * 1000 which is way too large when window == 250.

I have to find a way to auto-tune that segment size somehow.

But with the default window size there should not be any such noticeable 
load balancing problem.

Note that threading only happens in the compression phase.  The count 
and write phase are hardly paralleled.


Nicolas

^ permalink raw reply

* Re: how to create v2 patch
From: Andreas Ericsson @ 2007-12-06 22:03 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: Mike Hommey, Pascal Obry, git
In-Reply-To: <475855D6.201@imap.cc>

Tilman Schmidt wrote:
> Am 01.12.2007 14:43 schrieb Mike Hommey:
>> On Sat, Dec 01, 2007 at 02:17:39PM +0100, Pascal Obry wrote:
>>> Tilman Schmidt a écrit :
>>>> I have produced a patch, submitted it to LKML, received a few
>>>> comments, committed appropriate changes to my local git tree,
>>>> and now want to submit a revised patch. How do I do that?
>>>> If I just run git-format-patch again, it produces my original
>>>> patch plus a second one containing my updates, but what I need
>>>> is a single new patch replacing the first one.
>>> Can't you merge both of your changes in your local repository? I would
>>> do that with an interactive rebase.
>> Or just git commit --amend when committing.
> 
> Hmm. But wouldn't each of these approaches lead to my original
> commit being removed from my git repository? And isn't removing
> commits that have already been published strongly discouraged?
> 

The term "published" means different things for different projects.
For the Linux kernel, "published" is when your commit ends up in a
repository that Linus pulls from.

So long as you're getting suggestions to fix up your patch, it's
safe to assume it hasn't been accepted into one of those repos, and
you can safely --amend the offending commit(s).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: git guidance
From: Phillip Susi @ 2007-12-06 21:46 UTC (permalink / raw)
  To: Al Boldi; +Cc: Andreas Ericsson, Linus Torvalds, Jing Xue, linux-kernel, git
In-Reply-To: <200712072155.04643.a1426z@gawab.com>

Al Boldi wrote:
> When you read server, don't read it as localized; a server can be 
> distributed.  What distinguishes a server from an engine is that it has to 
> handle a multi-user use-case.  How that is implemented, locally or remotely 
> or distributed, is another issue.

And again, git handles both use cases, so what's your point?

> As explained before in this thread, replicating the git tree on the client 
> still doesn't provide the required transparency.

It has been pointed out to you that it DOES.  Either that or nobody else 
understands your nebulous use of "transparency" so maybe you should 
define it like we've been asking you.  Furthermore, the comment you 
replied to said nothing about transparency, nor did your comment it was 
in reply to; rather it was pointing out the fact that your statement 
that the git can not perform version control on the client is patently 
false.

>> How is that different from what every SCM, including git, is doing today?
>> The user needs to tell the scm when it's time to take a snapshot of the
>> current state. Git is distributed though, so committing is usually not the
>> same as publishing. Is that lack of a single command to commit and publish
>> what's nagging you? If it's not, I completely fail to see what you're
>> getting at, unless you've only ever looked at repositories without a
>> worktree attached, or you think that git should work like an editor's
>> "undo" functionality, which would be quite insane.
> 
> You need to re-read the thread.

Perhaps you should.  We have been trying to get you to explain how you 
think git isn't "transparent" while at the same time pointing out how we 
think it is.  You have failed to demonstrate any evidence to back up 
your claims, all of which have been shown to be false.

^ permalink raw reply

* Re: Git and GCC
From: Jon Smirl @ 2007-12-06 21:39 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linus Torvalds, Jeff King, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <alpine.LFD.0.99999.0712061403000.555@xanadu.home>

On 12/6/07, Nicolas Pitre <nico@cam.org> wrote:
> > When I lasted looked at the code, the problem was in evenly dividing
> > the work. I was using a four core machine and most of the time one
> > core would end up with 3-5x the work of the lightest loaded core.
> > Setting pack.threads up to 20 fixed the problem. With a high number of
> > threads I was able to get a 4hr pack to finished in something like
> > 1:15.
>
> But as far as I know you didn't try my latest incarnation which has been
> available in Git's master branch for a few months already.

I've deleted all my giant packs. Using the kernel pack:
4GB Q6600

Using the current thread pack code I get these results.

The interesting case is the last one. I set it to 15 threads and
monitored with 'top'.
For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
74-100% was 100% CPU. It never used all for cores. The only other
things running were top and my desktop. This is the same load
balancing problem I observed earlier. Much more clock time was spent
in the 2/1 core phases than the 3 core one.

Threaded, threads = 5

jonsmirl@terra:/home/linux$ time git repack -a -d -f
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528994), reused 0 (delta 0)

real    1m31.395s
user    2m59.239s
sys     0m3.048s
jonsmirl@terra:/home/linux$

12 seconds counting
53 seconds compressing
38 seconds writing

Without threads,

jonsmirl@terra:/home/linux$ time git repack -a -d -f
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528999), reused 0 (delta 0)

real    2m54.849s
user    2m51.267s
sys     0m1.412s
jonsmirl@terra:/home/linux$

Threaded, threads = 5

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    9m18.032s
user    19m7.484s
sys     0m3.880s
jonsmirl@terra:/home/linux$

jonsmirl@terra:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl  15561848 2007-12-06 16:15
pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 16:15
pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
jonsmirl@terra:/home/linux/.git/objects/pack$

Non-threaded:

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    18m51.183s
user    18m46.538s
sys     0m1.604s
jonsmirl@terra:/home/linux$


jonsmirl@terra:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl  15561848 2007-12-06 15:33
pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 15:33
pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
jonsmirl@terra:/home/linux/.git/objects/pack$

Threaded, threads = 15

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    9m18.325s
user    19m14.340s
sys     0m3.996s
jonsmirl@terra:/home/linux$

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: how to create v2 patch
From: Pascal Obry @ 2007-12-06 21:38 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: Mike Hommey, git
In-Reply-To: <475855D6.201@imap.cc>

Tilman Schmidt a écrit :
> Hmm. But wouldn't each of these approaches lead to my original
> commit being removed from my git repository? And isn't removing
> commits that have already been published strongly discouraged?

They won't be removed, just changed/merged... and that's what you were
looking for or I did not understand your question! This is not bad
practice as it is done on YOUR repository. Of course this should never
be done on a pushed/published changeset.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|              http://www.obry.net
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595

^ permalink raw reply

* XML parsing error from gitweb at freedesktop.org
From: Adam Mercer @ 2007-12-06 21:34 UTC (permalink / raw)
  To: git

Hi

In looking through some commits on the freedesktop.org gitweb I
encountered an error trying to view the following commit

<http://gitweb.freedesktop.org/?p=xorg/xserver.git;a=commitdiff;h=48e6a75fbdd0fee86e364f02ace83f20b312a2b2>

In viewing the page under Safari on Mac OS X Leopard the following
error is displayed:

This page contains the following errors:
error on line 7998 at column 129634: internal error

Camino & Firefox both display:

XML Parsing Error: not well-formed
Location: http://gitweb.freedesktop.org/?p=xorg/xserver.git;a=commitdiff;h=48e6a75fbdd0fee86e364f02ace83f20b312a2b2
Line Number 7998, Column 24:

<div class="diff rem">- </div>
-----------------------^

as cgit displays the commit OK

<http://cgit.freedesktop.org/xorg/xserver/commit/?h=xorg-server-1.2-apple&id=48e6a75fbdd0fee86e364f02ace83f20b312a2b2>

leads me to think that the problem lies in gitweb.  Could this be a
problem with gitweb?

Cheers

Adam

^ permalink raw reply

* Re: Git and GCC
From: Junio C Hamano @ 2007-12-06 21:02 UTC (permalink / raw)
  To: Jon Loeliger
  Cc: Linus Torvalds, Daniel Berlin, David Miller, ismail, gcc,
	Git List
In-Reply-To: <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Jon Loeliger <jdl@freescale.com> writes:
>
>> I'd like to learn more about that.  Can someone point me to
>> either more documentation on it?  In the absence of that,
>> perhaps a pointer to the source code that implements it?
>
> See Documentation/technical/pack-heuristics.txt,

A somewhat funny thing about this is ...

$ git show --stat --summary b116b297
commit b116b297a80b54632256eb89dd22ea2b140de622
Author: Jon Loeliger <jdl@jdl.com>
Date:   Thu Mar 2 19:19:29 2006 -0600

    Added Packing Heursitics IRC writeup.
    
    Signed-off-by: Jon Loeliger <jdl@jdl.com>
    Signed-off-by: Junio C Hamano <junkio@cox.net>

 Documentation/technical/pack-heuristics.txt |  466 +++++++++++++++++++++++++++
 1 files changed, 466 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/technical/pack-heuristics.txt

^ permalink raw reply

* Re: how to create v2 patch
From: Jan Hudec @ 2007-12-06 20:44 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: Mike Hommey, Pascal Obry, git
In-Reply-To: <475855D6.201@imap.cc>

On Thu, Dec 06, 2007 at 21:04:38 +0100, Tilman Schmidt wrote:
> Am 01.12.2007 14:43 schrieb Mike Hommey:
> > On Sat, Dec 01, 2007 at 02:17:39PM +0100, Pascal Obry wrote:
> >> Tilman Schmidt a écrit :
> >>> I have produced a patch, submitted it to LKML, received a few
> >>> comments, committed appropriate changes to my local git tree,
> >>> and now want to submit a revised patch. How do I do that?
> >>> If I just run git-format-patch again, it produces my original
> >>> patch plus a second one containing my updates, but what I need
> >>> is a single new patch replacing the first one.
> >> Can't you merge both of your changes in your local repository? I would
> >> do that with an interactive rebase.
> > 
> > Or just git commit --amend when committing.
> 
> Hmm. But wouldn't each of these approaches lead to my original
> commit being removed from my git repository? And isn't removing
> commits that have already been published strongly discouraged?

Removing commits that you already published is strongly discouraged. But
patch is not a commit. A v2 (short for 'second version') patch means a patch,
that should be applied /instead/ of the previous. The previous patch -- and
the commit it was generated from as well as any commit generated by applying
it -- should indeed be replaced by the new version.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>

^ permalink raw reply

* Re: Git and GCC. Why not with fork, exec and pipes like in linux?
From: J.C. Pizarro @ 2007-12-06 20:37 UTC (permalink / raw)
  To: Jon Smirl, Linus Torvalds
  Cc: Jeff King, Nicolas Pitre, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <998d0e4a0712061125h3d44139ctb7f5600bc8467292@mail.gmail.com>

On 2007/12/6, J.C. Pizarro <jcpiza@gmail.com>, i wrote:
> For multicores CPUs, don't divide the work in threads.
> To divide the work in processes!
>
> Tips, tricks and hacks: to use fork, exec, pipes and another IPC mechanisms like
> mutexes, shared memory's IPC, file locks, pipes, semaphores, RPCs, sockets, etc.
> to access concurrently and parallely to the filelocked database.

I'm sorry, we don't need exec. We need fork, pipes and another IPC mechanisms
because it so shares easy the C code for parallelism.

Thanks to Linus because GIT is implemented in C language to interact with
system calls of the kernel written in C.

> For Intel Quad Core e.g., x4 cores, it need a parent process and 4
> child processes linked to the parent with pipes.

For peak performance (e.g 99.9% usage), the minimum number of child
processes should be more than 4, normally between e.g. 6 and 10 processes
depending on the statistics of idle's stalls of the cores.

> The parent process can be
> * no-threaded using select/epoll/libevent
> * threaded using Pth (GNU Portable Threads), NPTL (from RedHat) or whatever.

Note: there is a little design's problem with slowdown of I/O bandwith when
the parent is multithreaded and the children MUST to be multithreaded that
we can't avoid them to be non-multithreaded for maximum I/O bandwith.

The "finding of the smallest spanning forest with deltas" consumes a lot of
CPU, so if it scales well in a CPU x4 cores then it can to reduce 4
hours to 1 hour.

   J.C.Pizarro :)

^ permalink raw reply

* Re: how to create v2 patch
From: Tilman Schmidt @ 2007-12-06 20:04 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Pascal Obry, git
In-Reply-To: <20071201134321.GA10997@glandium.org>

[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]

Am 01.12.2007 14:43 schrieb Mike Hommey:
> On Sat, Dec 01, 2007 at 02:17:39PM +0100, Pascal Obry wrote:
>> Tilman Schmidt a écrit :
>>> I have produced a patch, submitted it to LKML, received a few
>>> comments, committed appropriate changes to my local git tree,
>>> and now want to submit a revised patch. How do I do that?
>>> If I just run git-format-patch again, it produces my original
>>> patch plus a second one containing my updates, but what I need
>>> is a single new patch replacing the first one.
>> Can't you merge both of your changes in your local repository? I would
>> do that with an interactive rebase.
> 
> Or just git commit --amend when committing.

Hmm. But wouldn't each of these approaches lead to my original
commit being removed from my git repository? And isn't removing
commits that have already been published strongly discouraged?

Thx
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
Yes, I have searched Google!


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 253 bytes --]

^ permalink raw reply

* Re: git guidance
From: Johannes Schindelin @ 2007-12-06 20:22 UTC (permalink / raw)
  To: Al Boldi
  Cc: Andreas Ericsson, Phillip Susi, Linus Torvalds, Jing Xue,
	linux-kernel, git
In-Reply-To: <200712072155.04643.a1426z@gawab.com>

Hi,

On Fri, 7 Dec 2007, Al Boldi wrote:

> Andreas Ericsson wrote:
> > Al Boldi wrote:
> >
> > > By pulling the sources into a git-client manager mounted on some 
> > > dir, it should be possible to let the developer work 
> > > naturally/transparently in a readable/writeable manner, and only 
> > > require his input when reverting locally or committing to the 
> > > server/repository.
> >
> > How is that different from what every SCM, including git, is doing 
> > today? The user needs to tell the scm when it's time to take a 
> > snapshot of the current state. Git is distributed though, so 
> > committing is usually not the same as publishing. Is that lack of a 
> > single command to commit and publish what's nagging you? If it's not, 
> > I completely fail to see what you're getting at, unless you've only 
> > ever looked at repositories without a worktree attached, or you think 
> > that git should work like an editor's "undo" functionality, which 
> > would be quite insane.
> 
> You need to re-read the thread.

I don't know why you write that, and then say thanks.  Clearly, what you 
wrote originally, and what Andreas pointed out, were quite obvious 
indicators that git already does what you suggest.

You _do_ work "transparently" (whatever you understand by that overused 
term) in the working directory, unimpeded by git.

And whenever it is time to revert or commit, you cry for help, invoking 
git.

So either you succeeded in making yourself misunderstood, or Andreas had 
quite the obvious and correct comment for you.

Not that diffcult,
Dscho

^ permalink raw reply

* Re: [PATCH 3/3] Color support for "git-add -i"
From: Junio C Hamano @ 2007-12-06 20:18 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: git
In-Reply-To: <5055E3DF-E01D-41B5-9F59-DAD69885CAE8@wincent.com>

Wincent Colaiuta <win@wincent.com> writes:

> What are the other options?
>
> - Make git-add--interactive part of builtin-add so as to be able to  
> use the core.whitespace code directly? (ideally yes and at some point  
> in the future it seems inevitable that this will happen, but it will  
> require a fair bit of work)
>
> - Fork a second "git diff-files" process to capture the colorized  
> version of the output? (may set off the "kludge" alarm)
>
> - Something else?

 - Realize that whitespace clean-up can be risky and change semantics
   depending on the material you are editing.  Do not do the clean-up
   during "add -i", but before.  IOW, add an alias that does an
   equivalent of:

	git diff HEAD >tmp
        git apply -R <tmp
        git apply --whitespace=fix <tmp

   then encourage users to clean-up their whitespace gotchas early (and
   test the result!), before even attempting to run "git add -i".

^ permalink raw reply

* Re: [PATCH] Let git-help prefer man-pages installed with this version of git
From: Johannes Schindelin @ 2007-12-06 20:09 UTC (permalink / raw)
  To: Sergei Organov; +Cc: git, gitster
In-Reply-To: <87hciv7jkt.fsf@osv.gnss.ru>

Hi,

On Thu, 6 Dec 2007, Sergei Organov wrote:

> Prepend $(prefix)/share/man to the MANPATH environment variable before 
> invoking 'man' from help.c:show_man_page().

This commit message is severely lacking.  Why would you _ever_ prefer the 
installed man pages before invoking "man", which should find them anyway?

Ciao,
Dscho

^ permalink raw reply

* Re: Git and GCC
From: Junio C Hamano @ 2007-12-06 20:04 UTC (permalink / raw)
  To: Jon Loeliger
  Cc: Linus Torvalds, Daniel Berlin, David Miller, ismail, gcc,
	Git List
In-Reply-To: <1196968371.18340.30.camel@ld0161-tx32>

Jon Loeliger <jdl@freescale.com> writes:

> On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote:
>
>> Git also does delta-chains, but it does them a lot more "loosely". There 
>> is no fixed entity. Delta's are generated against any random other version 
>> that git deems to be a good delta candidate (with various fairly 
>> successful heursitics), and there are absolutely no hard grouping rules.
>
> I'd like to learn more about that.  Can someone point me to
> either more documentation on it?  In the absence of that,
> perhaps a pointer to the source code that implements it?

See Documentation/technical/pack-heuristics.txt,
but the document predates and does not talk about delta
reusing, which was covered here:

    http://thread.gmane.org/gmane.comp.version-control.git/16223/focus=16267

> I guess one question I posit is, would it be more accurate
> to think of this as a "delta net" in a weighted graph rather
> than a "delta chain"?

Yes.

^ permalink raw reply

* Re: [PATCH 3/3] Color support for "git-add -i"
From: Wincent Colaiuta @ 2007-12-06 19:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <1196906706-11170-3-git-send-email-gitster@pobox.com>

El 6/12/2007, a las 3:05, Junio C Hamano escribió:

> +sub colored_diff_hunk {
> +	my ($text) = @_;
> +	# return the text, so that it can be passed to print()
> +	my @ret;
> +	for (@$text) {
> +		if (!$diff_use_color) {
> +			push @ret, $_;
> +			next;
> +		}
> +
> +		if (/^\+/) {
> +			push @ret, colored($new_color, $_);
> +		} elsif (/^\-/) {
> +			push @ret, colored($old_color, $_);
> +		} elsif (/^\@/) {
> +			push @ret, colored($fraginfo_color, $_);
> +		} elsif (/^ /) {
> +			push @ret, colored($normal_color, $_);
> +		} else {
> +			push @ret, colored($metainfo_color, $_);
> +		}
> +	}
> +	return @ret;
> +}


My one concern here is that as Git's awareness of whitespace problems  
becomes more sophisticated it will be harder and harder to do diff  
colorization in a way that matches that which is performed by the  
builtin diff tools. Here I'm talking about the whole core.whitespace  
series which allows the user to define per-path attributes specifying  
what kinds of things are to be considered whitespace errors; so far we  
have three classes of error proposed as far as I know: trailing  
whitespace, spaces before tabs, and indents with non-tabs.

I think it's very important that "git add --interactive" be 100%  
consistent with the other tools here because in many cases the  
previewing you do during an interactive session is what you rely upon  
to review whether a change should be committed. In other words, you  
don't want to think stuff is ok because "git add --interactive" leads  
you to believe that it's ok when it really isn't, and you don't want  
to have to run "git diff" as a separate step either just to double  
check.

We can replicate the core.whitespace logic here but it's likely to be  
an error prone process as it involves repeating the same logic in two  
different places using two different implementations in two different  
languages.

What are the other options?

- Make git-add--interactive part of builtin-add so as to be able to  
use the core.whitespace code directly? (ideally yes and at some point  
in the future it seems inevitable that this will happen, but it will  
require a fair bit of work)

- Fork a second "git diff-files" process to capture the colorized  
version of the output? (may set off the "kludge" alarm)

- Something else?

Cheers,
Wincent

^ permalink raw reply

* Re: Git and GCC
From: Linus Torvalds @ 2007-12-06 19:39 UTC (permalink / raw)
  To: Jon Loeliger; +Cc: Daniel Berlin, David Miller, ismail, gcc, Git List
In-Reply-To: <1196968371.18340.30.camel@ld0161-tx32>



On Thu, 6 Dec 2007, Jon Loeliger wrote:
>
> On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote:
> > Git also does delta-chains, but it does them a lot more "loosely". There 
> > is no fixed entity. Delta's are generated against any random other version 
> > that git deems to be a good delta candidate (with various fairly 
> > successful heursitics), and there are absolutely no hard grouping rules.
> 
> I'd like to learn more about that.  Can someone point me to
> either more documentation on it?  In the absence of that,
> perhaps a pointer to the source code that implements it?

Well, in a very real sense, what the delta code does is:
 - just list every single object in the whole repository
 - walk over each object, trying to find another object that it can be 
   written as a delta against
 - write out the result as a pack-file

That's simplified: we may not walk _all_ objects, for example: only a 
global repack does that (and most pack creations are actually for pushign 
and pulling between two repositories, so we only walk the objects that are 
in the source but not the destination repository).

The interesting phase is the "walk each object, try to find a delta" part. 
In particular, you don't want to try to find a delta by comparing each 
object to every other object out there (that would be O(n^2) in objects, 
and with a fairly high constant cost too!). So what it does is to sort the 
objects by a few heuristics (type of object, base name that object was 
found as when traversing a tree and size, and how recently it was found in 
the history).

And then over that sorted list, it tries to find deltas between entries 
that are "close" to each other (and that's where the "--window=xyz" thing 
comes in - it says how big the window is for objects being close. A 
smaller window generates somewhat less good deltas, but takes a lot less 
effort to generate).

The source is in git/builtin-pack-objects.c, with the core of it being

 - try_delta() - try to generate a *single* delta when given an object 
   pair.

 - find_deltas() - do the actual list traversal

 - prepare_pack() and type_size_sort() - create the delta sort list from 
   the list of objects.

but that whole file is probably some of the more opaque parts of git.

> I guess one question I posit is, would it be more accurate
> to think of this as a "delta net" in a weighted graph rather
> than a "delta chain"?

It's certainly not a simple chain, it's more of a set of acyclic directed 
graphs in the object list. And yes, it's weigted by the size of the delta 
between objects, and the optimization problem is kind of akin to finding 
the smallest spanning tree (well, forest - since you do *not* want to 
create one large graph, you also want to make the individual trees shallow 
enough that you don't have excessive delta depth).

There are good algorithms for finding minimum spanning trees, but this one 
is complicated by the fact that the biggest cost (by far!) is the 
calculation of the weights itself. So rather than really worry about 
finding the minimal tree/forest, the code needs to worry about not having 
to even calculate all the weights!

(That, btw, is a common theme. A lot of git is about traversing graphs, 
like the revision graph. And most of the trivial graph problems all assume 
that you have the whole graph, but since the "whole graph" is the whole 
history of the repository, those algorithms are totally worthless, since 
they are fundamentally much too expensive - if we have to generate the 
whole history, we're already screwed for a big project. So things like 
revision graph calculation, the main performance issue is to avoid having 
to even *look* at parts of the graph that we don't need to see!)

			Linus

^ permalink raw reply

* Re: Git and GCC. Why not with fork, exec and pipes like in linux?
From: J.C. Pizarro @ 2007-12-06 19:25 UTC (permalink / raw)
  To: Jon Smirl, Linus Torvalds
  Cc: Jeff King, Nicolas Pitre, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git

On 2007/12/06, "Jon Smirl" <jonsmirl@gmail.com> wrote:
> On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > On Thu, 6 Dec 2007, Jeff King wrote:
> > >
> > > What is really disappointing is that we saved only about 20% of the
> > > time. I didn't sit around watching the stages, but my guess is that we
> > > spent a long time in the single threaded "writing objects" stage with a
> > > thrashing delta cache.
> >
> > I don't think you spent all that much time writing the objects. That part
> > isn't very intensive, it's mostly about the IO.
> >
> > I suspect you may simply be dominated by memory-throughput issues. The
> > delta matching doesn't cache all that well, and using two or more cores
> > isn't going to help all that much if they are largely waiting for memory
> > (and quite possibly also perhaps fighting each other for a shared cache?
> > Is this a Core 2 with the shared L2?)
>
> When I lasted looked at the code, the problem was in evenly dividing
> the work. I was using a four core machine and most of the time one
> core would end up with 3-5x the work of the lightest loaded core.
> Setting pack.threads up to 20 fixed the problem. With a high number of
> threads I was able to get a 4hr pack to finished in something like
> 1:15.
>
> A scheme where each core could work a minute without communicating to
> the other cores would be best. It would also be more efficient if the
> cores could avoid having sync points between them.
>
> --
> Jon Smirl
> jonsmirl@gmail.com

For multicores CPUs, don't divide the work in threads.
To divide the work in processes!

Tips, tricks and hacks: to use fork, exec, pipes and another IPC mechanisms like
mutexes, shared memory's IPC, file locks, pipes, semaphores, RPCs, sockets, etc.
to access concurrently and parallely to the filelocked database.

For Intel Quad Core e.g., x4 cores, it need a parent process and 4
child processes
linked to the parent with pipes.

The parent process can be
* no-threaded using select/epoll/libevent
* threaded using Pth (GNU Portable Threads), NPTL (from RedHat) or whatever.

   J.C.Pizarro

^ permalink raw reply

* Re: [PATCH 3/2] core.whitespace: documentation updates.
From: J. Bruce Fields @ 2007-12-06 19:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vodd4fb2f.fsf@gitster.siamese.dyndns.org>

On Thu, Dec 06, 2007 at 01:04:56AM -0800, Junio C Hamano wrote:
> "J. Bruce Fields" <bfields@fieldses.org> writes:
> 
> > On Sat, Nov 24, 2007 at 01:42:46PM -0800, Junio C Hamano wrote:
> >> "J. Bruce Fields" <bfields@fieldses.org> writes:
> >> 
> >> > I'd still prefer this to be a gitattributes thing rather than a config
> >> > variable[1].  Last time I raised this you said something to the effect
> >> > of "I think you're right, let's fix that before it's merged."  Would you
> >> > like me to work on that?
> >> 
> >> Ah, I forgot about that, and you are right.  Go wild.
> >
> > OK, I will go wild, but... very slowly.
> 
> How wild are you these days ;-)?  I know December is a busy time for
> everybody, and I ended up doing this myself while I was writing up the
> API documentation for gitattributes.

Wow, thanks!  Yes, I haven't done a thing on this.

> -- >8 --
> [PATCH] Use gitattributes to define per-path whitespace rule
> 
> The `core.whitespace` configuration variable allows you to define what
> `diff` and `apply` should consider whitespace errors for all paths in
> the project (See gitlink:git-config[1]).  This attribute gives you finer
> control per path.

That looks like what I'd hoped for.

I'll set aside some time this weekend to play around with it.  (Is there
any other piece left that needs to be done?)

--b.

> 
> For example, if you have these in the .gitattributes:
> 
>     frotz   whitespace
>     nitfol  -whitespace
>     xyzzy   whitespace=-trailing
> 
> all types of whitespace problems known to git are noticed in path 'frotz'
> (i.e. diff shows them in diff.whitespace color, and apply warns about
> them), no whitespace problem is noticed in path 'nitfol', and the
> default types of whitespace problems except "trailing whitespace" are
> noticed for path 'xyzzy'.  A project with mixed Python and C might want
> to have:
> 
>     *.c    whitespace
>     *.py   whitespace=-indent-with-non-tab
> 
> in its toplevel .gitattributes file.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/gitattributes.txt |   31 +++++++++++++
>  Makefile                        |    2 +-
>  builtin-apply.c                 |   36 +++++++++------
>  cache.h                         |    4 +-
>  config.c                        |   50 +--------------------
>  diff.c                          |   19 +++++---
>  environment.c                   |    2 +-
>  t/t4019-diff-wserror.sh         |   47 +++++++++++++++++++
>  t/t4124-apply-ws-rule.sh        |   20 ++++++++-
>  ws.c                            |   96 +++++++++++++++++++++++++++++++++++++++
>  10 files changed, 233 insertions(+), 74 deletions(-)
>  create mode 100644 ws.c
> 
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 20cf8ff..c4bcbb9 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -360,6 +360,37 @@ When left unspecified, the driver itself is used for both
>  internal merge and the final merge.
>  
>  
> +Checking whitespace errors
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +`whitespace`
> +^^^^^^^^^^^^
> +
> +The `core.whitespace` configuration variable allows you to define what
> +`diff` and `apply` should consider whitespace errors for all paths in
> +the project (See gitlink:git-config[1]).  This attribute gives you finer
> +control per path.
> +
> +Set::
> +
> +	Notice all types of potential whitespace errors known to git.
> +
> +Unset::
> +
> +	Do not notice anything as error.
> +
> +Unspecified::
> +
> +	Use the value of `core.whitespace` configuration variable to
> +	decide what to notice as error.
> +
> +String::
> +
> +	Specify a comma separate list of common whitespace problems to
> +	notice in the same format as `core.whitespace` configuration
> +	variable.
> +
> +
>  EXAMPLE
>  -------
>  
> diff --git a/Makefile b/Makefile
> index 042f79e..ac6b079 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -312,7 +312,7 @@ LIB_OBJS = \
>  	alloc.o merge-file.o path-list.o help.o unpack-trees.o $(DIFF_OBJS) \
>  	color.o wt-status.o archive-zip.o archive-tar.o shallow.o utf8.o \
>  	convert.o attr.o decorate.o progress.o mailmap.o symlinks.o remote.o \
> -	transport.o bundle.o walker.o parse-options.o
> +	transport.o bundle.o walker.o parse-options.o ws.o
>  
>  BUILTIN_OBJS = \
>  	builtin-add.o \
> diff --git a/builtin-apply.c b/builtin-apply.c
> index 57efcd5..ee3ef60 100644
> --- a/builtin-apply.c
> +++ b/builtin-apply.c
> @@ -144,6 +144,7 @@ struct patch {
>  	unsigned int old_mode, new_mode;
>  	int is_new, is_delete;	/* -1 = unknown, 0 = false, 1 = true */
>  	int rejected;
> +	unsigned ws_rule;
>  	unsigned long deflate_origlen;
>  	int lines_added, lines_deleted;
>  	int score;
> @@ -898,7 +899,7 @@ static int find_header(char *line, unsigned long size, int *hdrsize, struct patc
>  	return -1;
>  }
>  
> -static void check_whitespace(const char *line, int len)
> +static void check_whitespace(const char *line, int len, unsigned ws_rule)
>  {
>  	const char *err = "Adds trailing whitespace";
>  	int seen_space = 0;
> @@ -910,14 +911,14 @@ static void check_whitespace(const char *line, int len)
>  	 * this function.  That is, an addition of an empty line would
>  	 * check the '+' here.  Sneaky...
>  	 */
> -	if ((whitespace_rule & WS_TRAILING_SPACE) && isspace(line[len-2]))
> +	if ((ws_rule & WS_TRAILING_SPACE) && isspace(line[len-2]))
>  		goto error;
>  
>  	/*
>  	 * Make sure that there is no space followed by a tab in
>  	 * indentation.
>  	 */
> -	if (whitespace_rule & WS_SPACE_BEFORE_TAB) {
> +	if (ws_rule & WS_SPACE_BEFORE_TAB) {
>  		err = "Space in indent is followed by a tab";
>  		for (i = 1; i < len; i++) {
>  			if (line[i] == '\t') {
> @@ -935,7 +936,7 @@ static void check_whitespace(const char *line, int len)
>  	 * Make sure that the indentation does not contain more than
>  	 * 8 spaces.
>  	 */
> -	if ((whitespace_rule & WS_INDENT_WITH_NON_TAB) &&
> +	if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
>  	    (8 < len) && !strncmp("+        ", line, 9)) {
>  		err = "Indent more than 8 places with spaces";
>  		goto error;
> @@ -1001,7 +1002,7 @@ static int parse_fragment(char *line, unsigned long size,
>  		case '-':
>  			if (apply_in_reverse &&
>  			    ws_error_action != nowarn_ws_error)
> -				check_whitespace(line, len);
> +				check_whitespace(line, len, patch->ws_rule);
>  			deleted++;
>  			oldlines--;
>  			trailing = 0;
> @@ -1009,7 +1010,7 @@ static int parse_fragment(char *line, unsigned long size,
>  		case '+':
>  			if (!apply_in_reverse &&
>  			    ws_error_action != nowarn_ws_error)
> -				check_whitespace(line, len);
> +				check_whitespace(line, len, patch->ws_rule);
>  			added++;
>  			newlines--;
>  			trailing = 0;
> @@ -1318,6 +1319,10 @@ static int parse_chunk(char *buffer, unsigned long size, struct patch *patch)
>  	if (offset < 0)
>  		return offset;
>  
> +	patch->ws_rule = whitespace_rule(patch->new_name
> +					 ? patch->new_name
> +					 : patch->old_name);
> +
>  	patchsize = parse_single_patch(buffer + offset + hdrsize,
>  				       size - offset - hdrsize, patch);
>  
> @@ -1568,7 +1573,8 @@ static void remove_last_line(const char **rbuf, int *rsize)
>  	*rsize = offset + 1;
>  }
>  
> -static int apply_line(char *output, const char *patch, int plen)
> +static int apply_line(char *output, const char *patch, int plen,
> +		      unsigned ws_rule)
>  {
>  	/*
>  	 * plen is number of bytes to be copied from patch,
> @@ -1593,7 +1599,7 @@ static int apply_line(char *output, const char *patch, int plen)
>  	/*
>  	 * Strip trailing whitespace
>  	 */
> -	if ((whitespace_rule & WS_TRAILING_SPACE) &&
> +	if ((ws_rule & WS_TRAILING_SPACE) &&
>  	    (1 < plen && isspace(patch[plen-1]))) {
>  		if (patch[plen] == '\n')
>  			add_nl_to_tail = 1;
> @@ -1610,12 +1616,12 @@ static int apply_line(char *output, const char *patch, int plen)
>  		char ch = patch[i];
>  		if (ch == '\t') {
>  			last_tab_in_indent = i;
> -			if ((whitespace_rule & WS_SPACE_BEFORE_TAB) &&
> +			if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
>  			    0 <= last_space_in_indent)
>  			    need_fix_leading_space = 1;
>  		} else if (ch == ' ') {
>  			last_space_in_indent = i;
> -			if ((whitespace_rule & WS_INDENT_WITH_NON_TAB) &&
> +			if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
>  			    last_tab_in_indent < 0 &&
>  			    8 <= i)
>  				need_fix_leading_space = 1;
> @@ -1629,7 +1635,7 @@ static int apply_line(char *output, const char *patch, int plen)
>  		int consecutive_spaces = 0;
>  		int last = last_tab_in_indent + 1;
>  
> -		if (whitespace_rule & WS_INDENT_WITH_NON_TAB) {
> +		if (ws_rule & WS_INDENT_WITH_NON_TAB) {
>  			/* have "last" point at one past the indent */
>  			if (last_tab_in_indent < last_space_in_indent)
>  				last = last_space_in_indent + 1;
> @@ -1671,7 +1677,7 @@ static int apply_line(char *output, const char *patch, int plen)
>  }
>  
>  static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
> -			      int inaccurate_eof)
> +			      int inaccurate_eof, unsigned ws_rule)
>  {
>  	int match_beginning, match_end;
>  	const char *patch = frag->patch;
> @@ -1730,7 +1736,7 @@ static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
>  		case '+':
>  			if (first != '+' || !no_add) {
>  				int added = apply_line(new + newsize, patch,
> -						       plen);
> +						       plen, ws_rule);
>  				newsize += added;
>  				if (first == '+' &&
>  				    added == 1 && new[newsize-1] == '\n')
> @@ -1953,12 +1959,14 @@ static int apply_fragments(struct strbuf *buf, struct patch *patch)
>  {
>  	struct fragment *frag = patch->fragments;
>  	const char *name = patch->old_name ? patch->old_name : patch->new_name;
> +	unsigned ws_rule = patch->ws_rule;
> +	unsigned inaccurate_eof = patch->inaccurate_eof;
>  
>  	if (patch->is_binary)
>  		return apply_binary(buf, patch);
>  
>  	while (frag) {
> -		if (apply_one_fragment(buf, frag, patch->inaccurate_eof)) {
> +		if (apply_one_fragment(buf, frag, inaccurate_eof, ws_rule)) {
>  			error("patch failed: %s:%ld", name, frag->oldpos);
>  			if (!apply_with_reject)
>  				return -1;
> diff --git a/cache.h b/cache.h
> index 3f42827..9cc6268 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -610,6 +610,8 @@ void shift_tree(const unsigned char *, const unsigned char *, unsigned char *, i
>  #define WS_SPACE_BEFORE_TAB	02
>  #define WS_INDENT_WITH_NON_TAB	04
>  #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB)
> -extern unsigned whitespace_rule;
> +extern unsigned whitespace_rule_cfg;
> +extern unsigned whitespace_rule(const char *);
> +extern unsigned parse_whitespace_rule(const char *);
>  
>  #endif /* CACHE_H */
> diff --git a/config.c b/config.c
> index d5b9766..2500e0d 100644
> --- a/config.c
> +++ b/config.c
> @@ -246,54 +246,6 @@ static unsigned long get_unit_factor(const char *end)
>  	die("unknown unit: '%s'", end);
>  }
>  
> -static struct whitespace_rule {
> -	const char *rule_name;
> -	unsigned rule_bits;
> -} whitespace_rule_names[] = {
> -	{ "trailing-space", WS_TRAILING_SPACE },
> -	{ "space-before-tab", WS_SPACE_BEFORE_TAB },
> -	{ "indent-with-non-tab", WS_INDENT_WITH_NON_TAB },
> -};
> -
> -static unsigned parse_whitespace_rule(const char *string)
> -{
> -	unsigned rule = WS_DEFAULT_RULE;
> -
> -	while (string) {
> -		int i;
> -		size_t len;
> -		const char *ep;
> -		int negated = 0;
> -
> -		string = string + strspn(string, ", \t\n\r");
> -		ep = strchr(string, ',');
> -		if (!ep)
> -			len = strlen(string);
> -		else
> -			len = ep - string;
> -
> -		if (*string == '-') {
> -			negated = 1;
> -			string++;
> -			len--;
> -		}
> -		if (!len)
> -			break;
> -		for (i = 0; i < ARRAY_SIZE(whitespace_rule_names); i++) {
> -			if (strncmp(whitespace_rule_names[i].rule_name,
> -				    string, len))
> -				continue;
> -			if (negated)
> -				rule &= ~whitespace_rule_names[i].rule_bits;
> -			else
> -				rule |= whitespace_rule_names[i].rule_bits;
> -			break;
> -		}
> -		string = ep;
> -	}
> -	return rule;
> -}
> -
>  int git_parse_long(const char *value, long *ret)
>  {
>  	if (value && *value) {
> @@ -480,7 +432,7 @@ int git_default_config(const char *var, const char *value)
>  	}
>  
>  	if (!strcmp(var, "core.whitespace")) {
> -		whitespace_rule = parse_whitespace_rule(value);
> +		whitespace_rule_cfg = parse_whitespace_rule(value);
>  		return 0;
>  	}
>  
> diff --git a/diff.c b/diff.c
> index 6bb902f..c3a1942 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -454,6 +454,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
>  struct emit_callback {
>  	struct xdiff_emit_state xm;
>  	int nparents, color_diff;
> +	unsigned ws_rule;
>  	const char **label_path;
>  	struct diff_words_data *diff_words;
>  	int *found_changesp;
> @@ -493,8 +494,8 @@ static void emit_line(const char *set, const char *reset, const char *line, int
>  }
>  
>  static void emit_line_with_ws(int nparents,
> -		const char *set, const char *reset, const char *ws,
> -		const char *line, int len)
> +			      const char *set, const char *reset, const char *ws,
> +			      const char *line, int len, unsigned ws_rule)
>  {
>  	int col0 = nparents;
>  	int last_tab_in_indent = -1;
> @@ -511,7 +512,7 @@ static void emit_line_with_ws(int nparents,
>  	for (i = col0; i < len; i++) {
>  		if (line[i] == '\t') {
>  			last_tab_in_indent = i;
> -			if ((whitespace_rule & WS_SPACE_BEFORE_TAB) &&
> +			if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
>  			    0 <= last_space_in_indent)
>  				need_highlight_leading_space = 1;
>  		}
> @@ -520,7 +521,7 @@ static void emit_line_with_ws(int nparents,
>  		else
>  			break;
>  	}
> -	if ((whitespace_rule & WS_INDENT_WITH_NON_TAB) &&
> +	if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
>  	    0 <= last_space_in_indent &&
>  	    last_tab_in_indent < 0 &&
>  	    8 <= (i - col0)) {
> @@ -551,7 +552,7 @@ static void emit_line_with_ws(int nparents,
>  	tail = len - 1;
>  	if (line[tail] == '\n' && i < tail)
>  		tail--;
> -	if (whitespace_rule & WS_TRAILING_SPACE) {
> +	if (ws_rule & WS_TRAILING_SPACE) {
>  		while (i < tail) {
>  			if (!isspace(line[tail]))
>  				break;
> @@ -578,7 +579,7 @@ static void emit_add_line(const char *reset, struct emit_callback *ecbdata, cons
>  		emit_line(set, reset, line, len);
>  	else
>  		emit_line_with_ws(ecbdata->nparents, set, reset, ws,
> -				line, len);
> +				  line, len, ecbdata->ws_rule);
>  }
>  
>  static void fn_out_consume(void *priv, char *line, unsigned long len)
> @@ -994,6 +995,7 @@ struct checkdiff_t {
>  	struct xdiff_emit_state xm;
>  	const char *filename;
>  	int lineno, color_diff;
> +	unsigned ws_rule;
>  };
>  
>  static void checkdiff_consume(void *priv, char *line, unsigned long len)
> @@ -1029,7 +1031,8 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
>  			if (white_space_at_end)
>  				printf("white space at end");
>  			printf(":%s ", reset);
> -			emit_line_with_ws(1, set, reset, ws, line, len);
> +			emit_line_with_ws(1, set, reset, ws, line, len,
> +					  data->ws_rule);
>  		}
>  
>  		data->lineno++;
> @@ -1330,6 +1333,7 @@ static void builtin_diff(const char *name_a,
>  		ecbdata.label_path = lbl;
>  		ecbdata.color_diff = o->color_diff;
>  		ecbdata.found_changesp = &o->found_changes;
> +		ecbdata.ws_rule = whitespace_rule(name_b ? name_b : name_a);
>  		xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
>  		xecfg.ctxlen = o->context;
>  		xecfg.flags = XDL_EMIT_FUNCNAMES;
> @@ -1423,6 +1427,7 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
>  	data.filename = name_b ? name_b : name_a;
>  	data.lineno = 0;
>  	data.color_diff = o->color_diff;
> +	data.ws_rule = whitespace_rule(data.filename);
>  
>  	if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
>  		die("unable to read files to diff");
> diff --git a/environment.c b/environment.c
> index 624dd96..2fbbc8e 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -35,7 +35,7 @@ int pager_in_use;
>  int pager_use_color = 1;
>  char *editor_program;
>  int auto_crlf = 0;	/* 1: both ways, -1: only when adding git objects */
> -unsigned whitespace_rule = WS_DEFAULT_RULE;
> +unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
>  
>  /* This is set by setup_git_dir_gently() and/or git_default_config() */
>  char *git_work_tree_cfg;
> diff --git a/t/t4019-diff-wserror.sh b/t/t4019-diff-wserror.sh
> index dbc895b..67e080b 100755
> --- a/t/t4019-diff-wserror.sh
> +++ b/t/t4019-diff-wserror.sh
> @@ -45,8 +45,24 @@ test_expect_success 'without -trail' '
>  
>  '
>  
> +test_expect_success 'without -trail (attribute)' '
> +
> +	git config --unset core.whitespace
> +	echo "F whitespace=-trail" >.gitattributes
> +	git diff --color >output
> +	grep "$blue_grep" output >error
> +	grep -v "$blue_grep" output >normal
> +
> +	grep Eight normal >/dev/null &&
> +	grep HT error >/dev/null &&
> +	grep With normal >/dev/null &&
> +	grep No normal >/dev/null
> +
> +'
> +
>  test_expect_success 'without -space' '
>  
> +	rm -f .gitattributes
>  	git config core.whitespace -space
>  	git diff --color >output
>  	grep "$blue_grep" output >error
> @@ -59,8 +75,24 @@ test_expect_success 'without -space' '
>  
>  '
>  
> +test_expect_success 'without -space (attribute)' '
> +
> +	git config --unset core.whitespace
> +	echo "F whitespace=-space" >.gitattributes
> +	git diff --color >output
> +	grep "$blue_grep" output >error
> +	grep -v "$blue_grep" output >normal
> +
> +	grep Eight normal >/dev/null &&
> +	grep HT normal >/dev/null &&
> +	grep With error >/dev/null &&
> +	grep No normal >/dev/null
> +
> +'
> +
>  test_expect_success 'with indent-non-tab only' '
>  
> +	rm -f .gitattributes
>  	git config core.whitespace indent,-trailing,-space
>  	git diff --color >output
>  	grep "$blue_grep" output >error
> @@ -73,4 +105,19 @@ test_expect_success 'with indent-non-tab only' '
>  
>  '
>  
> +test_expect_success 'with indent-non-tab only (attribute)' '
> +
> +	git config --unset core.whitespace
> +	echo "F whitespace=indent,-trailing,-space" >.gitattributes
> +	git diff --color >output
> +	grep "$blue_grep" output >error
> +	grep -v "$blue_grep" output >normal
> +
> +	grep Eight error >/dev/null &&
> +	grep HT normal >/dev/null &&
> +	grep With normal >/dev/null &&
> +	grep No normal >/dev/null
> +
> +'
> +
>  test_done
> diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh
> index f53ac46..85f3da2 100755
> --- a/t/t4124-apply-ws-rule.sh
> +++ b/t/t4124-apply-ws-rule.sh
> @@ -112,6 +112,15 @@ test_expect_success 'whitespace=error-all, no rule' '
>  
>  '
>  
> +test_expect_success 'whitespace=error-all, no rule (attribute)' '
> +
> +	git config --unset core.whitespace &&
> +	echo "target -whitespace" >.gitattributes &&
> +	apply_patch --whitespace=error-all &&
> +	diff file target
> +
> +'
> +
>  for t in - ''
>  do
>  	case "$t" in '') tt='!' ;; *) tt= ;; esac
> @@ -121,11 +130,20 @@ do
>  		for i in - ''
>  		do
>  			case "$i" in '') ti='#' ;; *) ti= ;; esac
> -			rule=${t}trailing,${s}space,${i}indent &&
> +			rule=${t}trailing,${s}space,${i}indent
> +
> +			rm -f .gitattributes
>  			test_expect_success "rule=$rule" '
>  				git config core.whitespace "$rule" &&
>  				test_fix "$tt$ts$ti"
>  			'
> +
> +			test_expect_success "rule=$rule (attributes)" '
> +				git config --unset core.whitespace &&
> +				echo "target whitespace=$rule" >.gitattributes &&
> +				test_fix "$tt$ts$ti"
> +			'
> +
>  		done
>  	done
>  done
> diff --git a/ws.c b/ws.c
> new file mode 100644
> index 0000000..52c10ca
> --- /dev/null
> +++ b/ws.c
> @@ -0,0 +1,96 @@
> +/*
> + * Whitespace rules
> + *
> + * Copyright (c) 2007 Junio C Hamano
> + */
> +
> +#include "cache.h"
> +#include "attr.h"
> +
> +static struct whitespace_rule {
> +	const char *rule_name;
> +	unsigned rule_bits;
> +} whitespace_rule_names[] = {
> +	{ "trailing-space", WS_TRAILING_SPACE },
> +	{ "space-before-tab", WS_SPACE_BEFORE_TAB },
> +	{ "indent-with-non-tab", WS_INDENT_WITH_NON_TAB },
> +};
> +
> +unsigned parse_whitespace_rule(const char *string)
> +{
> +	unsigned rule = WS_DEFAULT_RULE;
> +
> +	while (string) {
> +		int i;
> +		size_t len;
> +		const char *ep;
> +		int negated = 0;
> +
> +		string = string + strspn(string, ", \t\n\r");
> +		ep = strchr(string, ',');
> +		if (!ep)
> +			len = strlen(string);
> +		else
> +			len = ep - string;
> +
> +		if (*string == '-') {
> +			negated = 1;
> +			string++;
> +			len--;
> +		}
> +		if (!len)
> +			break;
> +		for (i = 0; i < ARRAY_SIZE(whitespace_rule_names); i++) {
> +			if (strncmp(whitespace_rule_names[i].rule_name,
> +				    string, len))
> +				continue;
> +			if (negated)
> +				rule &= ~whitespace_rule_names[i].rule_bits;
> +			else
> +				rule |= whitespace_rule_names[i].rule_bits;
> +			break;
> +		}
> +		string = ep;
> +	}
> +	return rule;
> +}
> +
> +static void setup_whitespace_attr_check(struct git_attr_check *check)
> +{
> +	static struct git_attr *attr_whitespace;
> +
> +	if (!attr_whitespace)
> +		attr_whitespace = git_attr("whitespace", 10);
> +	check[0].attr = attr_whitespace;
> +}
> +
> +unsigned whitespace_rule(const char *pathname)
> +{
> +	struct git_attr_check attr_whitespace_rule;
> +
> +	setup_whitespace_attr_check(&attr_whitespace_rule);
> +	if (!git_checkattr(pathname, 1, &attr_whitespace_rule)) {
> +		const char *value;
> +
> +		value = attr_whitespace_rule.value;
> +		if (ATTR_TRUE(value)) {
> +			/* true (whitespace) */
> +			unsigned all_rule = 0;
> +			int i;
> +			for (i = 0; i < ARRAY_SIZE(whitespace_rule_names); i++)
> +				all_rule |= whitespace_rule_names[i].rule_bits;
> +			return all_rule;
> +		} else if (ATTR_FALSE(value)) {
> +			/* false (-whitespace) */
> +			return 0;
> +		} else if (ATTR_UNSET(value)) {
> +			/* reset to default (!whitespace) */
> +			return whitespace_rule_cfg;
> +		} else {
> +			/* string */
> +			return parse_whitespace_rule(value);
> +		}
> +	} else {
> +		return whitespace_rule_cfg;
> +	}
> +}
> -- 
> 1.5.3.7-2155-g4c25
> 

^ permalink raw reply

* Re: [BUG/RFC git-gui] password for push/pull in case of git+ssh://repo
From: Jeff King @ 2007-12-06 19:18 UTC (permalink / raw)
  To: Ivo Alxneit; +Cc: git
In-Reply-To: <1196951517.3294.24.camel@localhost.localdomain>

On Thu, Dec 06, 2007 at 03:31:57PM +0100, Ivo Alxneit wrote:

> when i use git-gui (0.9.0) to push/pull to/from a git+ssh://repo i have
> to supply my password to ssh. i get the password prompt from ssh on the
> controlling shell. as i often use several shells and git-gui might run
> in the background it is rather bothering to find the correct shell where
> ssh expects the password. could this be changed (in a safe way) in
> git-gui e.g. like pinentry pops up a window when gpg is used to sign
> emails?

IIRC, ssh will run the 'ssh-askpass' program if DISPLAY is set and there
is no tty available, so git-gui should be able to set that up (or you
probably can yourself by simply backgrounding git-gui and closing the
terminal).

-Peff

^ permalink raw reply

* Re: Git and GCC
From: Jon Loeliger @ 2007-12-06 19:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Berlin, David Miller, ismail, gcc, Git List
In-Reply-To: <alpine.LFD.0.9999.0712052132450.13796@woody.linux-foundation.org>

On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote:

> Git also does delta-chains, but it does them a lot more "loosely". There 
> is no fixed entity. Delta's are generated against any random other version 
> that git deems to be a good delta candidate (with various fairly 
> successful heursitics), and there are absolutely no hard grouping rules.

I'd like to learn more about that.  Can someone point me to
either more documentation on it?  In the absence of that,
perhaps a pointer to the source code that implements it?

I guess one question I posit is, would it be more accurate
to think of this as a "delta net" in a weighted graph rather
than a "delta chain"?

Thanks,
jdl

^ permalink raw reply

* Re: [PATCH/RFC] autoconf: Add test for OLD_ICONV
From: Wincent Colaiuta @ 2007-12-06 19:07 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, Junio C Hamano, Ramsay Jones, Arjen Laarhoven,
	Brian Gernhardt
In-Reply-To: <200712052219.00930.jnareb@gmail.com>

El 5/12/2007, a las 22:19, Jakub Narebski escribió:

> Ahhh... now I understand. You have installed new iconv() on your
> computer, and generic 'uname -s' (OS name) based guessing in Makefile
> guesses wrongly that you need OLD_ICONV, while ./configure script
> actually tests it and correctly decides to unset OLD_ICONV !

No, nothing installed. It's that Mac OS X 10.5 (Leopard)/Darwin 9  
comes with a newer version of iconv. So while there was no warning  
prior to Leopard there is one now, which your patch fixes for those  
using configure.

This came up before here:

<http://marc.info/?l=git&m=119397507006026&w=2>

Still, I'd like to fix the warning for those who don't use configure.  
Have posted a separate patch for that just now.

Cheers,
>

Wincent

^ permalink raw reply

* [PATCH] Silence iconv warnings on Leopard
From: Wincent Colaiuta @ 2007-12-06 19:07 UTC (permalink / raw)
  To: git; +Cc: jnareb, blaker, Wincent Colaiuta

Apple ships a newer version of iconv with Leopard (Mac OS X 10.5/Darwin
9). Ensure that OLD_ICONV is not set on any version of Darwin in the
9.x series; this should be good for at least a couple of years, when
Darwin 10 comes out and we can invert the sense of the test to
specifically check for Darwin 7 or 8.

A more sophisticated and robust check is possible for those who use
autoconf, but not everybody does that.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
---
 Makefile |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 999391e..4dda340 100644
--- a/Makefile
+++ b/Makefile
@@ -406,7 +406,9 @@ endif
 ifeq ($(uname_S),Darwin)
 	NEEDS_SSL_WITH_CRYPTO = YesPlease
 	NEEDS_LIBICONV = YesPlease
-	OLD_ICONV = UnfortunatelyYes
+	ifneq ($(shell expr "$(uname_R)" : '9\.'),2)
+		OLD_ICONV = UnfortunatelyYes
+	endif
 	NO_STRLCPY = YesPlease
 	NO_MEMMEM = YesPlease
 endif
-- 
1.5.3.7.1067.gaa51

^ permalink raw reply related

* Re: Git and GCC
From: Nicolas Pitre @ 2007-12-06 19:08 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Linus Torvalds, Jeff King, Daniel Berlin, Harvey Harrison,
	David Miller, ismail, gcc, git
In-Reply-To: <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com>

On Thu, 6 Dec 2007, Jon Smirl wrote:

> On 12/6/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> >
> > On Thu, 6 Dec 2007, Jeff King wrote:
> > >
> > > What is really disappointing is that we saved only about 20% of the
> > > time. I didn't sit around watching the stages, but my guess is that we
> > > spent a long time in the single threaded "writing objects" stage with a
> > > thrashing delta cache.
> >
> > I don't think you spent all that much time writing the objects. That part
> > isn't very intensive, it's mostly about the IO.
> >
> > I suspect you may simply be dominated by memory-throughput issues. The
> > delta matching doesn't cache all that well, and using two or more cores
> > isn't going to help all that much if they are largely waiting for memory
> > (and quite possibly also perhaps fighting each other for a shared cache?
> > Is this a Core 2 with the shared L2?)
> 
> When I lasted looked at the code, the problem was in evenly dividing
> the work. I was using a four core machine and most of the time one
> core would end up with 3-5x the work of the lightest loaded core.
> Setting pack.threads up to 20 fixed the problem. With a high number of
> threads I was able to get a 4hr pack to finished in something like
> 1:15.

But as far as I know you didn't try my latest incarnation which has been
available in Git's master branch for a few months already.


Nicolas

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox