Git development


* Re: Following renames
From: Jakub Narebski @ 2006-03-26 17:10 UTC
  To: git
In-Reply-To: <Pine.LNX.4.64.0603260843250.15714@g5.osdl.org>

Linus Torvalds wrote:

> On Sun, 26 Mar 2006, Jakub Narebski wrote:
>>
>> I wonder what is the most common case in Linux kernel or git.
>> 
>> 1.) renaming the file in the same directory, old-file.c to new-file.c?
>> 2.) moving file to other directory (project reorganization), 
>>     old-dir/file.c to new-dir/file.c?
> The kernel uses subdirectories extensively, and a lot of renames (most of
> them, I'd say) is because of that subdirectory structure.
> 
> So the same-directory case is the unusual one, I'd say.

If (2) is common enough, then the discussed improvement to rename detection,
namely comparing basenames as a basis for candidate selection, is a good idea.
I wonder how common (2) is compared to (1)+(2), i.e. move to another directory
and rename, old-dir/old-file.c to new-dir/new-subdir/new-file.c.
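
As a rough illustration of what "basenames as a basis for candidate selection" could mean, here is a minimal Python sketch. It is not how git's rename detection works; the function name and the rule of only pairing unambiguous matches are assumptions made for the example.

```python
import posixpath
from collections import defaultdict


def basename_candidates(deleted, added):
    """Pair deleted and added paths that share a basename.

    Returns (pairs, leftover_deleted, leftover_added). Only unambiguous
    matches (one deleted and one added path per basename) are paired.
    """
    by_name_del = defaultdict(list)
    by_name_add = defaultdict(list)
    for path in deleted:
        by_name_del[posixpath.basename(path)].append(path)
    for path in added:
        by_name_add[posixpath.basename(path)].append(path)

    pairs = []
    for name, olds in by_name_del.items():
        news = by_name_add.get(name, [])
        if len(olds) == 1 and len(news) == 1:
            pairs.append((olds[0], news[0]))

    paired_old = {old for old, _ in pairs}
    paired_new = {new for _, new in pairs}
    leftover_deleted = [p for p in deleted if p not in paired_old]
    leftover_added = [p for p in added if p not in paired_new]
    return pairs, leftover_deleted, leftover_added
```

Case (2) is paired directly by such a rule, while the combined case (1)+(2) ends up in the leftovers and would still need a content comparison.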

>> 3.) splitting file into modules, huge-file.c to file1.c, file2.c?
>> 4.) copying fragment of one file to other?
>> 5.) moving fragment of code from one file to other?
> 
> I'd say that (5) is very common. And (4) happens a lot under certain
> circumstances (new driver, new architecture, new filesystem..).
> 
> Doing (3) happens, but probably less often than it should ;/

Detecting (4) and (5) fast (i.e. for merges) without auxiliary (helper)
information would probably be hard. For interrogation/porcelain-ish commands
(like pickaxe) it would probably be easier.

-- 
Jakub Narebski
Warsaw, Poland


* minimum set of utils for git 'server'?
From: Paul Jakma @ 2006-03-26 17:13 UTC
  To: git list

Hi,

What is the minimum set of git utilities required for a git 'server'?

 	git-receive-pack
 	git-daemon
 	git-init-db
 	git-repack
 	git-fsck-objects
 	?
 	?

I have an old server, but it lacks recent python (for
merge-recursive), so I'd like to ensure no one accidentally tries to do
actual merging locally. Only the bare minimum needed for a central
git 'server' is desired.

?

regards,
-- 
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
The faster I go, the behinder I get.
 		-- Lewis Carroll


* [ANNOUNCE] Cogito-0.17.1
From: Petr Baudis @ 2006-03-26 17:56 UTC
  To: git; +Cc: linux-kernel

  Hello,

  cogito-0.17.1 was just released, the next release on the latest stable
line of the Cogito user-friendly Git user interface. Note that the
stable stuff is now happening on the v0.17 branch, just like we did in
the 0.16 times. In addition to that, however, auto-built
documentation in the man, txt and html formats is available in the

	http://www.kernel.org/pub/scm/cogito/cogito-doc.git/

repository. It has the same branch structure as the cogito.git
repository and should also have the same tags if my scripts work
properly. Thanks to Junio for the base version of the script maintaining
this repository.

  So, what's new?

  * Fixed several minor relative path related cg-add and cg-status bugs
  * Fixed cg-add -r not readding cg-rm'd files
  * Fixed support for branch names containing slashes

  * cg-admin-rewritehist - the universal history rewriting tool
  * cg-commit --author
  * cg-commit -p alias for cg-commit --review
  * cg-status -S will show the full contents of the untracked
    directories instead of just the directory name
  * $CG_NORC will make Cogito ignore ~/.cgrc
  * https:// URLs are supported now

  * Several small speed-ups (especially --topo-order in cg-mkpatch)
  * Random documentation updates (most notably cg-ref quick reference)
  * The tutorial script updated


  Who did what:

Dennis Stosberg:
      Cogito: Allow https:// URLs

Jonas Fonseca:
      cg-export: document the -r option
      [PATCH 1/4] Simplify wildcards for match files to be ignored
      [PATCH 2/4] Encode the manpage section in the file name
      [PATCH 3/4] Generate PDF documents using docbook2pdf
      [PATCH 4/4] Add quick reference (cg-ref) to the documentation suite
      Fix multi-paragraph list items in OPTIONS section

Pavel Roskin:
      Use Cogito when possible in the "tutorial" test.
      [PATCH 3/3] Allow the tutorial script to be run by "make test"
      [PATCH 1/3] cg-mv doesn't work with bash 3.1.7 due to excessive quotes
      Clean up after failed "git merge" in the tutorial script

Petr Baudis:
      Refer to cg-branch-add in cg-clone docs and clarify stuff
      Add example usage to cg-clone per jbl's request
      Easier cut'n'paste
      --merge-order is too slow, always use --topo-order
      TODO: branches/with/slashes and cg-clone -a
      Add cg-commit --author, consolidate author documentation
      Update for the modern conflicts handling
      Improve cg-switch -r shortdesc
      Expand the git-mv workarounds description
      cg-merge: Do not fast-forward when doing an octopus
      Fix some relpath-related cg-add and cg-status bugs
      Make cg-commit -p synonymous with --review
      TODO: cg-shelf - shelve changes temporarily
      Generalize the tac stub (cg-mkpatch -> cg-Xlib)
      Generalize pick_author() to pick_id()
      cg-admin-rewritehist - history rewriting swiss knife
      Update the example usage
      Hopefully fix cg-admin-rewritehist -r
      Umm, update year in the (c) notice ;)
      Properly support multiple -r arguments
      Make the main cycle more efficient
      Another optimization - retrieve the commit object only once
      Accept subsections inside the OPTIONS section
      Do not load ~/.cgrc if $CG_NORC is set
      Remove bogus information from cg-patch docs
      Properly document cg-commit --signoff=STRING
      cg-admin-rewritehist --parent-filter for rewriting parent string
      cg-admin-rewritehist --commit-filter for omitting commits
      Reference cg-ref(7) from cogito(7)
      A quick docs pointer and Getting help section update
      cg-status -S will turn dirsquashing off
      Fix cg-add -r not readding removed files
      Use the new ref format when resetting the HEAD file
      Fix support for branch names containing slashes


P.S.: See us at #git @ FreeNode!

  Happy hacking,

-- 
				Petr "Pasky the lousy poet" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe.  -- Douglas Adams


* Re: [PATCH] Avoid slowness when timewarping large trees.
From: Petr Baudis @ 2006-03-26 18:05 UTC
  To: git
In-Reply-To: <20060325093957.GA27832@coredump.intra.peff.net>

Dear diary, on Sat, Mar 25, 2006 at 10:39:57AM CET, I got a letter
where Jeff King <peff@peff.net> said that...
> tree_timewarp was calling read, egrep, and rm in an O(N) loop where N is
> the number of changed files between two trees. This caused a bottleneck
> when seeking/switching/merging between trees with many changed files.
> 
> Signed-off-by: Jeff King <peff@peff.net>

Thanks, applied.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Linus Torvalds @ 2006-03-26 18:10 UTC
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e06hts$1ne$1@sea.gmane.org>

On Sun, 26 Mar 2006, Jakub Narebski wrote:
> 
> If (2) is common enough, then the discussed improvement to rename detection,
> namely comparing basenames as a basis for candidate selection, is a good idea.

BK had this "renametool" which got started automatically when you applied 
a patch that removed one or more files and added one or more files, so 
that you could then pair up the files manually.

It left the rename pairing 100% to the user, but it helped a bit by 
guessing what the pairing might be, and yes, it used the basenames to set 
up that initial guess.

It worked in many cases, but it also failed in many cases. I do think it 
was a useful heuristic within the BK model (since the _real_ choice was 
left to the user), but I don't think it's very useful for git.

The thing is, the fast rename detection that is in the "next" branch 
really does a lot better, and it's fast enough.

(If you wanted to make it even faster, but less precise, you could limit 
it to the first few kilobytes of file contents - still a _lot_ better 
heuristic than the actual filename, and it would make the worst-case 
behaviour much better).
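
The "first few kilobytes" variant can be sketched in Python as follows. This is only an illustration: it substitutes `difflib.SequenceMatcher` for git's own similarity measure, and the 4096-byte cutoff, the 0.5 threshold and both function names are assumptions, not values taken from git.

```python
from difflib import SequenceMatcher

PREFIX_BYTES = 4096  # assumed stand-in for "the first few kilobytes"


def prefix_similarity(old: bytes, new: bytes, limit: int = PREFIX_BYTES) -> float:
    """Similarity in [0, 1] computed over at most `limit` leading bytes."""
    a, b = old[:limit], new[:limit]
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_rename_source(new_content, deleted, threshold=0.5):
    """Return the deleted path whose prefix is most similar, or None.

    `deleted` maps each removed path to its old content.
    """
    best_path, best_score = None, threshold
    for path, content in deleted.items():
        score = prefix_similarity(content, new_content)
        if score >= best_score:
            best_path, best_score = path, score
    return best_path
```

Capping the compared prefix bounds the cost per file pair, which is the worst-case improvement mentioned above, at the price of missing files that only diverge or converge after the cutoff.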

> I wonder how common (2) is compared to (1)+(2), i.e. move to another directory
> and rename, old-dir/old-file.c to new-dir/new-subdir/new-file.c.

I don't have any numbers, but from using renametool for a few years, my 
gut feel/recollection is that about half of renames in the kernel were 
moving to a new directory, and about half changed names (often in 
_addition_ to moving). But I didn't much think about it, so that's just a 
very rough guess based on using a tool that helped you do these things 
manually.

For example, one common case was a directory structure like

	..
	type-file1.c
	type-file2.c
	otherfiles.c
	yet-more.c
	..

being split up into a subdirectory

	..
	type/file1.c
	type/file2.c
	otherfiles.c
	yet-more.c
	..

(eg drivers/scsi/aic7xx-* being given a subdirectory of its own, as 
drivers/scsi/aic7xx/*). So the basename wouldn't stay the same, because it 
contained some piece of data that became redundant with the move.

> >> 3.) splitting file into modules, huge-file.c to file1.c, file2.c?
> >> 4.) copying fragment of one file to other?
> >> 5.) moving fragment of code from one file to other?
> > 
> > I'd say that (5) is very common. And (4) happens a lot under certain
> > circumstances (new driver, new architecture, new filesystem..).
> > 
> > Doing (3) happens, but probably less often than it should ;/
> 
> Detecting (4) and (5) fast (i.e. for merges) without auxiliary (helper)
> information would probably be hard. For interrogation/porcelain-ish commands
> (like pickaxe) it would probably be easier.

Yes. I don't think we necessarily want to merge automatically across 
things like that, even if it sounds like something you'd want in a perfect 
world. Stupid and obvious (and fails) is often better than smart and 
complex (and succeeds), because at least you _understand_ what happens. 

But _following_ a particular change back is important, and should be both 
efficient and simple to do. Ie the example tool I talked about in

        http://article.gmane.org/gmane.comp.version-control.git/217

is still relevant and important, I think.

I literally think that people wouldn't even _want_ a "git annotate", if 
they instead had more of a visual tool that showed the current state of 
the file, and you could click on a line/set of lines to follow it back to 
the previous change to that area. I'd argue that almost always when you 
want "annotate", you already have the particular place that you want to 
look at in mind (you're really not interested in the whole file).

So wouldn't it be _much_ nicer to have a "graphical git-whatchanged", 
where you just delve deeper (and you don't even look at the whole file 
like git-whatchanged does, but you ask for a very particular region).

Ie, what I imagine would be something gitk/qgit like, where you see the 
file content, select a line or two (or a whole function), and it goes back 
in history and shows you the last diff that changed that 
line/two/function. We can do that EFFICIENTLY. Much more efficiently than 
git-annotate, in fact. And then when you see the diff, you might say "I'm 
not interested in this one, that was just a re-indent" and then continue 
back. 
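
The "select a few lines and go back to the last diff that changed them" idea can be sketched on a toy model. The Python below assumes the history of one file is already available as a list of versions and uses `difflib` in place of git's diff machinery; the function name and data layout are hypothetical.

```python
from difflib import SequenceMatcher


def last_change_to_region(versions, start, end):
    """Find the newest version that changed lines [start, end) of versions[-1].

    `versions` is a list of file contents (each a list of lines), oldest
    first. Returns the index of that version, or 0 if the region dates
    back to the first version.
    """
    for idx in range(len(versions) - 1, 0, -1):
        old, new = versions[idx - 1], versions[idx]
        opcodes = SequenceMatcher(None, old, new, autojunk=False).get_opcodes()
        for tag, i1, i2, j1, j2 in opcodes:
            if tag == "equal":
                if j1 <= start and end <= j2:
                    # Region is untouched: translate it to old line numbers.
                    shift = i1 - j1
                    start, end = start + shift, end + shift
                    break
            elif j1 < end and j2 > start:
                # Lines in the region were added or replaced, or lines
                # were deleted from strictly inside it (then j1 == j2).
                return idx
        else:
            return idx  # region not contained in any unchanged block
    return 0
```

Only the diffs between consecutive versions are examined and the walk stops at the first hit, so nothing older than the answer is ever looked at. Saying "not interested, that was just a re-indent" would correspond to restarting the walk from the returned version with the translated region.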

THAT is the kind of graphical tool I'd want. And dammit, it should even be 
_easy_. I'm just a total klutz myself when it comes to doing things like
QT or nice tcl/tk text-panes, and this really does have to be visual, 
since the whole point is that "select text" and interactive part.

So if somebody wants to be a hero, and feels comfortable with those kinds 
of things, this really should be a fairly straightforward thing to do (it 
would be useful even without rename detection or data movement detection, 
but it's also something where you really _could_ do efficient data 
movement detection by just looking at the "whole diff" when something 
changed in that small area).

		Linus


* Re: minimum set of utils for git 'server'?
From: Linus Torvalds @ 2006-03-26 18:16 UTC
  To: Paul Jakma; +Cc: git list
In-Reply-To: <Pine.LNX.4.64.0603261804180.5276@sheen.jakma.org>

On Sun, 26 Mar 2006, Paul Jakma wrote:
> 
> What is the minimum set of git utilities required for a git 'server'?
> 
> 	git-receive-pack
> 	git-daemon
> 	git-init-db
> 	git-repack
> 	git-fsck-objects

At least git-upload-pack, git-pack-objects, git-rev-list and 
git-unpack-objects (which are all part of the object receive/send paths), 
and git-update-server-info if you do http.

> I have an old server, but it lacks recent python (for merge-recursive), so I'd
> like to ensure no one accidentally tries to do actual merging locally. Only the
> bare minimum needed for a central git 'server' desired.

You should be able to just try it out. Start out with the above list, and 
see what complains...
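
A small Python sketch of the "start with the list and see what complains" step: it only reports which of the helpers named in this thread cannot be found on PATH. It assumes the helpers are installed as separate git-* executables on PATH, and the function name is made up for the example.

```python
import shutil

# Helper names taken from the two lists in this thread.
SERVER_HELPERS = [
    "git-receive-pack", "git-daemon", "git-init-db", "git-repack",
    "git-fsck-objects", "git-upload-pack", "git-pack-objects",
    "git-rev-list", "git-unpack-objects", "git-update-server-info",
]


def missing_helpers(names=SERVER_HELPERS, which=shutil.which):
    """Return the helpers that cannot be found on PATH, in list order."""
    return [name for name in names if which(name) is None]
```

Calling `missing_helpers()` on the server gives the list of programs still to install; the `which` parameter exists only so the lookup can be replaced in a test.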

		Linus


* Re: Use a *real* built-in diff generator
From: Petr Baudis @ 2006-03-26 18:20 UTC
  To: Ralf Baechle
  Cc: Linus Torvalds, Davide Libenzi, Morten Welinder, Junio C Hamano,
	Git Mailing List
In-Reply-To: <20060326110934.GA3774@linux-mips.org>

Dear diary, on Sun, Mar 26, 2006 at 01:09:35PM CEST, I got a letter
where Ralf Baechle <ralf@linux-mips.org> said that...
> On Sat, Mar 25, 2006 at 10:39:03AM -0800, Linus Torvalds wrote:
> 
> > Besides, I hate how GNU patch bends over backwards in applying crap that 
> > isn't a proper patch at all (whitespace-corruption, you name it: GNU patch 
> > will accept it). Also, I made "git-apply" be all-or-nothing: either it 
> > applies the _whole_ patch (across many different files) or it applies none 
> > of it. With GNU patch, if you get an error on the fifth file, the four 
> > first files have been modified already - aarrgghhh..
> 
> Which is apply's greatest strength - and weakness.  GNU diff doesn't
> understand the file renaming bits of git diffs, so they need to be
> used with apply.  So if a patch doesn't apply?  Apply doesn't even have
> an option to apply things as well as it can and leave the rest in
> reject files.  Yuck.

I've just updated cg-patch on the master branch today to understand file
renames, so it should be possible to use it for applying fuzzy patches.
(OTOH, cg-patch has grown way too complex and ugly for my taste. It'd be
nice if git-apply could take over the ugly part of the task.)

No dice with patches containing copy information, though. We would need
to perform the copy _before_ applying the patch itself and we have no
infrastructure for that (so far it has been enough to do the
git-specific stuff after applying the patch itself).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Petr Baudis @ 2006-03-26 19:14 UTC
  To: Linus Torvalds; +Cc: Ryan Anderson, git
In-Reply-To: <Pine.LNX.4.64.0603260829550.15714@g5.osdl.org>

Dear diary, on Sun, Mar 26, 2006 at 06:33:13PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> If you do
> 
> 	git-rev-list --parents --remove-empty $REV -- $filename
> 
> then you'll get the whole history for that filename. When it ends, you 
> know the file went away, and then you do basically _one_ "where the hell 
> did it go" thing.
> 
> And yes, it's not git-ls-tree (unless you only want to follow pure 
> renames), it's actually one "git-diff-tree -M $lastrev". Then you just 
> continue with the new filename (and do another "git-rev-list" until you 
> hit the next rename).
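
The loop described in the quoted text (walk the history of one name until the file goes away, do one "where did it go" lookup, continue under the old name) can be sketched on a toy model in Python. The sketch assumes a linear history of snapshots and only detects pure renames with identical content, whereas git-diff-tree -M also matches similar content; all names in it are hypothetical.

```python
def follow(history, path):
    """Yield (commit_index, path) for each commit that changed `path`,
    newest first, switching names at pure renames.

    `history` is a linear list of snapshots, oldest first; each snapshot
    maps a path to an opaque content id (a stand-in for a blob hash).
    """
    idx = len(history) - 1
    while idx >= 0:
        snapshot = history[idx]
        parent = history[idx - 1] if idx > 0 else {}
        if snapshot.get(path) != parent.get(path):
            yield idx, path
            if path not in parent:
                # The file "went away": look once for a path that the
                # parent had, this commit dropped, and whose content matches.
                sources = [p for p, blob in parent.items()
                           if p not in snapshot and blob == snapshot.get(path)]
                if not sources:
                    return  # genuinely new file, history ends here
                path = sources[0]
        idx -= 1
```

The rename lookup runs only at the commit where the name first appears, which is the point of the quoted argument: the expensive step happens once per rename, not once per revision.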

I wrote a long rant but then it all suddenly fit together and I have now
an idea how to implement it reasonably elegantly.

So only a bugreport remains:

My current target is to support this tree (letters are filenames,
numbers are commit ids; I'll translate any git output to those digits):

    2    4
    b -- d
1 /        \ 6
a            d
  \ 3    5 /
    c -- d

With the commits created in the numerical order (so log shows
1,2,3,4,5,6, and my target is cg-log d showing the same output). If
anyone wants the sample history, it's at

	http://pasky.or.cz/~xpasky/renametree1.git/

Curiously, git-rev-list does something totally strange when trying to
list per-file history at this point:

	$ git-rev-list HEAD -- d
	4

Huh? (It should list 6, 5, 4 instead.)

I worked around it by recording a change in d in the merge 6:

	http://pasky.or.cz/~xpasky/renametree2.git/

	$ git-rev-list --parents --remove-empty HEAD -- d
	6 4 5
	5
	4

Which is the expected output.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Marco Costalba @ 2006-03-26 19:22 UTC
  To: Linus Torvalds; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0603260947100.15714@g5.osdl.org>

On 3/26/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> So wouldn't it be _much_ nicer to have a "graphical git-whatchanged",
> where you just delve deeper (and you don't even look at the whole file
> like git-whatchanged does, but you ask for a very particular region).
>
> Ie, what I imagine would be something gitk/qgit like, where you see the
> file content, select a line or two (or a whole function), and it goes back
> in history and shows you the last diff that changed that
> line/two/function. We can do that EFFICIENTLY. Much more efficiently than
> git-annotate, in fact. And then when you see the diff, you might say "I'm
> not interested in this one, that was just a re-indent" and then continue
> back.
>
> THAT is the kind of graphical tool I'd want. And dammit, it should even be
> _easy_. I'm just a total klutz myself when it comes to doing things like
> QT or nice tcl/tk text-panes, and this really does have to be visual,
> since the whole point is that "select text" and interactive part.
>
> So if somebody wants to be a hero, and feels comfortable with those kinds
> of things, this really should be a fairly straightforward thing to do (it
> would be useful even without rename detection or data movement detection,
> but it's also something where you really _could_ do efficient data
> movement detection by just looking at the "whole diff" when something
> changed in that small area).
>

I am a thousand miles away from being a hero (and glad of it), but....

I really need a bit of feedback or comment about this because IMHO
qgit annotate is *almost* what you are asking for, so I
need to understand the difference well:

FIRST WAY

After annotating a file history (double click on a file name in
bottom-right window or directly from tree view), you see the whole
file annotated. If you have the diff window open you see also the
corresponding patch (scrolled to selected file name).

Now, double clicking on the chosen code line in file content currently
does two things:

- Diff window is updated to show corresponding revision patch, i.e.
the last patch that modified that line of code.

- File content, as well as file annotation, changes to show the
content of the file just after the patch was applied. From there it is
normally possible to go back in the history of that code region in the
same way, i.e. double clicking on interesting lines.

Biggest limitation of 'annotation browsing' is that 'code removing
only' patches are not annotated and you need to check them  directly
in diff window.

SECOND WAY

Without opening the file viewer it is possible to select a file (or
more than one, or a directory) from tree view and press the magic wand
button. This causes main view to be updated with git-rev-list  --
<selected paths>  content, i.e. a filtered view.

With diff viewer window open you can browse across file patch history
related to chosen file.

Biggest limitation is that all the revisions that touch the file are
shown, not only the ones limited to a selected region.

IF I HAVE UNDERSTOOD...

If I have understood what you would like to see, it is something like the following:

- From diff/file viewer window select a code region.

- Press Magic wand button and feed git-rev-list with <selected path>
_and_  <selected content>

- Show git-rev-list output on main window as usual, but now the selected
revisions are filtered not only by path but also by the region of
code touched.

Am I guessing correctly?

Marco


* Re: Effective difference between git-rebase and git-resolve
From: J. Bruce Fields @ 2006-03-26 20:29 UTC
  To: Junio C Hamano; +Cc: Marc Singer, git
In-Reply-To: <7vacbfxadu.fsf@assigned-by-dhcp.cox.net>

On Fri, Mar 24, 2006 at 11:15:57PM -0800, Junio C Hamano wrote:
>      - Patch C does not apply.  git-am stops here, with conflicts to
>        be resolved in the working tree.  Yet-to-be-applied D and E
>        are still kept in .dotest/ directory at this point.  What the
>        user does is exactly the same as fixing up unapplicable patch
>        when running git-am:
>     
>        - Resolve conflict just like any merge conflicts.
>        - "git am --resolved --3way" to continue applying the patches.

So, does this sum it up accurately for the man page?

--b.

Document git-rebase behavior on conflicts.

---

 Documentation/git-rebase.txt |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

3ef0c8cc7a505f9023a87e7e1ca22251a91bf188
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index b36276c..4a7e67a 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -48,6 +48,18 @@ would be:
              /
     D---E---F---G master
 
+In case of conflict, git-rebase will stop at the first problematic commit
+and leave conflict markers in the tree.  After resolving the conflict manually
+and updating the index with the desired resolution, you can continue the
+rebasing process with
+
+    git am --resolved --3way
+
+Alternatively, you can undo the git-rebase with
+
+    git reset --hard ORIG_HEAD
+    rm -r .dotest
+
 OPTIONS
 -------
 <newbase>::
-- 
1.2.4.g0382


* Re: Following renames
From: Petr Baudis @ 2006-03-26 20:31 UTC
  To: Linus Torvalds; +Cc: Ryan Anderson, git
In-Reply-To: <20060326191445.GQ18185@pasky.or.cz>

Dear diary, on Sun, Mar 26, 2006 at 09:14:45PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Curiously, git-rev-list does something totally strange when trying to
> list per-file history at this point:
> 
> 	$ git-rev-list HEAD -- d
> 	4
> 
> Huh? (It should list 6, 5, 4 instead.)

Obviously not 6 since the file was not changed in that revision, but I'd
still expect it to list 5.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Petr Baudis @ 2006-03-26 21:09 UTC
  To: Ryan Anderson; +Cc: Linus Torvalds, git
In-Reply-To: <44264426.8010608@michonline.com>

Dear diary, on Sun, Mar 26, 2006 at 09:35:02AM CEST, I got a letter
where Ryan Anderson <ryan@michonline.com> said that...
> Linus Torvalds wrote:
> > On Sun, 26 Mar 2006, Petr Baudis wrote:
> >   
> >>   In [1], Linus suggests a non-core solution. Unfortunately, it doesn't
> >> fly - it requires at least two git-ls-tree calls per revision which
> >> would bog things down awfully (to roughly half of the original speed).
> >>     
> >
> > No it doesn't. It requires one git-ls-tree WHEN SOMETHING IS RENAMED.
> >
> > In other words, basically never.
> >   
> 
> A simple example is the first loop in git-annotate.perl.  (Which was
> basically written by Linus, I just translated it from a
> shell/pseudo-code example into Perl)

One case it does not handle:

         2
      -- b --
  1 /         \ 6
  a             d
    \ 3     5 /
      c --- d

git-rev-list at 6 will (understandably) show

        6 5
        5

and you will never detect the d -> b rename leading to 2.

This is one reason why I'm actually not using --parents and pipe stuff
directly to git-diff-tree --stdin -M and then read its output. This also
lets me merge parallel lines of development based on date and I don't
have to fork per each file deletion.

With any luck I'll have the first draft of my (also perlish) script done
this evening yet. (BTW, it has the same output format as

	git-rev-list | git-diff-tree --pretty=raw -M

so with some tweaking it could also serve as a git-whatchanged backend.
Actually, it would be nice to have it in core Git in the long term so
that it gets all the portability tweaks and such.)
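
For readers who want to consume that output format, here is a minimal Python sketch of picking rename records out of raw diff lines (`:<mode> <mode> <sha1> <sha1> <status>` followed by tab-separated paths). It is not the script mentioned above: the function names are invented, and it ignores path quoting and the `::` combined-diff lines shown for merges.

```python
def parse_raw_line(line):
    """Parse one raw-format diff line into a dict, or return None."""
    if not line.startswith(":"):
        return None  # commit headers, indented log text, blank lines
    meta, _, paths = line[1:].rstrip("\n").partition("\t")
    fields = meta.split(" ")
    if len(fields) != 5 or not paths:
        return None
    src_mode, dst_mode, src_sha, dst_sha, status = fields
    names = paths.split("\t")
    score = status[1:]
    return {
        "status": status[0],                      # e.g. M, A, D, R, C
        "score": int(score) if score.isdigit() else None,
        "src": names[0],
        "dst": names[-1],                         # same as src unless R/C
        "src_sha": src_sha,
        "dst_sha": dst_sha,
    }


def renames(lines):
    """Return (old_path, new_path) for every rename record in `lines`."""
    entries = (parse_raw_line(line) for line in lines)
    return [(e["src"], e["dst"]) for e in entries if e and e["status"] == "R"]
```

Feeding the lines of such a stream through `renames()` yields the pairs a follower would use to switch file names while walking back.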

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [PATCH] Optionally do not list empty directories in git-ls-files --others
From: Junio C Hamano @ 2006-03-26 21:32 UTC
  To: Petr Baudis; +Cc: junkio, Jim MacBaine, git
In-Reply-To: <20060326145952.GM18185@pasky.or.cz>

Petr Baudis <pasky@suse.cz> writes:

>   it turned out that cg-clean depends on the original behaviour...

Supporting both sounds sensible.


* Re: cg-status and empty directories
From: Petr Baudis @ 2006-03-26 21:37 UTC
  To: Jim MacBaine; +Cc: git
In-Reply-To: <3afbacad0602270643k9fdd255w8f3769ad77c54e65@mail.gmail.com>

  Hi,

Dear diary, on Mon, Feb 27, 2006 at 03:43:32PM CET, I got a letter
where Jim MacBaine <jmacbaine@gmail.com> said that...
> Many packages put empty directories under /etc, and although only a
> few of those directories are actually needed, the automatic removal of
> those packages will fail if I remove the empty directories manually.  
> Equally, the removal will fail, if I put a .placeholder file into
> those directories and cg-add it.  Is there a simple way out?

  BTW, with Cogito-0.17.1 the simple way out should be cg-status -S
which restores the original behaviour.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Linus Torvalds @ 2006-03-26 22:22 UTC
  To: Petr Baudis; +Cc: Ryan Anderson, git
In-Reply-To: <20060326191445.GQ18185@pasky.or.cz>

On Sun, 26 Mar 2006, Petr Baudis wrote:
> 
> My current target is to support this tree (letters are filenames,
> numbers are commit ids; I'll translate any git output to those digits):
> 
>     2    4
>     b -- d
> 1 /        \ 6
> a            d
>   \ 3    5 /
>     c -- d

Yeah, the problem with this is that you need to track separate names 
across separate points. However:

> Curiously, git-rev-list does something totally strange when trying to
> list per-file history at this point:
> 
> 	$ git-rev-list HEAD -- d
> 	4
> 
> Huh? (It should list 6, 5, 4 instead.)

What it does is list the points where file "d" _changed_.

"d" did not change in 6 - it had a parent commit (4) where "d" had the 
same contents (in fact, it likely had _two_ parents where it had the same 
contents, but git will pick the first one). So commit "6" is 
uninteresting, and commit "5" will never even be looked at, since we 
decided that the history of "d" comes from the first parent with the same 
contents.

So then it lists "4", because file "d" really did change in that commit 
(it went away).

Now you need to look at "4" and find the rename (which gives you 2) and 
then from there you do rename detection and get (1), and as a result your 
change history should end up being

 (1)a -> (2)b -> (4)d (-> 6(d) which was your start point)

which is correct (now, there are other histories _too_ that get us to the 
same point, but the one you found this way was _a_ history).

> I worked around it by recording a change in d in the merge 6:
> 
> 	http://pasky.or.cz/~xpasky/renametree2.git/
> 
> 	$ git-rev-list --parents --remove-empty HEAD -- d
> 	6 4 5
> 	5
> 	4
> 
> Which is the expected output.

No, it's the expected output just because you expected merges to always 
show up. Merges get ignored if any of the parents have the same content 
already.

		Linus


* Re: Following renames
From: Linus Torvalds @ 2006-03-26 22:23 UTC
  To: Marco Costalba; +Cc: Jakub Narebski, git
In-Reply-To: <e5bfff550603261122m5e680c62ye1290f3e601e947e@mail.gmail.com>

On Sun, 26 Mar 2006, Marco Costalba wrote:
> 
> FIRST WAY
> 
> After annotating a file history (double click on a file name in
> bottom-right window or directly from tree view), you see the whole
> file annotated. If you have the diff window open you see also the
> corresponding patch (scrolled to selected file name).

The problem is that this step is already _way_ too expensive.

I don't want to work with any tool that makes "Step 1" take a minute or 
two for a project that has a few years of history. Try it on the linux 
historic project with some file that gets lots of modifications.

In other words, starting off with "annotate" is MUCH too expensive. You 
should start off basically with "git-whatchanged".

		Linus


* Re: Following renames
From: Petr Baudis @ 2006-03-26 22:31 UTC
  To: Linus Torvalds; +Cc: Ryan Anderson, git
In-Reply-To: <Pine.LNX.4.64.0603261415390.15714@g5.osdl.org>

Dear diary, on Mon, Mar 27, 2006 at 12:22:04AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> So commit "6" is uninteresting, and commit "5" will never even be
> looked at, since we decided that the history of "d" comes from the
> first parent with the same contents.

And this is the thing I have a problem with - this does not make much
sense to me, why can't we just follow all parents instead of arbitrarily
choosing one of them?

> which is correct (now, there are other histories _too_ that get us to the 
> same point, but the one you found this way was _a_ history).

Ok, in that case I want the _full_ history. :-)

> No, it's the expected output just because you expected merges to always 
> show up. Merges get ignored if any of the parents have the same content 
> already.

Eek. Can I avoid that? What was the reason for choosing this behavior?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Following renames
From: Junio C Hamano @ 2006-03-26 22:43 UTC
  To: Petr Baudis; +Cc: git, Linus Torvalds
In-Reply-To: <20060326223154.GU18185@pasky.or.cz>

Petr Baudis <pasky@suse.cz> writes:

>> No, it's the expected output just because you expected merges to always 
>> show up. Merges get ignored if any of the parents have the same content 
>> already.
>
> Eek. Can I avoid that? What was the reason for choosing this behavior?

Perhaps rev-list --sparse?


* Re: Following renames
From: Linus Torvalds @ 2006-03-26 23:09 UTC
  To: Petr Baudis; +Cc: Ryan Anderson, git
In-Reply-To: <20060326223154.GU18185@pasky.or.cz>

On Mon, 27 Mar 2006, Petr Baudis wrote:

> Dear diary, on Mon, Mar 27, 2006 at 12:22:04AM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > So commit "6" is uninteresting, and commit "5" will never even be
> > looked at, since we decided that the history of "d" comes from the
> > first parent with the same contents.
> 
> And this is the thing I have a problem with - this does not make much
> sense to me, why can't we just follow all parents instead of arbitrarily
> choosing one of them?

Sure, you can. It's _usually_ a huge waste of time, though. Why would you 
want to do more work than you need, since clearly the other parent was 
_not_ interesting from the standpoint of the question "where did this 
content come from"?

> > No, it's the expected output just because you expected merges to always 
> > show up. Merges get ignored if any of the parents have the same content 
> > already.
> 
> Eek. Can I avoid that? What was the reason for choosing this behavior?

Huge efficiency gains.

Lookie here. Do

	gitk -- rev-list.c

on the git archive with the current git-rev-list, and with your hacked-up 
version.

And tell me my version isn't a hell of a lot better. Because, I guarantee 
you, it is. We're just not _interested_ in all those merges that didn't 
actually make any difference.
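
A minimal sketch of the rule described above, on a made-up four-commit history (the commit names, file contents, and helper functions are all hypothetical and do not reflect how git-rev-list is implemented). A merge whose watched file is identical to one parent's copy is skipped, and only that parent is followed:

```python
# Toy model of path-limited history simplification at a merge.
# Each commit maps to (parents, content of the watched file).
# All names and contents are invented for illustration.
history = {
    "1": ([], "v1"),
    "2": (["1"], "v2"),        # changes the file
    "3": (["1"], "v1"),        # side branch, file untouched
    "4": (["2", "3"], "v2"),   # merge: same content as parent "2"
}

def interesting_parents(commit):
    """Return the parents worth following for the watched file."""
    parents, content = history[commit]
    for p in parents:
        if history[p][1] == content:
            # The first parent with identical content explains where
            # the file came from, so the other parents are dropped.
            return [p]
    return parents

def simplified_log(start):
    """List commits that changed the file, walking simplified history."""
    out, todo, seen = [], [start], set()
    while todo:
        c = todo.pop()
        if c in seen:
            continue
        seen.add(c)
        parents, content = history[c]
        unchanged = any(history[p][1] == content for p in parents)
        if not unchanged:
            out.append(c)  # differs from every parent, or is a root
        todo.extend(interesting_parents(c))
    return out

print(simplified_log("4"))
```

In this toy history the merge "4" and the side-branch commit "3" never appear in the result, which is the saving being argued for.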

Read up on what modern neuro-science thinks about the human brain, and 
what a lot of it is about. It's about ignoring irrelevant information.

The ability to throw stuff out that isn't interesting is the _real_ basis 
of true intelligence. I'd rather have git do the _intelligent_ history, 
than show history that isn't relevant and working harder doing so.

		Linus

^ permalink raw reply

* Re: Following renames
From: Linus Torvalds @ 2006-03-26 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, git
In-Reply-To: <7vodzsq12g.fsf@assigned-by-dhcp.cox.net>



On Sun, 26 Mar 2006, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> >> No, it's the expected output just because you expected merges to always 
> >> show up. Merges get ignored if any of the parents have the same content 
> >> already.
> >
> > Eek. Can I avoid that? What was the reason for choosing this behavior?
> 
> Perhaps rev-list --sparse?

No. "--sparse" still removes the uninteresting parents of merges. It just 
doesn't then make the linear history any denser.

		Linus

^ permalink raw reply

* [PATCH] Remove dependency on a file named "-lz"
From: Johannes Schindelin @ 2006-03-26 23:14 UTC (permalink / raw)
  To: git, junkio


By changing the dependency "$(LIB_H)" to "$(LIBS)", at least one version
of make thought that a file named "-lz" would be needed.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

 Makefile |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

5a8333baa1845924348b958208bee59831d4e04e
diff --git a/Makefile b/Makefile
index a8cb0af..055d155 100644
--- a/Makefile
+++ b/Makefile
@@ -214,8 +214,8 @@
 	fetch-clone.o revision.o pager.o \
 	$(DIFF_OBJS)
 
-LIBS = $(LIB_FILE) $(XDIFF_LIB)
-LIBS += -lz
+GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
+LIBS = $(GITLIBS) -lz
 
 #
 # Platform specific tweaks
@@ -554,7 +554,7 @@
 		-DDEFAULT_GIT_TEMPLATE_DIR='"$(template_dir_SQ)"' $*.c
 
 $(LIB_OBJS): $(LIB_H)
-$(patsubst git-%$X,%.o,$(PROGRAMS)): $(LIBS)
+$(patsubst git-%$X,%.o,$(PROGRAMS)): $(GITLIBS)
 $(DIFF_OBJS): diffcore.h
 
 $(LIB_FILE): $(LIB_OBJS)
-- 
1.2.0.gd95e-dirty

^ permalink raw reply related

* Re: Following renames
From: Petr Baudis @ 2006-03-26 23:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ryan Anderson, git
In-Reply-To: <20060326191445.GQ18185@pasky.or.cz>

Dear diary, on Sun, Mar 26, 2006 at 09:14:45PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Sun, Mar 26, 2006 at 06:33:13PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > If you do
> > 
> > 	git-rev-list --parents --remove-empty $REV -- $filename
> > 
> > then you'll get the whole history for that filename. When it ends, you 
> > know the file went away, and then you do basically _one_ "where the hell 
> > did it go" thing.
> > 
> > And yes, it's not git-ls-tree (unless you only want to follow pure 
> > renames), it's actually one "git-diff-tree -M $lastrev". Then you just 
> > continue with the new filename (and do another "git-rev-list" until you 
> > hit the next rename).
> 
> I wrote a long rant but then it all suddenly fit together and I have now
> an idea how to implement it reasonably elegantly.

So, this is what I have. Testing (I've given it very little of that) and
thoughts welcome. It is probably pretty efficient, at least in terms of
fork()s it does only 2*N of them where N is the number of commits
containing interesting renames.  Actually, this should be even possible
to reduce to N+1 if you do a single git-diff-tree call and multiplex
different git-rev-lists to it, but I'm too tired to do the trickery now.
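
The quoted approach (list the history of one name until it ends, ask rename detection once where the file came from, then continue under the old name) can be sketched on toy data. This is only an illustration: `HISTORY`, `RENAMES`, `rev_list` and `follow_renames` are hypothetical stand-ins, with `RENAMES` playing the role of a single "git-diff-tree -M" lookup, and no git command is actually run.

```python
# Toy history, newest first: (commit id, {path: status}).
# "A" = added, "M" = modified, "D" = deleted. All ids and paths are invented.
HISTORY = [
    ("c5", {"new.c": "M"}),
    ("c4", {"other.c": "M"}),
    ("c3", {"new.c": "A", "old.c": "D"}),  # really a rename of old.c
    ("c2", {"old.c": "M"}),
    ("c1", {"old.c": "A"}),
]

# Stand-in for rename detection on the commit where a name first appears.
RENAMES = {("c3", "new.c"): "old.c"}

def rev_list(start, path):
    """Commits from index `start` on that touch `path`, up to its creation."""
    hits = []
    for i in range(start, len(HISTORY)):
        commit, changes = HISTORY[i]
        status = changes.get(path)
        if status is None:
            continue
        hits.append((i, commit))
        if status == "A":
            break  # the name did not exist before this commit
    return hits

def follow_renames(path):
    """Walk the history of `path`, switching names at each detected rename."""
    result, start = [], 0
    while True:
        hits = rev_list(start, path)
        if not hits:
            break
        result.extend((commit, path) for _, commit in hits)
        last_index, last_commit = hits[-1]
        old = RENAMES.get((last_commit, path))
        if old is None:
            break  # a genuine creation, nothing more to follow
        path, start = old, last_index + 1
    return result

for commit, name in follow_renames("new.c"):
    print(commit, name)
```

Each rename costs one extra history listing and one rename lookup in this model, which is where the 2*N estimate above comes from.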

It has 'cg' in the name but depends on no Cogito stuff; it should be in
fact possible to trivially put it to git-whatchanged in place of the
final pipeline (not that I'd be suggesting this to be done universally,
but perhaps git-whatchanged -f ...?). There are three downsides in this
regard:

(i) No -c support. I need the separate deltas coming out from
git-diff-tree but I think I can join them together pretty easily on my
own, except that I have problems with -c (see
<20060326102100.GF18185@pasky.or.cz>) so I'm not sure exactly how it is
supposed to behave.

(ii) Only --pretty=raw output. It shouldn't be hard to add the
reformatting code, but I'm personally not going to use it and kind of
lazy, so I'll let someone else do that, I guess. :-)

(iii) Raw deltas required. -p parsing support would be certainly useful
and possible, but see (ii).


To quickly see what it does, you can try it e.g. on the git-log.sh file
in the Git repository.

Thoughts? Opinions? Bugs? Patches?


Signed-off-by: Petr Baudis <pasky@suse.cz>


diff --git a/cg-Xfollowrenames b/cg-Xfollowrenames
new file mode 100755
index 0000000..fa5c552
--- /dev/null
+++ b/cg-Xfollowrenames
@@ -0,0 +1,246 @@
+#!/usr/bin/env perl
+#
+# git-rev-list | git-diff-tree --stdin following renames
+# Copyright (c) Petr Baudis, 2006
+# Uses bits of git-annotate.perl by Ryan Anderson.
+#
+# This script will efficiently show output as of the
+#
+#	git-rev-list --remove-empty ARGS -- FILE... |
+#	git-diff-tree -M -r -m --stdin --pretty=raw ARGS
+#
+# pipeline, except that it follows renames of individual files listed
+# in the FILE... set.
+#
+# Usage:
+#
+#	cg-Xfollowrenames revlistargs -- difftreeargs -- revs -- files
+
+# TODO: Does not work on multiple files properly yet - most probably
+# (I didn't test it!). We want git-rev-list to stop traversing the history
+# when _any_ file disappears while now it probably stops traversing when
+# _all_ files disappear.
+
+use warnings;
+use strict;
+
+$| = 1;
+
+our (@revlist_args, @difftree_args, @revs, @files);
+
+{ # Load arguments
+	my @argp = (\@revlist_args, \@difftree_args, \@revs, \@files);
+	my $argi = 0;
+	for my $arg (@ARGV) {
+		if ($arg eq '--' and $argi < $#argp) {
+			$argi++;
+			next;
+		}
+		push(@{$argp[$argi]}, $arg);
+	}
+}
+
+
+# The heads we watch (sorted by commit time)
+our @heads;
+# Each head is: {
+#	# Persistent for the whole line of development:
+#	pipe => $pipe,
+#	files => \@files, # to watch for
+#
+#	id => $sha1, # useful actually only for debugging
+#	time => $timestamp,
+#	str => $prettyoutput,
+#	parents => \@sha1s,
+#
+#	# When the commit is processed, spawn these extra heads:
+#	recurse => {$sha1id => \@files, ...},
+# }
+
+# To avoid printing duplicate commits
+# FIXME: Currently, we will not handle merge commits properly since
+# we hit them multiple times.
+our %commits;
+
+
+sub open_pipe($@) {
+	my ($stdin, @execlist) = @_;
+
+	my $pid = open my $kid, "-|";
+	defined $pid or die "Cannot fork: $!";
+
+	unless ($pid) {
+		if (defined $stdin) {
+			open STDIN, "<&", $stdin or die "Cannot dup(): $!";
+		}
+		exec @execlist;
+		die "Cannot exec @execlist: $!";
+	}
+
+	return $kid;
+}
+
+sub revlist($@) {
+	my ($rev, @files) = @_;
+	open_pipe(undef, "git-rev-list", "--remove-empty",
+	                 @revlist_args, $rev, "--", @files)
+		or die "Failed to exec git-rev-list: $!";
+}
+
+sub difftree($) {
+	my ($revlist) = @_;
+	open_pipe($revlist, "git-diff-tree", "-r", "-m", "--stdin", "-M",
+	                    "--pretty=raw", @difftree_args)
+		or die "Failed to exec git-diff-tree: $!";
+}
+
+sub revdiffpipe($@) {
+	my ($rev, @files) = @_;
+	my $pipe = difftree(revlist($rev, @files));
+}
+
+
+sub read_commit($$) {
+	my ($head, $tolerant) = @_;
+	my $pipe = $head->{'pipe'};
+	my $against;
+	my @oldset = @{$head->{'files'}};
+	my @newset;
+	my $rename;
+
+	# Load header
+	while (my $line = <$pipe>) {
+		$head->{'str'} .= $line;
+		chomp $line;
+		$line eq '' and goto header_loaded;
+
+		if ($line =~ /^diff-tree (\S+) \(from (root|\S+)\)/) {
+			$head->{'id'} = $1;
+			if (not $tolerant and $commits{$1}++) {
+				close $pipe;
+				return undef;
+			}
+			# The 'root' case is harmless since there'll be no renames.
+			$against = $2;
+		} elsif ($line =~ /^parent (\S+)/) {
+			push (@{$head->{'parents'}}, $1);
+		} elsif ($line =~ /^committer .*?> (\d+)/) {
+			$head->{'time'} = $1;
+		}
+	}
+	return undef;
+header_loaded:
+
+	# Load message
+	while (my $line = <$pipe>) {
+		$head->{'str'} .= $line;
+		chomp $line;
+		$line eq '' and goto message_loaded;
+	}
+	return undef;
+message_loaded:
+
+	# Load delta
+	while (my $line = <$pipe>) {
+		$head->{'str'} .= $line;
+		chomp $line;
+		$line eq '' and goto delta_loaded;
+
+		$line =~ /^:/ or return undef;
+		my ($info, $newfile, $oldfile) = split("\t", $line);
+		if ($info =~ /[RC]\d*$/) {
+			# Behold, a rename!
+			# (Or a copy, it's all the same for us.)
+			my $i;
+			for ($i = 0; $i <= $#oldset; $i++) {
+				$oldfile eq $oldset[$i] or next;
+				$rename = 1;
+				splice(@oldset, $i, 1);
+				push(@newset, $newfile);
+				last;
+			}
+			# In case of multiple candidates, follow
+			# all of them:
+			# (TODO: This might be a policy decision
+			# best left on the user.)
+			if ($i > $#oldset and grep { $oldfile eq $_ } @newset) {
+				$rename = 1;
+				push(@newset, $newfile);
+			}
+		} elsif ($info =~ /D$/) {
+			# Not weeding out deleted files might cause bizarre
+			# results when following multiple files since
+			# git-rev-list weeds them out too (probably?).
+			@oldset = grep { $newfile ne $_ } @oldset;
+			@{$head->{'files'}} = grep { $newfile ne $_ } @{$head->{'files'}};
+		}
+	}
+	$head->{'str'} .= "\n";
+delta_loaded:
+
+	if ($rename) {
+		$head->{'recurse'}->{$against} = [@newset, @oldset];
+	}
+	return 1;
+}
+
+sub load_commit($) {
+	my ($head) = @_;
+	$head->{'time'} = undef;
+	$head->{'str'} = '';
+	$head->{'parents'} = ();
+
+	read_commit($head, 0) or return undef;
+
+	# In case there was a merge, the commit will be multiple times
+	# here, each time with a different delta section. Read them all.
+	for (1 .. $#{$head->{'parents'}}) { # stupid vim syntax highlighting
+		read_commit($head, 1) or return undef;
+	}
+
+	return 1;
+}
+
+
+# Add head at the proper position
+sub add_head($) {
+	my ($head) = @_;
+	my $i;
+	for ($i = 0; $i <= $#heads; $i++) {
+		last if ($head->{'time'} > $heads[$i]->{'time'})
+	}
+	splice(@heads, $i, 0, $head);
+}
+
+# Create new head
+sub init_head($@) {
+	my ($rev, @files) = @_;
+	my $head = { files => \@files, 'pipe' => revdiffpipe($rev, @files) };
+	load_commit($head) or return;
+	add_head($head);
+}
+
+
+
+{ # Seed the heads list
+	for my $rev (@revs) {
+		init_head($rev, @files);
+	}
+}
+
+# Process the heads
+{
+	while (@heads) {
+		my $head = splice(@heads, 0, 1);
+
+		print $head->{'str'};
+
+		foreach my $parent (keys %{$head->{'recurse'}}) {
+			init_head($parent, @{$head->{'recurse'}->{$parent}});
+		}
+		$head->{'recurse'} = undef;
+
+		load_commit($head) or next;
+		add_head($head);
+	}
+}


-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply related

* Fix error handling for nonexistent names
From: Linus Torvalds @ 2006-03-27  0:28 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


[ This is an expanded version of a patch I sent out earlier: the 
  "rev-parse.c" part of it is identical to the earlier version, the 
  revision.c thing is new ]

When passing in a pathname pattern without the "--" separator on the 
command line, we verify that the pathnames in question exist. However, 
there were two bugs in that verification: 

 - git-rev-parse would only check the first pathname, and silently allow 
   any invalid subsequent pathname, whether it existed or not (which 
   defeats the purpose of the check, and is also inconsistent with what 
   git-rev-list actually does)

 - git-rev-list (and "git log" etc) would check each filename, but if the 
   check failed, it would print the error using the first one, ie:

	[torvalds@g5 git]$ git log Makefile bad-file
	fatal: 'Makefile': No such file or directory

   instead of saying that it's 'bad-file' that doesn't exist.

This fixes both bugs.
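
The second bug is easy to reproduce in miniature. The sketch below is not the C code from the patch; it is a small illustration, with a hypothetical helper `first_missing`, of checking every pathname and reporting the one that actually failed rather than the first argument:

```python
import os
import tempfile

def first_missing(paths):
    """Return (path, error text) for the first path lstat() rejects, else None."""
    for path in paths:
        try:
            os.lstat(path)
        except OSError as err:
            # Report the path that failed, not the first one in the list.
            return path, err.strerror
    return None

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "Makefile")
    bad = os.path.join(tmp, "bad-file")
    open(good, "w").close()  # only "Makefile" exists
    missing = first_missing([good, bad])
    print("fatal: '%s': %s" % missing)
```

The message names "bad-file" here because the failing path is captured inside the loop, which mirrors the switch from `arg` to `argv[j]` in the revision.c hunk.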

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
diff --git a/rev-parse.c b/rev-parse.c
index 19a5ef7..8ca1c69 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -174,7 +174,9 @@ int main(int argc, char **argv)
 		char *dotdot;
 	
 		if (as_is) {
-			show_file(arg);
+			if (show_file(arg) && as_is < 2)
+				if (lstat(arg, &st) < 0)
+					die("'%s': %s", arg, strerror(errno));
 			continue;
 		}
 		if (!strcmp(arg,"-n")) {
@@ -194,7 +196,7 @@ int main(int argc, char **argv)
 
 		if (*arg == '-') {
 			if (!strcmp(arg, "--")) {
-				as_is = 1;
+				as_is = 2;
 				/* Pass on the "--" if we show anything but files.. */
 				if (filter & (DO_FLAGS | DO_REVS))
 					show_file(arg);
diff --git a/revision.c b/revision.c
index 12cd052..d67718c 100644
--- a/revision.c
+++ b/revision.c
@@ -649,7 +649,7 @@ int setup_revisions(int argc, const char
 			/* If we didn't have a "--", all filenames must exist */
 			for (j = i; j < argc; j++) {
 				if (lstat(argv[j], &st) < 0)
-					die("'%s': %s", arg, strerror(errno));
+					die("'%s': %s", argv[j], strerror(errno));
 			}
 			revs->prune_data = get_pathspec(revs->prefix, argv + i);
 			break;

^ permalink raw reply related

* Re: [PATCH] Add git-explode-packs
From: Junio C Hamano @ 2006-03-27  3:53 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: git
In-Reply-To: <20060326125450.GT31387@lug-owl.de>

Jan-Benedict Glaw <jbglaw@lug-owl.de> writes:

> On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@cox.net> wrote:
>> The script seems to do what it claims to, but now why would one
>> need to use this?  In other words what's the situation one would
>> find this useful?
>
> It's possibly useful if you often access old objects with
> git-cat-file or git-ls-tree.

Benchmarks?

I created two cloned repositories from git.git.  The victim03
repository is fully packed with the default pack parameters of
depth and window both set to 10.  The victim04 repository has the
same set of objects and refs but the pack is expanded (16232
loose objects).

Now in victim03 repository, 657 blobs have depth 10 (i.e. you
need to inflate and apply delta 10 times to get to the object).
So I made a list of these "expensive to access" objects and
ran this:

	$ cd victim03
	$ /usr/bin/time sh -c '
            while read sha1; do git cat-file blob $sha1;
            done >/dev/null <list
	'

3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364561minor)pagefaults 0swaps
3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364499minor)pagefaults 0swaps
3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+365155minor)pagefaults 0swaps

With the same file list, in victim04 repository that has 16232
loose objects:

	$ cd victim04
	$ /usr/bin/time sh -c '
            while read sha1; do git cat-file blob $sha1;
            done >/dev/null <../victim03/list
	'

3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+348786minor)pagefaults 0swaps
3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347512minor)pagefaults 0swaps
3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347489minor)pagefaults 0swaps

So you are getting a slight performance gain out of this by
exploding the pack, but on the other hand you are taxing the
buffer cache quite heavily by reading the loose objects (in both
of the experiments above, I discarded numbers from the very
first run).  The sizes of the object databases in these cases are:

        $ du -sh victim0[34]/.git/objects
        6.2M    victim03/.git/objects
        84M     victim04/.git/objects
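
To put the two sets of figures side by side, the short calculation below averages the elapsed times reported above and compares them with the disk usage. It uses only the numbers from this message; the variable names are made up for the example.

```python
# Elapsed wall-clock seconds from the three runs in each repository.
packed = [7.17, 7.10, 7.28]   # victim03, fully packed
loose = [6.33, 6.63, 6.20]    # victim04, 16232 loose objects

mean_packed = sum(packed) / len(packed)
mean_loose = sum(loose) / len(loose)

# Relative time saved by reading loose objects instead of the pack.
gain = (mean_packed - mean_loose) / mean_packed

# Disk usage from the du output: 84M loose versus 6.2M packed.
size_ratio = 84 / 6.2

print(f"packed {mean_packed:.2f}s, loose {mean_loose:.2f}s")
print(f"time saved {gain:.1%}, disk usage x{size_ratio:.1f}")
```

That works out to roughly an 11% reduction in elapsed time in exchange for about 13.5 times the disk space.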

So I am still not convinced it would be useful in general.  It
used to be that exploding everything and repacking was the only
way to clean out garbage from packs, but after "repack -a -d"
was invented by Frank Sorenson that became the more convenient way.
Especially with the recent "delta reusing" pack-objects, doing
"repack -a -d" has become quite cheap, so...

^ permalink raw reply

* Re: Following renames
From: Marco Costalba @ 2006-03-27  5:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0603261422280.15714@g5.osdl.org>

On 3/27/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sun, 26 Mar 2006, Marco Costalba wrote:
> >
> > FIRST WAY
> >
> > After annotating a file history (double click on a file name in
> > bottom-right window or directly from tree view), you see the whole
> > file annotated. If you have the diff window open you see also the
> > corresponding patch (scrolled to selected file name).
>
> The problem is that this step is already _way_ too expensive.
>
> I don't want to work with any tool that makes "Step 1" take a minute or
> two for a project that has a few years of history. Try it on the linux
> historic project with some file that gets lots of modifications.
>

Historic Linux test (63428 revisions)

File: drivers/net/tg3.c
Revisions that modify tg3.c : 292

With qgit
15s to retrieve file history (git-rev-list)
19.5s to annotate (git-diff-tree -p, current GNU algorithm, not new faster one)

and...

$ time git-whatchanged HEAD drivers/net/tg3.c > /dev/null
98.01user 2.44system 1:46.19elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (797major+43033minor)pagefaults 0swaps

NOTE: It seems that git-whatchanged needs the file to be checked out in
order to work. It didn't work without a checked-out working tree.


Marco

^ permalink raw reply
