* git-svn name
From: Eric Wong @ 2006-03-26 3:04 UTC (permalink / raw)
To: git; +Cc: Gerrit Pape, Chris Wright
I told somebody about my 'git-svn' program, and of course they asked
where they could read about it. Since I don't have a website for it,
I Googled for 'git-svn' in hopes that it'd lead me to
contrib/git-svn/git-svn.txt on gitweb.
To my surprise, I found that git-svn was already packaged for several
major distributions. Of course, it turns out that those binary packages
are actually of git-svnimport. Oops, maybe I should've checked before
naming my own creation git-svn :x
Of course, I still think git-svn is a good name because it describes
what the program does in as little text as possible. If anybody has any
suggestions that don't require too much typing while keeping the name
meaningful, feel free to suggest them.
Would distro package maintainers also be willing to add my git-svn
script to their git-svn binary packages when a new release of git is
made, too? It's quite different from git-svnimport (see
contrib/git-svn/git-svn.txt for details).
--
Eric Wong
^ permalink raw reply
* Re: Following renames
From: Linus Torvalds @ 2006-03-26 3:19 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
In-Reply-To: <20060326014946.GB18185@pasky.or.cz>
On Sun, 26 Mar 2006, Petr Baudis wrote:
>
> In [1], Linus suggests a non-core solution. Unfortunately, it doesn't
> fly - it requires at least two git-ls-tree calls per revision which
> would bog things down awfully (to roughly half of the original speed).
No it doesn't. It requires one git-ls-tree WHEN SOMETHING IS RENAMED.
In other words, basically never.
Linus
* Re: Following renames
From: Jakub Narebski @ 2006-03-26 3:52 UTC (permalink / raw)
To: git
In-Reply-To: <7virq1sywj.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
> Petr Baudis <pasky@ucw.cz> writes:
>
>> An obvious solution would be to have git-diff-tree --follow which
>> updates its interesting path set based on seen renames, and now that
>> I've written about non-linear history, it's obvious that it's incorrect.
>> The other obvious way to go is then to add rename detection support to
>> git-rev-list, and it's less obvious that this is a dead end too - I
>> didn't inspect the code myself yet, but for now I trust Linus in [2]
>> (I didn't quite understand the argument, I guess I need to sleep on it).
>
> I'd have to sleep on how the core side can help Porcelains, but
> I think it is a good thing that you, one of the most vocal
> advocate on the list for doing rename recording, are thinking
> about this issue and probably would look into rev-list.c soon.
>
> Looking at the evolution of rev-list.c file itself was a good
> exercise to realize that rename tracking (more specifically,
> having whatchanged to follow renames) is not such a useful
> thing (at least for me).
[...]
> What this suggests is that switching the set of paths to follow
> while traversing ancestry chain needs to depend on which part of
> the original file you are interested in. Marking "this commit
> renames (or copies) file A to file B" is not that useful -- for
> that matter, detecting at runtime like we currently do is not
> better either. If a file A and file B were cleaned up and
> merged into a single file C, which is in the tip of the tree,
> which one you would want whatchanged to switch following depends
> on which part of the C you were interested in.
>
> Unless you are interested in the _entire_ contents of the file,
> that is. Then tracking or even recording renames becomes
> useful, but that is a special case.
>
> That is the reason I am not so enthused about recording renames.
> I think the time is better spent on enhancing what pickaxe tries
> to do (currently it does very little), which I hinted in a
> separate message late last night.
I think one of the better ideas/suggestions about *recording* filenames was
in the "impure renames / history tracking" thread
http://marc.theaimsgroup.com/?l=git&m=114122175216489&w=2
<Pine.LNX.4.64.0603011343170.13612@sheen.jakma.org>
about adding *auxiliary* (helper) information about renames in commits. I'm
not sure about recording parts of the file that were moved or copied. That
might have been left for runtime detection in the likes of pickaxe.
As it would be helper-only information, it would ensure backwards
compatibility (older versions would ignore the additional information) and
forward compatibility (newer versions would fall back to the current runtime
rename tracking/detection).
To be generic, I think that the command to record rename/copy or
copy'n'paste/cut'n'paste would take a set of source files (one or more,
unless we want to have an option to mark the file as new, suppressing any
superficial similarities, in which case it would be zero or more), and a set
of destination files (one or more, with files which were in the source set
repeated if it was a copy, not repeated if it was a rename or cut'n'paste;
unless we want to record deletions also, in which case it would be zero or
more files).
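The source/destination convention described above can be sketched in C.
This is only an illustration under stated assumptions: source_fate() and
enum src_fate are hypothetical names, not part of any git command, and the
sketch treats paths as plain strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one source file in a recorded operation. */
enum src_fate {
	SRC_KEPT,	/* source repeated among the destinations: a copy */
	SRC_GONE	/* source not repeated: rename, cut'n'paste or deletion */
};

/*
 * Decide what happened to "src" by checking whether it is repeated in
 * the destination set "dst" (which may be empty if deletions are
 * recorded too).
 */
static enum src_fate source_fate(const char *src,
				 const char **dst, size_t ndst)
{
	size_t i;

	for (i = 0; i < ndst; i++)
		if (!strcmp(src, dst[i]))
			return SRC_KEPT;
	return SRC_GONE;
}
```

With sources {a.c} and destinations {a.c, b.c} the source is kept (a copy);
with destinations {b.c} it is gone (a rename).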
Such information can, I guess, be easily entered by the user... if one
remembers to record the rename/cut'n'paste/etc., that is. Perhaps if there
were a way to easily add such information later, for example by confirming
detected renames/relationships during a merge... It would be much more
difficult for the user to enter the ranges unassisted.
What worries me is that such information, recorded in "own fields to the GIT
revision messages" (in commits), can be used only if you track the ancestry;
it doesn't help if you only have two or more revisions and do not build the
relationship graph between them. But maybe I worry unnecessarily...
BTW. following renames is important not only in examining file [contents]
history, in the likes of diff, whatchanged, annotate/blame, pickaxe but
also for merges.
References:
===========
* http://marc.theaimsgroup.com/?l=linux-kernel&m=111314792424707
* http://article.gmane.org/gmane.comp.version-control.git/217
* http://marc.theaimsgroup.com/?l=git&m=114123702826251
* http://marc.theaimsgroup.com/?l=git&m=114315795227271
--
Jakub Narebski
Warsaw, Poland
* Re: Use a *real* built-in diff generator
From: Davide Libenzi @ 2006-03-26 4:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Morten Welinder, Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603251030340.15714@g5.osdl.org>
On Sat, 25 Mar 2006, Linus Torvalds wrote:
> I don't need "patch", since I wrote my own anyway. It's just called
> "apply" instead of "patch".
Oh, ok. I thought you were calling out GNU patch for the task.
> Doing "apply" is not only much simpler than doing "diff", but I needed my
> own much earlier: it's much more timing-critical for me (applying 200
> patches in one go), and git needed something that could honor renames and
> copies, and the mode bits too.
>
> Besides, I hate how GNU patch bends over backwards in applying crap that
> isn't a proper patch at all (whitespace-corruption, you name it: GNU patch
> will accept it). Also, I made "git-apply" be all-or-nothing: either it
> applies the _whole_ patch (across many different files) or it applies none
> of it. With GNU patch, if you get an error on the fifth file, the four
> first files have been modified already - aarrgghhh..
>
> See "apply.c" for details if you care. It's stupid, but it works (and it
> _only_ handles unified diffs - with the git extensions, of course).
So is xdl_patch(). It handles unified diffs, a simple form of ignoring
whitespace changes, and all (methinks) the fuzzy merge features of GNU patch.
Okie then, drop me an email if you find bugs in the libxdiff code, so I
can fix the main library.
- Davide
* Re: Use a *real* built-in diff generator
From: Davide Libenzi @ 2006-03-26 5:33 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Morten Welinder, Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603251040190.15714@g5.osdl.org>
On Sat, 25 Mar 2006, Linus Torvalds wrote:
> Btw, git-apply does it, and it's actually quite simple: the code to handle
> the "\ No newline" case is literally just this:
>
> 	/*
> 	 * "plen" is how much of the line we should use for
> 	 * the actual patch data. Normally we just remove the
> 	 * first character on the line, but if the line is
> 	 * followed by "\ No newline", then we also remove the
> 	 * last one (which is the newline, of course).
> 	 */
> 	plen = len-1;
> 	if (len < size && patch[len] == '\\')
> 		plen--;
>
> if we just remove the last '\n' on a line, if the _next_ line starts with
> a '\\' (so the git-apply code actually depends on knowing that the patch
> text is dense, and that it's also padded out so that you can look one byte
> past the end of the diff and it won't be a '\\').
>
> I don't know how well that fits into xpatch (I never looked at the patch
> side, since I already had my own ;), but my point being that handling this
> special case _can_ be very simple if the data structures are just set up
> for it.
Yeah, should be a pretty trivial fix in the xpatch parsing code. Thanks
for reminding me of the missing-eol issue, which had been forgotten
somewhere in my todo list :D
- Davide
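
A minimal sketch of the length computation quoted above, written as a
standalone function so the rule can be tried in isolation. The name
payload_len() and its parameters are hypothetical and are not taken from
apply.c; the sketch assumes "len" is at least 1 and includes the trailing
newline, and that "remaining" counts the bytes from "line" to the end of
the patch buffer.

```c
#include <stddef.h>

/*
 * Return how many bytes of one patch line are real file content.
 * Hypothetical helper for illustration only.
 */
static size_t payload_len(const char *line, size_t len, size_t remaining)
{
	/* Drop the leading ' ', '+' or '-' marker character. */
	size_t plen = len - 1;

	/*
	 * If another line follows and it starts with a backslash
	 * ("\ No newline at end of file"), the newline that ends this
	 * line is not part of the content either.
	 */
	if (len < remaining && line[len] == '\\')
		plen--;
	return plen;
}
```

For the line "+foo" followed by the "\ No newline" marker this yields 3
(just "foo"); followed by an ordinary line it yields 4 ("foo" plus its
newline).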
* What's in git.git
From: Junio C Hamano @ 2006-03-26 6:00 UTC (permalink / raw)
To: git
* The 'master' branch has these since the last announcement.
- git-svn memory usage reduction (Eric Wong)
- documentation updates (Francis Daly, Jon Loeliger)
- fix updating working tree after cvsimport reads from CVS
- fetch exits non-zero when fast-forward check fails.
- improve git-pull's failure case when pulling into the tracking branch.
- commit-tree checks return value from write_sha1_file().
- git-apply understands "@@ -l, +m @@" correctly.
----------------------------------------------------------------
* The 'next' branch, in addition, has these.
These are harmless and useful to be pushed into "master"; I just
have not gotten around to it.
- updates around git-clone:
. --use-separate-remote
. --reference <repo>
. fetch,parse-remote,fmt-merge-msg: refs/remotes/* support (Eric Wong)
. sha1_name() understands refs/remotes/$foo/HEAD
- sha1_name safety and core.warnambiguousrefs
- git-merge knows some strategies want to skip trivial merges
------------
I really should do some more stats on this and push it out.
Just haven't got around to doing so.
- insanely fast rename detection (Linus and me)
------------
These look very good, but people depend on them, so I'd like to
simmer them in "next" for a couple of days to hear success
stories, or "ah crap I got burned" story ;-).
- tar-tree updates (Rene Scharfe)
- send-email updates (Eric Wong)
------------
Hot off the press. I smell the beginning of good stuff here.
- truly built-in diff (Linus with Davide)
------------
This is harmless to be pushed into "master" but is staying here
only because nobody expressed urgency.
- ls-{files,tree} --abbrev (Eric Wong)
----------------------------------------------------------------
* The 'pu' branch, in addition, has these.
Since I do not have a good guinea pig case to use this, I
haven't read and understood the code being patched enough to
comment on the change this one introduces; it looks obviously
correct, though.
I'd like an ACK or two from people who work with the SVN gateway
before I apply this to "master".
- git-svnimport: if a limit is specified, respect it (Anand Kumria)
------------
This script does what it claims to do, but I cannot think of a
useful use case for this. When I have packs with garbage
objects in them (because I rewind my "pu" branch I usually end
up having a handful in my packs), I just run "repack -a -d" and
that is good enough. So I need a bit of convincing to keep
this.
- Add git-explode-packs (Martin Atukunda)
* Re: [PATCH] Add git-explode-packs
From: Junio C Hamano @ 2006-03-26 6:12 UTC (permalink / raw)
To: Martin Atukunda; +Cc: git
In-Reply-To: <11432881443149-git-send-email-matlads@dsmagic.com>
Martin Atukunda <matlads@dsmagic.com> writes:
> This script does the opposite of git repack -a -d.
The script seems to do what it claims to, but now why would one
need to use this? In other words what's the situation one would
find this useful?
* Re: Following renames
From: Ryan Anderson @ 2006-03-26 7:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Petr Baudis, git
In-Reply-To: <Pine.LNX.4.64.0603251919170.15714@g5.osdl.org>
Linus Torvalds wrote:
> On Sun, 26 Mar 2006, Petr Baudis wrote:
>
>> In [1], Linus suggests a non-core solution. Unfortunately, it doesn't
>> fly - it requires at least two git-ls-tree calls per revision which
>> would bog things down awfully (to roughly half of the original speed).
>>
>
> No it doesn't. It requires one git-ls-tree WHEN SOMETHING IS RENAMED.
>
> In other words, basically never.
>
A simple example is the first loop in git-annotate.perl. (Which was
basically written by Linus, I just translated it from a
shell/pseudo-code example into Perl)
--
Ryan Anderson
sometimes Pug Majere
* Re: Following renames
From: Petr Baudis @ 2006-03-26 10:07 UTC (permalink / raw)
To: Linus Torvalds, Ryan Anderson; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603251919170.15714@g5.osdl.org>
Dear diary, on Sun, Mar 26, 2006 at 05:19:50AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> On Sun, 26 Mar 2006, Petr Baudis wrote:
> >
> > In [1], Linus suggests a non-core solution. Unfortunately, it doesn't
> > fly - it requires at least two git-ls-tree calls per revision which
> > would bog things down awfully (to roughly half of the original speed).
>
> No it doesn't. It requires one git-ls-tree WHEN SOMETHING IS RENAMED.
>
> In other words, basically never.
Huh? I don't see that now (and caps don't help me see it better). That's
certainly not what is in [1], and I don't see _how_ to detect the
renames in this case, and what would I be actually doing git-ls-tree for
when I've already detected the rename. Based on [1], I'd be doing
git-ls-tree merely to detect that a file _disappeared_ in the first
place, I have to do other stuff to detect the renames themselves.
Dear diary, on Sun, Mar 26, 2006 at 09:35:02AM CEST, I got a letter
where Ryan Anderson <ryan@michonline.com> said that...
> A simple example is the first loop in git-annotate.perl. (Which was
> basically written by Linus, I just translated it from a
> shell/pseudo-code example into Perl)
Thanks for the hint. Unfortunately, this is precisely the thing I want
to avoid, that is essentially reimplementing part of git-rev-list - to
do something good, I would have to do my own toposort and merge by date
between parallel lines. OTOH, I might just construct a large revlist
commandline specifying all the segments I'm interested in and see what
happens when I run that.
Besides, doing it in shell would be pretty ugly job (forcing me to
finally rewrite it in perl is not a bad thing but that'd be a somewhat
larger project since I share various common routines with other cg
tools, etc).
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* Re: git-svn name
From: Petr Baudis @ 2006-03-26 10:10 UTC (permalink / raw)
To: Eric Wong; +Cc: git, Gerrit Pape, Chris Wright
In-Reply-To: <20060326030425.GA6306@hand.yhbt.net>
Dear diary, on Sun, Mar 26, 2006 at 05:04:25AM CEST, I got a letter
where Eric Wong <normalperson@yhbt.net> said that...
> Would distro package maintainers also be willing to add my git-svn
> script to their git-svn binary packages when a new release of git is
> made, too? It's quite different from git-svnimport (see
> contrib/git-svn/git-svn.txt for details).
I think the primary purpose of the package separation is dependencies
- not to make the git package depend on svn and such. So I guess the
packagers won't have a problem adding your script to the git-svn package
(concerning SuSE, at least I won't).
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* Union diff
From: Petr Baudis @ 2006-03-26 10:21 UTC (permalink / raw)
To: git
Hello,
sorry for possibly a silly question, but can I get a diff of a merge
commit with _union_ of changes against all the parents?
$ git-diff-tree --abbrev -r -m --pretty=raw badfc383b
diff-tree badfc38... (from 1ee6c84...)
tree 29d81f18912328df4f4104e9b9cc355424ced04d
parent 1ee6c84efda742eda8b4b200491341125d8d9639
parent 453b160f03c8c6d450879482f617412c257e5889
author Petr Baudis <pasky@suse.cz> 1143328578 +0100
committer Petr Baudis <xpasky@machine.or.cz> 1143328578 +0100
Merge with v0.17
:100755 100755 743c19f... b05900d... M Documentation/make-cogito-asciidoc
:100644 100644 5896df7... 6f06c35... M cg-Xlib
diff-tree badfc38... (from 453b160...)
tree 29d81f18912328df4f4104e9b9cc355424ced04d
parent 1ee6c84efda742eda8b4b200491341125d8d9639
parent 453b160f03c8c6d450879482f617412c257e5889
author Petr Baudis <pasky@suse.cz> 1143328578 +0100
committer Petr Baudis <xpasky@machine.or.cz> 1143328578 +0100
Merge with v0.17
:100644 100644 24ce0a4... d540853... M TODO
:100755 100755 6005083... f7efa9d... M cg-log
I would like something like:
diff-tree badfc38... (from parents)
tree 29d81f18912328df4f4104e9b9cc355424ced04d
parent 1ee6c84efda742eda8b4b200491341125d8d9639
parent 453b160f03c8c6d450879482f617412c257e5889
author Petr Baudis <pasky@suse.cz> 1143328578 +0100
committer Petr Baudis <xpasky@machine.or.cz> 1143328578 +0100
Merge with v0.17
:100755 100755 743c19f... b05900d... M Documentation/make-cogito-asciidoc
:100644 100644 24ce0a4... d540853... M TODO
:100644 100644 5896df7... 6f06c35... M cg-Xlib
:100755 100755 6005083... f7efa9d... M cg-log
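
The "union" asked for here can be sketched as a merge of the two
per-parent path lists. This is an illustration only, not how diff-tree
works internally: union_paths() is a hypothetical name, the sketch looks
at path names only (not modes or blob ids), and it assumes each input
list is already sorted in strcmp() order, as the lists above happen to be.

```c
#include <stddef.h>
#include <string.h>

/*
 * Merge two sorted path lists into "out", listing each path once.
 * "out" must have room for na + nb entries. Returns the number of
 * entries written.
 */
static size_t union_paths(const char **a, size_t na,
			  const char **b, size_t nb,
			  const char **out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < na && j < nb) {
		int cmp = strcmp(a[i], b[j]);

		if (cmp < 0) {
			out[n++] = a[i++];
		} else if (cmp > 0) {
			out[n++] = b[j++];
		} else {
			/* Changed against both parents: list it once. */
			out[n++] = a[i++];
			j++;
		}
	}
	/* One list is exhausted; copy whatever is left of the other. */
	while (i < na)
		out[n++] = a[i++];
	while (j < nb)
		out[n++] = b[j++];
	return n;
}
```

Fed the two lists from the -m output above, it produces the four paths in
the order shown in the wished-for output.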
Now, the -c option documentation says:
It shows the differences from each of the parents to the merge
result simultaneously, instead of showing pairwise diff between
a parent and the result one at a time, which '-m' option output
does.
This sounds as exactly what I want. Well, the only problem is that the
same diff command as above with -c option added produces no diff at all,
just the header and commit messages. Did I misunderstand the -c
description and does it do something different?
Thanks,
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* Re: Following renames
From: Fredrik Kuivinen @ 2006-03-26 10:34 UTC (permalink / raw)
To: Petr Baudis; +Cc: Linus Torvalds, Ryan Anderson, git
In-Reply-To: <20060326100717.GD18185@pasky.or.cz>
On Sun, Mar 26, 2006 at 12:07:17PM +0200, Petr Baudis wrote:
> Dear diary, on Sun, Mar 26, 2006 at 05:19:50AM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > On Sun, 26 Mar 2006, Petr Baudis wrote:
> > >
> > > In [1], Linus suggests a non-core solution. Unfortunately, it doesn't
> > > fly - it requires at least two git-ls-tree calls per revision which
> > > would bog things down awfully (to roughly half of the original speed).
> >
> > No it doesn't. It requires one git-ls-tree WHEN SOMETHING IS RENAMED.
> >
> > In other words, basically never.
>
> Huh? I don't see that now (and caps don't help me see it better). That's
> certainly not what is in [1], and I don't see _how_ to detect the
> renames in this case, and what would I be actually doing git-ls-tree for
> when I've already detected the rename. Based on [1], I'd be doing
> git-ls-tree merely to detect that a file _disappeared_ in the first
> place, I have to do other stuff to detect the renames themselves.
>
> Dear diary, on Sun, Mar 26, 2006 at 09:35:02AM CEST, I got a letter
> where Ryan Anderson <ryan@michonline.com> said that...
> > A simple example is the first loop in git-annotate.perl. (Which was
> > basically written by Linus, I just translated it from a
> > shell/pseudo-code example into Perl)
>
> Thanks for the hint. Unfortunately, this is precisely the thing I want
> to avoid, that is essentially reimplementing part of git-rev-list - to
> do something good, I would have to do my own toposort and merge by date
> between parallel lines. OTOH, I might just construct a large revlist
> commandline specifying all the segments I'm interested in and see what
> happens when I run that.
>
> Besides, doing it in shell would be pretty ugly job (forcing me to
> finally rewrite it in perl is not a bad thing but that'd be a somewhat
> larger project since I share various common routines with other cg
> tools, etc).
>
If you decide to modify rev-list to do rename tracking you might want
to have a look at how this is done in blame.c. git-blame only tracks
one file (since that is what it needs) but I think it should be
possible to track multiple files with a similar approach.
- Fredrik
* Re: Following renames
From: Petr Baudis @ 2006-03-26 10:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7virq1sywj.fsf@assigned-by-dhcp.cox.net>
(Note that I do *not* want to raise the explicit vs. implicit rename
tracking argument, in case anyone misunderstands. I've accepted
implicit rename tracking as a fact of Git life for now. I just want to
make use of it now. ;-)
Dear diary, on Sun, Mar 26, 2006 at 04:49:48AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Looking at the evolution of rev-list.c file itself was a good
> exercise to realize that rename tracking (more specifically,
> having whatchanged to follow renames) is not such a useful
> thing (at least for me).
Well, no one argues that rename tracking cures all the woes of hackerkind
and anything more precise than that is useless. I'm rather saying that
rename tracking indeed _is_ a special case of something more general and
truly very interesting, but a special case so frequent that it's worth
doing even if we can't do the general case yet. Or at least people
*think* it's very frequent and it gives them the warm fuzzy feeling
knowing that the tool can handle it (at least somehow) - and the warm
fuzzy feeling is important, especially if you're trusting your sources
to the tool.
So, obviously, you'll find plenty of counter-examples where rename
detection won't help. I don't argue that. I merely say that there will
still be enough cases where following renames will help to warrant
doing it.
Now, Git history has enough examples of where rename following would be
useful. When I'm digging into the history, I'm hitting the big tools
rename barrier all the time, and just yesterday when wondering about
jdl's <snap> removal from git.txt I've hit 2cf565c53 - coming along any
file to that commit should make me follow Documentation/core-git.txt out
of the commit (well, that's rather copy than rename detection).
> Another example. Today's tar-tree updates have one interesting
> function I think should belong to strbuf.c, and before merging
> it to the mainline, I may move that function from tar-tree.c to
> strbuf.c. After that happens, if I run "whatchanged strbuf.c"
> to see where that function came from, I would want it to notice
> it came from tar-tree.c, although it is not a rename at all.
> Just one function moved from a file to another.
A wild pickaxe - when the string disappears from file X, scan all the
changes in the commit and start following files where it reappears. This
should help, right?
But when you want to implement this, you hit the exact same problems as
when you try to follow renames, only a different part of diffcore
detects it. So, what I'm trying to solve is actually not just following
renames but a more general problem.
> If a file A and file B were cleaned up and merged into a single file
> C, which is in the tip of the tree, which one you would want
> whatchanged to switch following depends on which part of the C you
> were interested in.
If in doubt (and the user does not use pickaxe to clarify it), you can
just follow both. The user will get some extra stuff (or maybe even not
if he wants to know about pieces from both), but we are at least trying
to be useful and DTRT instead of doing nothing in case we would by any
chance not do the very best.
> Unless you are interested in the _entire_ contents of the file,
> that is. Then tracking or even recording renames becomes
> useful, but that is a special case.
A frequent (and wanted) special case.
> That is the reason I am not so enthused about recording renames.
> I think the time is better spent on enhancing what pickaxe tries
> to do (currently it does very little), which I hinted in a
> separate message late last night.
Sure, pickaxe is cool, but as I said above, if you try to teach _it_
following around files, you'll hit the exact same problems as me. We're
just trying to build something using lego blocks with different stuff
inside but otherwise actually looking pretty much the same.
The thing with pickaxe is that frequently it would be simply more
laborious to dig for and construct the proper pickaxe string than to just
fire up cg-log -s filename with greedy rename following and quickly
scan through the results.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* Re: Following renames
From: Petr Baudis @ 2006-03-26 10:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <20060326105248.GG18185@pasky.or.cz>
Dear diary, on Sun, Mar 26, 2006 at 12:52:48PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Well, no one argues that rename tracking cures all the woes of hackerkind
^^^^^^^^^^
Or is it hackerdom?
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* Re: Use a *real* built-in diff generator
From: Ralf Baechle @ 2006-03-26 11:09 UTC (permalink / raw)
To: Linus Torvalds
Cc: Davide Libenzi, Morten Welinder, Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603251030340.15714@g5.osdl.org>
On Sat, Mar 25, 2006 at 10:39:03AM -0800, Linus Torvalds wrote:
> Besides, I hate how GNU patch bends over backwards in applying crap that
> isn't a proper patch at all (whitespace-corruption, you name it: GNU patch
> will accept it). Also, I made "git-apply" be all-or-nothing: either it
> applies the _whole_ patch (across many different files) or it applies none
> of it. With GNU patch, if you get an error on the fifth file, the four
> first files have been modified already - aarrgghhh..
Which is apply's greatest strength - and weakness. GNU patch doesn't
understand the file renaming bits of git diffs, so they need to be
used with apply. So what if a patch doesn't apply? Apply doesn't even have
an option to apply things as well as it can and leave the rest in
reject files. Yuck.
Ralf
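
The all-or-nothing behaviour under discussion amounts to a two-phase
approach: check every file first, and write only if every check passed.
The following is a sketch of that idea only; struct file_patch and
apply_all_or_nothing() are hypothetical and do not come from apply.c, and
the "applies" flag stands in for a real dry-run check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patch to one file. */
struct file_patch {
	const char *path;
	int applies;	/* dry-run result: 1 = applies cleanly, 0 = fails */
	int written;	/* set to 1 once the result has been written out */
};

/*
 * Phase 1 verifies every file; phase 2 writes only if phase 1 found
 * no failure. Returns 0 on success, -1 if nothing was touched.
 */
static int apply_all_or_nothing(struct file_patch *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!p[i].applies)
			return -1;	/* bail out before modifying anything */
	for (i = 0; i < n; i++)
		p[i].written = 1;
	return 0;
}
```

A "best effort plus reject files" mode, as wished for above, would instead
write the files that pass and record the failures, which is exactly the
partial state this scheme is designed to avoid.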
* Re: [PATCH] Add git-explode-packs
From: Jan-Benedict Glaw @ 2006-03-26 12:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Martin Atukunda, git
In-Reply-To: <7vwtehpwdd.fsf@assigned-by-dhcp.cox.net>
On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@cox.net> wrote:
> Martin Atukunda <matlads@dsmagic.com> writes:
> > This script does the opposite of git repack -a -d.
>
> The script seems to do what it claims to, but now why would one
> need to use this? In other words what's the situation one would
> find this useful?
It's possibly useful if you often access old objects with
git-cat-file or git-ls-tree.
Not being a Perl hacker, a friend and I eg. started to hack GIT
support into LXR. I've just posted some very early patches on the LXR
mailing list
(http://sourceforge.net/mailarchive/forum.php?forum_id=1734). What
would be even more interesting is to not unpack _all_ objects, but
only those belonging to specifically mentioned commits or tags. I
think LXR could make _good_ use of that.
MfG, JBG
--
Jan-Benedict Glaw jbglaw@lug-owl.de . +49-172-7608481 _ O _
"Eine Freie Meinung in einem Freien Kopf | Gegen Zensur | Gegen Krieg _ _ O
für einen Freien Staat voll Freier Bürger" | im Internet! | im Irak! O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));
* Re: History rewriting swiss army knife
From: Petr Baudis @ 2006-03-26 13:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vek0rzchc.fsf@assigned-by-dhcp.cox.net>
Dear diary, on Fri, Mar 24, 2006 at 11:47:43PM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
>
> > It's never been so easy before - I've written cg-admin-rewritehist,
> > which will execute your filters for each commit (which can rewrite the
> > tree contents, just the tree itself through the index, committer/author
> > information and commit message) while the script will obviously preserve
> > all the other information like merges, author/committer information etc.
>
> Hmph. The above description sounds like you are not allowing
> the user's custom script to drop existing parent (or graft a new
> one) while rewriting. I have not looked at how you are
> interfacing with user's custom script, but I sort-of expected
> you to throw a commit at it from older to newer (i.e. topo-order
> in reverse), along with the names of already re-written commit
> objects that are parents of that commit, and have it build a
> rewritten commit and report its object name back to you.
There are instead several "filters" (user scripts) which are called at
various stages of the commit rewrite. In sum they are doing the same
thing as the single user script would, but cg-admin-rewritehist will
prepare some things for you so that not everyone has to write the
common stuff again and again.
The net flexibility loss was zero, except for two things:
* The parents list construction was hardcoded. Now I added a parent
filter which gets the parent string on stdin (including the -p bits,
but life's tough) and let it rewrite it (e.g. add stuff at the end).
So to "etch a graft":
cg-admin-rewritehist --parent-filter 'sed "s/^\$/-p graftcommitid/"' newbranch
assuming single-root history; but you have current commit id in
$GIT_COMMIT so you can go wild:
cg-admin-rewritehist --parent-filter 'cat; [ "$GIT_COMMIT" = "COMMIT" ] && echo "-p GRAFTCOMMIT"' newbranch
* A new commit would be always created. Sometimes you might want to
omit some commits. Now I added a commit filter which would be
called instead of the git-commit-tree command.
To remove commits authored by "Darl McBribe" from the history:
cg-admin-rewritehist --commit-filter '
	if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ]; then
		shift
		while [ -n "$1" ]; do
			shift; echo "$1"; shift
		done
	else
		git-commit-tree "$@"
	fi' newbranch
(note that this will handle even Darl's merges).
Thanks for the inspiration,
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
* [PATCH] Do not ever list empty directories in git-ls-files --others
From: Petr Baudis @ 2006-03-26 14:25 UTC (permalink / raw)
To: junkio, Jim MacBaine; +Cc: git
In-Reply-To: <3afbacad0602270643k9fdd255w8f3769ad77c54e65@mail.gmail.com>
Hi,
Dear diary, on Mon, Feb 27, 2006 at 03:43:32PM CET, I got a letter
where Jim MacBaine <jmacbaine@gmail.com> said that...
> Many packages put empty directories under /etc, and although only a
> few of those directories are actually needed, the automatic removal of
> those packages will fail if I remove the empty directories manually.
> Equally, the removal will fail, if I put a .placeholder file into
> those directories and cg-add it. Is there a simple way out?
this is caused by git-ls-files behaviour - we now call it with
the --directory argument which is nice since it will show a non-empty
unknown directory as a single entry and won't list all its contents.
What is not so nice is the side-effect you are describing, and I tend
to agree that if the directory is empty, it should not be listed.
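The notion of "empty" meant here can be approximated from the shell. This is only a rough sketch under stated assumptions: it looks at the filesystem alone (no index, no exclude patterns), treats a directory that holds nothing but empty directories as empty, and does not cope with newlines in names; list_empty_dirs is a hypothetical helper, not something git or Cogito provides:

```sh
#!/bin/sh
# Hypothetical helper: list directories below $1 that contain no regular
# files or symlinks at any depth.
list_empty_dirs() {
    find "$1" -type d | while IFS= read -r d; do
        # Skip the starting directory itself.
        [ "$d" = "$1" ] && continue
        # Empty means: no file or symlink anywhere underneath.
        if [ -z "$(find "$d" \( -type f -o -type l \) | head -n 1)" ]; then
            printf '%s\n' "$d"
        fi
    done
}
```

Running something like "list_empty_dirs /etc" would then show the directories that --directory would otherwise report even though there is nothing in them to add.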
---
Without the --directory flag, git-ls-files wouldn't ever list directories,
producing no output for empty directories, which is good since they cannot
be added and they bear no content, not even untracked content (if Git ever
starts tracking directories on their own, this should obviously change since
the content notion will change).
With the --directory flag however, git-ls-files would list even empty
directories. This patch fixes this.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
ls-files.c | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/ls-files.c b/ls-files.c
index e42119c..4502b51 100644
--- a/ls-files.c
+++ b/ls-files.c
@@ -258,11 +258,12 @@ static int dir_exists(const char *dirnam
* Also, we ignore the name ".git" (even if it is not a directory).
* That likely will not change.
*/
-static void read_directory(const char *path, const char *base, int baselen)
+static int read_directory(const char *path, const char *base, int baselen)
{
- DIR *dir = opendir(path);
+ DIR *fdir = opendir(path);
+ int contents = 0;
- if (dir) {
+ if (fdir) {
int exclude_stk;
struct dirent *de;
char fullname[MAXPATHLEN + 1];
@@ -270,7 +271,7 @@ static void read_directory(const char *p
exclude_stk = push_exclude_per_directory(base, baselen);
- while ((de = readdir(dir)) != NULL) {
+ while ((de = readdir(fdir)) != NULL) {
int len;
if ((de->d_name[0] == '.') &&
@@ -288,6 +289,7 @@ static void read_directory(const char *p
switch (DTYPE(de)) {
struct stat st;
+ int subdir, rewind_base;
default:
continue;
case DT_UNKNOWN:
@@ -301,22 +303,31 @@ static void read_directory(const char *p
case DT_DIR:
memcpy(fullname + baselen + len, "/", 2);
len++;
- if (show_other_directories &&
- !dir_exists(fullname, baselen + len))
+ rewind_base = nr_dir;
+ subdir = read_directory(fullname, fullname,
+ baselen + len);
+ if (show_other_directories && subdir &&
+ !dir_exists(fullname, baselen + len)) {
+ // Rewind the read subdirectory
+ while (nr_dir > rewind_base)
+ free(dir[--nr_dir]);
break;
- read_directory(fullname, fullname,
- baselen + len);
+ }
+ contents += subdir;
continue;
case DT_REG:
case DT_LNK:
break;
}
add_name(fullname, baselen + len);
+ contents++;
}
- closedir(dir);
+ closedir(fdir);
pop_exclude_per_directory(exclude_stk);
}
+
+ return contents;
}
static int cmp_name(const void *p1, const void *p2)
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
^ permalink raw reply related
* [PATCH] Optionally do not list empty directories in git-ls-files --others
From: Petr Baudis @ 2006-03-26 14:59 UTC (permalink / raw)
To: junkio, Jim MacBaine; +Cc: git
In-Reply-To: <20060326142505.GL18185@pasky.or.cz>
Hi,
Dear diary, on Sun, Mar 26, 2006 at 04:25:05PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Mon, Feb 27, 2006 at 03:43:32PM CET, I got a letter
> where Jim MacBaine <jmacbaine@gmail.com> said that...
> > Many packages put empty directories under /etc, and although only a
> > few of those directories are actually needed, the automatic removal of
> > those packages will fail if I remove the empty directories manually.
> > Equally, the removal will fail, if I put a .placeholder file into
> > those directories and cg-add it. Is there a simple way out?
>
> this is caused by git-ls-files behaviour - we now call it with
> the --directory argument which is nice since it will show a non-empty
> unknown directory as a single entry and won't list all its contents.
> What is not so nice is the side-effect you are describing, and I tend
> to agree that if the directory is empty, it should not be listed.
it turned out that cg-clean depends on the original behaviour (and it
makes sense there, we want to purge even empty directories). Therefore
this patch will preserve the old behaviour but add an option
--no-empty-directory. When that gets propagated to Git releases, I will
use it in cg-status.
---
Without the --directory flag, git-ls-files wouldn't ever list directories,
producing no output for empty directories, which is good since they cannot
be added and they bear no content, not even untracked content (if Git ever
starts tracking directories on their own, this should obviously change since
the content notion will change).
With the --directory flag however, git-ls-files would list even empty
directories. This may be good in some situations but sometimes you want to
prevent that. This patch adds a --no-empty-directory option which makes
git-ls-files omit empty directories.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
Documentation/git-ls-files.txt | 3 +++
ls-files.c | 33 +++++++++++++++++++++++++--------
2 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index e813f84..980c5c9 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -52,6 +52,9 @@ OPTIONS
If a whole directory is classified as "other", show just its
name (with a trailing slash) and not its whole contents.
+--no-empty-directory::
+ Do not list empty directories. Has no effect without --directory.
+
-u|--unmerged::
Show unmerged files in the output (forces --stage)
diff --git a/ls-files.c b/ls-files.c
index e42119c..83b0a3b 100644
--- a/ls-files.c
+++ b/ls-files.c
@@ -20,6 +20,7 @@ static int show_unmerged = 0;
static int show_modified = 0;
static int show_killed = 0;
static int show_other_directories = 0;
+static int hide_empty_directories = 0;
static int show_valid_bit = 0;
static int line_terminator = '\n';
@@ -258,11 +259,12 @@ static int dir_exists(const char *dirnam
* Also, we ignore the name ".git" (even if it is not a directory).
* That likely will not change.
*/
-static void read_directory(const char *path, const char *base, int baselen)
+static int read_directory(const char *path, const char *base, int baselen)
{
- DIR *dir = opendir(path);
+ DIR *fdir = opendir(path);
+ int contents = 0;
- if (dir) {
+ if (fdir) {
int exclude_stk;
struct dirent *de;
char fullname[MAXPATHLEN + 1];
@@ -270,7 +272,7 @@ static void read_directory(const char *p
exclude_stk = push_exclude_per_directory(base, baselen);
- while ((de = readdir(dir)) != NULL) {
+ while ((de = readdir(fdir)) != NULL) {
int len;
if ((de->d_name[0] == '.') &&
@@ -288,6 +290,7 @@ static void read_directory(const char *p
switch (DTYPE(de)) {
struct stat st;
+ int subdir, rewind_base;
default:
continue;
case DT_UNKNOWN:
@@ -301,22 +304,32 @@ static void read_directory(const char *p
case DT_DIR:
memcpy(fullname + baselen + len, "/", 2);
len++;
+ rewind_base = nr_dir;
+ subdir = read_directory(fullname, fullname,
+ baselen + len);
if (show_other_directories &&
- !dir_exists(fullname, baselen + len))
+ (subdir || !hide_empty_directories) &&
+ !dir_exists(fullname, baselen + len)) {
+ // Rewind the read subdirectory
+ while (nr_dir > rewind_base)
+ free(dir[--nr_dir]);
break;
- read_directory(fullname, fullname,
- baselen + len);
+ }
+ contents += subdir;
continue;
case DT_REG:
case DT_LNK:
break;
}
add_name(fullname, baselen + len);
+ contents++;
}
- closedir(dir);
+ closedir(fdir);
pop_exclude_per_directory(exclude_stk);
}
+
+ return contents;
}
static int cmp_name(const void *p1, const void *p2)
@@ -696,6 +709,10 @@ int main(int argc, const char **argv)
show_other_directories = 1;
continue;
}
+ if (!strcmp(arg, "--no-empty-directory")) {
+ hide_empty_directories = 1;
+ continue;
+ }
if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
/* There's no point in showing unmerged unless
* you also show the stage information.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time. I think
I have forgotten this before.
^ permalink raw reply related
* Re: Following renames
From: Timo Hirvonen @ 2006-03-26 16:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: pasky, git
In-Reply-To: <7virq1sywj.fsf@assigned-by-dhcp.cox.net>
On Sat, 25 Mar 2006 18:49:48 -0800
Junio C Hamano <junkio@cox.net> wrote:
> Looking at the evolution of rev-list.c file itself was a good
> exercise to realize that rename tracking (more specifically,
> having whatchanged to follow renames) is not such a useful
> thing (at least for me).
It would be useful for me. I had all files organized in subdirectories,
but then noticed it was not a good idea because make does not play nicely
with subdirs, so I moved all files to the top-level directory.
Now
git-whatchanged -p file.c
stops at the big rename. To continue I have to do
git-whatchanged -p <some-commit> -- <old-filename>
> Another example. Today's tar-tree updates have one interesting
> function I think should belong to strbuf.c, and before merging
> it to the mainline, I may move that function from tar-tree.c to
> strbuf.c. After that happens, if I run "whatchanged strbuf.c"
> to see where that function came from, I would want it to notice
> it came from tar-tree.c, although it is not a rename at all.
> Just one function moved from a file to another.
Yes, in this case you can do
$ git-whatchanged strbuf.c
$ git-whatchanged tar-tree.c
but after rename...
$ git-whatchanged old-file.c
fatal: 'old-file.c': No such file or directory
$ touch old-file.c
$ git-whatchanged old-file.c
Hah, it worked!
Hmm... this works too without the touch-hack:
$ git-whatchanged file.c old-file.c
I wish I had known this before.
> What this suggests is that switching the set of paths to follow
> while traversing ancestry chain needs to depend on which part of
> the original file you are interested in. Marking "this commit
> renames (or copies) file A to file B" is not that useful -- for
> that matter, detecting at runtime like we currently do is not
> better either. If a file A and file B were cleaned up and
> merged into a single file C, which is in the tip of the tree,
> which one you would want whatchanged to switch following depends
> on which part of the C you were interested in.
OK, maybe following renames is not such a good idea. But for GUIs
(gitk, qgit) following renames or even file merges (select a file to
follow by clicking it) would be a big plus.
--
http://onion.dynserv.net/~timo/
^ permalink raw reply
* Re: Following renames
From: Jakub Narebski @ 2006-03-26 16:31 UTC (permalink / raw)
To: git
In-Reply-To: <7virq1sywj.fsf@assigned-by-dhcp.cox.net>
I wonder what is the most common case in the Linux kernel or git.
1.) renaming the file in the same directory, old-file.c to new-file.c?
2.) moving file to other directory (project reorganization),
old-dir/file.c to new-dir/file.c?
3.) splitting file into modules, huge-file.c to file1.c, file2.c?
4.) copying fragment of one file to other?
5.) moving fragment of code from one file to other?
--
Jakub Narebski
Warsaw, Poland
^ permalink raw reply
* Re: Following renames
From: Linus Torvalds @ 2006-03-26 16:33 UTC (permalink / raw)
To: Petr Baudis; +Cc: Ryan Anderson, git
In-Reply-To: <20060326100717.GD18185@pasky.or.cz>
On Sun, 26 Mar 2006, Petr Baudis wrote:
>
> Huh? I don't see that now (and caps don't help me see it better). That's
> certainly not what is in [1], and I don't see _how_ to detect the
> renames in this case, and what would I be actually doing git-ls-tree for
> when I've already detected the rename. Based on [1], I'd be doing
> git-ls-tree merely to detect that a file _disappeared_ in the first
> place, I have to do other stuff to detect the renames themselves.
No, the point is that "git-rev-list" already does all of [1] in the core.
If you do
git-rev-list --parents --remove-empty $REV -- $filename
then you'll get the whole history for that filename. When it ends, you
know the file went away, and then you do basically _one_ "where the hell
did it go" thing.
And yes, it's not git-ls-tree (unless you only want to follow pure
renames), it's actually one "git-diff-tree -M $lastrev". Then you just
continue with the new filename (and do another "git-rev-list" until you
hit the next rename).
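The "where the hell did it go" step can be sketched as a small filter over the raw git-diff-tree output. This assumes the usual raw format, where a rename line carries an R status followed by the old and new path separated by tabs; old_name_of is a hypothetical helper name, and paths containing backslashes or newlines are not handled:

```sh
#!/bin/sh
# Hypothetical helper: read "git-diff-tree -M" raw output on stdin and
# print the previous name of the path given as $1, if it was renamed.
# A raw rename line looks like:
#   :100644 100644 <old-id> <new-id> R100<TAB>old-path<TAB>new-path
old_name_of() {
    awk -F '\t' -v new="$1" '
        {
            # Field 1 holds modes, ids and status; the status is its last word.
            n = split($1, meta, " ")
            if (meta[n] ~ /^R/ && $3 == new) { print $2; exit }
        }'
}
```

With that, something like "git-diff-tree -M $lastrev | old_name_of $filename" would give the name to hand to the next git-rev-list run; an empty result means no rename was detected and the history really ends there.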
Linus
^ permalink raw reply
* Re: Following renames
From: Linus Torvalds @ 2006-03-26 16:43 UTC (permalink / raw)
To: Timo Hirvonen; +Cc: Junio C Hamano, pasky, git
In-Reply-To: <20060326190836.dbe95a72.tihirvon@gmail.com>
On Sun, 26 Mar 2006, Timo Hirvonen wrote:
>
> $ git-whatchanged old-file.c
> fatal: 'old-file.c': No such file or directory
>
> $ touch old-file.c
> $ git-whatchanged old-file.c
>
> Hah, it worked!
It worked even before:
git-whatchanged -- old-file.c
always works.
If you think of the "filename spec" as _always_ having to have a "--" to
separate the filenames from the other arguments, you're thinking the right
way. Then, there's a _shorthand_ for existing files, where we allow users
to be lazy (because _I_ am very lazy indeed), which allows dropping of the
"--", but then the code requires that the filenames are real filenames as
of now.
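As a rough illustration of that rule only (not of what git-rev-parse actually does, since revision arguments are ignored entirely here), a sketch with a hypothetical check_paths function: before "--" a name has to exist on disk right now, after "--" any name is taken as a path.

```sh
#!/bin/sh
# Hypothetical sketch: accept path arguments the way described above.
# Revision arguments are deliberately left out of the picture.
check_paths() {
    after_dashdash=0
    for arg in "$@"; do
        if [ "$after_dashdash" = 0 ] && [ "$arg" = "--" ]; then
            # Everything after this point is a path, no questions asked.
            after_dashdash=1
        elif [ "$after_dashdash" = 1 ] || [ -e "$arg" ] || [ -L "$arg" ]; then
            printf 'path: %s\n' "$arg"
        else
            # Shorthand form: the name must be a real file as of now.
            printf "fatal: '%s': No such file or directory\n" "$arg" >&2
            return 1
        fi
    done
}
```

So "check_paths old-file.c" fails once old-file.c is gone, while "check_paths -- old-file.c" accepts it.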
> Hmm... this works too without the touch-hack:
>
> $ git-whatchanged file.c old-file.c
>
> I wish I had known this before.
Actually, it -shouldn't- work. It's just that "git-rev-parse" isn't as
anal as it should be.
Here's a fix.
Linus
----
diff --git a/rev-parse.c b/rev-parse.c
index f90e999..104b1e2 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -172,7 +172,9 @@ int main(int argc, char **argv)
char *dotdot;
if (as_is) {
- show_file(arg);
+ if (show_file(arg) && as_is < 2)
+ if (lstat(arg, &st) < 0)
+ die("'%s': %s", arg, strerror(errno));
continue;
}
if (!strcmp(arg,"-n")) {
@@ -192,7 +194,7 @@ int main(int argc, char **argv)
if (*arg == '-') {
if (!strcmp(arg, "--")) {
- as_is = 1;
+ as_is = 2;
/* Pass on the "--" if we show anything but files.. */
if (filter & (DO_FLAGS | DO_REVS))
show_file(arg);
^ permalink raw reply related
* Re: Following renames
From: Linus Torvalds @ 2006-03-26 16:46 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
In-Reply-To: <e06fl8$p9f$1@sea.gmane.org>
On Sun, 26 Mar 2006, Jakub Narebski wrote:
>
> I wonder what is the most common case in the Linux kernel or git.
>
> 1.) renaming the file in the same directory, old-file.c to new-file.c?
The kernel uses subdirectories extensively, and a lot of renames (most of
them, I'd say) are because of that subdirectory structure.
So the same-directory case is the unusual one, I'd say.
> 3.) splitting file into modules, huge-file.c to file1.c, file2.c?
> 4.) copying fragment of one file to other?
> 5.) moving fragment of code from one file to other?
I'd say that (5) is very common. And (4) happens a lot under certain
circumstances (new driver, new architecture, new filesystem..).
Doing (3) happens, but probably less often than it should ;/
Linus
^ permalink raw reply
* Re: Following renames
From: Jakub Narebski @ 2006-03-26 17:10 UTC (permalink / raw)
To: git
In-Reply-To: <Pine.LNX.4.64.0603260843250.15714@g5.osdl.org>
Linus Torvalds wrote:
> On Sun, 26 Mar 2006, Jakub Narebski wrote:
>>
>> I wonder what is the most common case in the Linux kernel or git.
>>
>> 1.) renaming the file in the same directory, old-file.c to new-file.c?
>> 2.) moving file to other directory (project reorganization),
>> old-dir/file.c to new-dir/file.c?
> The kernel uses subdirectories extensively, and a lot of renames (most of
> them, I'd say) are because of that subdirectory structure.
>
> So the same-directory case is the unusual one, I'd say.
If (2) is common enough, then the discussed improvement to rename detection,
namely comparing basenames as a basis for candidate selection, is a good idea.
I wonder how common (2) is compared to (1)+(2), i.e. a move to another
directory combined with a rename, old-dir/old-file.c to
new-dir/new-subdir/new-file.c.
>> 3.) splitting file into modules, huge-file.c to file1.c, file2.c?
>> 4.) copying fragment of one file to other?
>> 5.) moving fragment of code from one file to other?
>
> I'd say that (5) is very common. And (4) happens a lot under certain
> circumstances (new driver, new architecture, new filesystem..).
>
> Doing (3) happens, but probably less often than it should ;/
Detecting (4) and (5) fast (i.e. for merges) without auxiliary (helper)
information would probably be hard. For interrogation/porcelain-ish commands
(like pickaxe) it would probably be easier.
--
Jakub Narebski
Warsaw, Poland
^ permalink raw reply