* Default remote branch for local branch
From: Pavel Roskin @ 2006-04-01 1:48 UTC (permalink / raw)
To: git
Hello!
This is not a ready-to-use proposal, but I think my message can prompt
useful changes in GIT or in the "porcelain".
Let's see how remote changes end up in the head branch of the local
repository (sorry for possible mistakes in terminology):
branch in "remote" local index,
a remote -------> branch -------> branch ------> tree
repository (e.g. origin) (e.g. master)
Brought in sync by:
fetch merge checkout
Relationship stored in:
.git/remotes nowhere!!! .git/HEAD
If I fetch a remote branch, git knows where to get changes. If I change
the current branch, git remembers that. But git doesn't remember the
relationship between branches mirroring the remote branches and branches
used for local development.
There are situations when I want to work for a significant time on a
certain branch. Significant time means that changes are made by
others, and I'm supposed to integrate them and make more changes. Yet I
may want to take advantage of some patches from another team that uses
the other repository.
I want GIT and porcelains work the same way if I'm working with
repository 1 or repository 2. As it stands now, git gives preferential
treatment to the "origin" branch. Whatever branch I'm on, "git-pull"
will pull from "origin". I believe the special role of the "origin"
branch should be reduced to cg-clone and similar commands.
I think it would be convenient to have git remember the remote branch
for the given local branch. git should know that if HEAD points to
"local-B", "git-fetch" should fetch from "remote-B", not from "origin".
To implement this, git would have to implement attributes for local
branches (other ideas are welcome). Currently, there are no attributes
for local branches other than the top SHA1 in .git/refs/heads/
Once at the topic of branch attributes, let's see what else could be
useful? I think that would be the creator of the branch and the comment
(e.g. "this is for tested commits only, to be merged tomorrow" etc).
Where can this data be stored? Probably in blobs pointed to by refs in
e.g. .git/refs/branch-data/ or just in plain files e.g.
under .git/branch-data/
I know, it sounds like a lot of work to save a few keystrokes. But I
think it may be worth it. Working on different branches should have the
same "look and feel". Otherwise, git would repeat a design error of
CVS, where the head branch was given preference e.g. for date based
updates.
I'm sorry, reading this mailing list is beyond my capabilities, so
certain overlaps with other postings may be expected, unless I'm
suggesting something totally off-base :-)
--
Regards,
Pavel Roskin
^ permalink raw reply
* Re: Moving a file back to an earlier revision.
From: David Ho @ 2006-03-31 22:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vacb6thc7.fsf@assigned-by-dhcp.cox.net>
> Remember, a branch in git is very cheap, and is a powerful way
> to keep track of things while you decide which alternate
> universe to take. And even after you decide, you could always
> look at and even build on the other universe.
I feel embarrassed to say this but in my branch there are commits to
the driver and other commits for the board so it looks more like
---0---D1---B1---B2---D2---B3---B4---
D* - driver changes
B* - board changes
So to go back to the 0 state I lose my board changes. But I hope what
I did (in my reply to Linus) is very close to your idea of having
separate branches.
David
^ permalink raw reply
* Re: Moving a file back to an earlier revision.
From: David Ho @ 2006-03-31 22:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603311324040.27203@g5.osdl.org>
On 3/31/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Don't revert.
>
> Just pick the point you want to start testing his patch at (with gitk, for
> example, just cut-and-paste the sha1), and do
>
> git checkout -b test-better-fix <sha1>
>
> which creates a new branch ("test-better-fix") that starts at that point,
> and checks it out.
I forget to mention I have also in my branch changes necessary to run
on my board. So what I did was
git-branch test-better-fix my-branch
git-checkout test-better-fix
git-diff commit(my-fixes)..commit(original) filename | git-apply
git-commit
>
> Then, just apply the patch, and off you go. You now have _both_ his patch
> and your own series in separate branches, so you can cherry-pick and do
> other things (like do a "diff" between branches - which can sometimes be
> useful too to verify that the two branches end up fixing all the same
> problems).
>
Yes, good point.
Thanks, David
^ permalink raw reply
* Re: Moving a file back to an earlier revision.
From: Junio C Hamano @ 2006-03-31 21:49 UTC (permalink / raw)
To: David Ho; +Cc: git
In-Reply-To: <4dd15d180603311332p60fa1867nc303bd92d515b4e0@mail.gmail.com>
"David Ho" <davidkwho@gmail.com> writes:
>> I am working on a board port on a separate branch. The branch has
>> accumulated several revision of changes to a driver I worked on. Now,
>> someone has come along with a better fix so I want to help test his
>> patch. To do that I have to revert my changes to that driver (several
>> revisions back) before I can apply his patch.
>>
>> What would be a convenient way to do that with git?
>
> Sorry I might already have found it.
>
> File revisions
>
> +----+----+
> 1 2 3
>
> git diff commit(3)..commit(1) filename | git-apply
[please do not top-post].
That lets you go back to the state before 1, so what you are
doing is to start from here:
---0---1---2---3
and
---0---1---2---3---*
where * has the same tree as 0, and then on top of that you
apply his patch:
---0---1---2---3---X
But what if you find a room for further improvements in his
patch? You could commit X (which is revert of 321 *and* his
patch) and then build on top of it, like this:
---0---1---2---3---X---Y
and feed him "diff X..Y" back.
However.
What Linus said is more natural in git. Starting from the same
picture, you do this:
X (side branch to test his patch on)
/
---0---1---2---3 (your original branch)
You apply his patch to a new branch. You could even make
further improvements like this:
X---Y
/
---0---1---2---3
And if you decide Y is better than your version 3 after all, you
can switch to his branch and then pick up anything important
from your development track between 0..3 on top of Y by
cherry-picking, and you can even later discard your original
development track. On the other hand, if you end up deciding 3
is better than Y after all, you can just discard the whole side
branch.
Remember, a branch in git is very cheap, and is a powerful way
to keep track of things while you decide which alternate
universe to take. And even after you decide, you could always
look at and even build on the other universe.
^ permalink raw reply
* Re: Moving a file back to an earlier revision.
From: David Ho @ 2006-03-31 21:32 UTC (permalink / raw)
To: git
In-Reply-To: <4dd15d180603311313t7781f2ebk616276e9134f6472@mail.gmail.com>
Sorry I might already have found it.
File revisions
+----+----+
1 2 3
git diff commit(3)..commit(1) filename | git-apply
David
On 3/31/06, David Ho <davidkwho@gmail.com> wrote:
> Hi,
>
> Another user question. Other may actually have similar needs.
>
> I am working on a board port on a separate branch. The branch has
> accumulated several revision of changes to a driver I worked on. Now,
> someone has come along with a better fix so I want to help test his
> patch. To do that I have to revert my changes to that driver (several
> revisions back) before I can apply his patch.
>
> What would be a convenient way to do that with git?
>
> TIA, David
>
^ permalink raw reply
* Re: Moving a file back to an earlier revision.
From: Linus Torvalds @ 2006-03-31 21:26 UTC (permalink / raw)
To: David Ho; +Cc: git
In-Reply-To: <4dd15d180603311313t7781f2ebk616276e9134f6472@mail.gmail.com>
On Fri, 31 Mar 2006, David Ho wrote:
>
> I am working on a board port on a separate branch. The branch has
> accumulated several revision of changes to a driver I worked on. Now,
> someone has come along with a better fix so I want to help test his
> patch. To do that I have to revert my changes to that driver (several
> revisions back) before I can apply his patch.
>
> What would be a convenient way to do that with git?
Don't revert.
Just pick the point you want to start testing his patch at (with gitk, for
example, just cut-and-paste the sha1), and do
git checkout -b test-better-fix <sha1>
which creates a new branch ("test-better-fix") that starts at that point,
and checks it out.
Then, just apply the patch, and off you go. You now have _both_ his patch
and your own series in separate branches, so you can cherry-pick and do
other things (like do a "diff" between branches - which can sometimes be
useful too to verify that the two branches end up fixing all the same
problems).
Linus
^ permalink raw reply
* Moving a file back to an earlier revision.
From: David Ho @ 2006-03-31 21:13 UTC (permalink / raw)
To: git
Hi,
Another user question. Other may actually have similar needs.
I am working on a board port on a separate branch. The branch has
accumulated several revision of changes to a driver I worked on. Now,
someone has come along with a better fix so I want to help test his
patch. To do that I have to revert my changes to that driver (several
revisions back) before I can apply his patch.
What would be a convenient way to do that with git?
TIA, David
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Linus Torvalds @ 2006-03-31 20:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfykyuzc2.fsf@assigned-by-dhcp.cox.net>
On Fri, 31 Mar 2006, Junio C Hamano wrote:
>
> Yes, that is exactly the fix I have in "pu" -- I suspect you
> replied before getting to my response last night.
No, I just get too much email, and hadn't read your response carefully
enough, so I just missed the fact that you had already found the problem.
Linus
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Junio C Hamano @ 2006-03-31 20:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603311135290.27203@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> The fix is trivial - with the new get_revision() organization, the
> BOUNDARY case special-case actually goes away entirely, and this trivial
> patch (on top of my 2/2 patch) should just fix it.
Yes, that is exactly the fix I have in "pu" -- I suspect you
replied before getting to my response last night.
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Linus Torvalds @ 2006-03-31 19:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1wwjvwz9.fsf@assigned-by-dhcp.cox.net>
On Fri, 31 Mar 2006, Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > This is an absolutely huge deal for anything like "git log -- <pathname>",
> > but also for some things that we don't do yet - like the "find where
> > things changed" logic I've described elsewhere, where we want to find the
> > previous revision that changed a file.
> >...
> > Btw, don't even bother testing this with the git archive. git itself is so
> > small that parsing the whole revision history for it takes about a second
> > even with path limiting.
>
> By the way, I forgot to praise you ;-).
>
> Even on a fast machine, the old one was not very useful, but
> this one is _instantaneous_. Very good job.
Indeed. It's why I'd really like this to be merged before 1.3.0 - it moves
a certain class of problems from "it works" to "it's actually usable".
Now, the _real_ usage I foresee (which just wasn't practical before) is
the interactive annotation thing - this won't help a _full_file_ annotate
(which usually needs to go back to the very first version of a file
anyway), but it should make it possible to play with an incremental one
(the "graphical git-whatchanged" kind).
But even just the "git log" difference makes it worth it.
Linus
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Linus Torvalds @ 2006-03-31 19:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7v3bgzxgbg.fsf@assigned-by-dhcp.cox.net>
On Thu, 30 Mar 2006, Junio C Hamano wrote:
>
> There already was a report that --boundary stuff is not quite
> right, so what you are seeing might be that the new code exposes
> its original breakage even more. I haven't looked into the
> breakage of the original version yet either, so I cannot really
> say how your change breaks it.
[ Awake and thinking about this again ]
No, I think the new breakage was just because I was being a stupid ass.
When I converted get_revision() to do the parent parsing up at the top
instead of at the bottom, I just didn't think correctly about your new
BOUNDARY code, and my conversion for that was just wrong. Part of the "do
the parents early" code was also removing the current commit early, so
suddenly your BOUNDARY case for doing that thing was just removing some
other random commit instead, which was obviously wrong.
The fix is trivial - with the new get_revision() organization, the
BOUNDARY case special-case actually goes away entirely, and this trivial
patch (on top of my 2/2 patch) should just fix it.
At least it passes my tests again now, and looking at the code everything
seems sane.
Linus
---
diff --git a/revision.c b/revision.c
index 0e3f074..753633e 100644
--- a/revision.c
+++ b/revision.c
@@ -796,18 +796,6 @@ struct commit *get_revision(struct rev_i
if (revs->parents)
rewrite_parents(commit);
}
- /* More to go? */
- if (revs->max_count) {
- if (commit->object.flags & BOUNDARY) {
- /* this is already uninteresting,
- * so there is no point popping its
- * parents into the list.
- */
- struct commit_list *it = revs->commits;
- revs->commits = it->next;
- free(it);
- }
- }
commit->object.flags |= SHOWN;
return commit;
} while (revs->commits);
^ permalink raw reply related
* Re: windows problems summary
From: Eric W. Biederman @ 2006-03-31 19:25 UTC (permalink / raw)
To: Alex Riesen; +Cc: Git Mailing List
In-Reply-To: <81b0412b0603020649u99a2035i3b8adde8ddce9410@mail.gmail.com>
"Alex Riesen" <raa.lkml@gmail.com> writes:
> This is just to summarize all the problems which make porting to that
> thing so boring. Maybe if we have them all on one page, it'd be easier
> to locate the workarounds (it can be one thread, for example).
>
> 1. opened and mmaped files can't be removed or renamed
> (caused workaround with reading index in memory)
> 2. command can safely contain only one argument
> (breaks and complicates passing things between processes)
> 3. no fork
> (slows down and complicates passing things between processes)
> 4. non-unix permissions model
> (breaks x-attr)
> 5. real slow filesystems and caching
> (makes everything slow. I noticed I'm trying to avoid git status!).
> Caused workaround with manual checkout)
> 6. real slow program startup
> (makes everything slow, eventually may cause everything being put
> in one super-executable, just to avoid spawning new processes,
> with all associated problems. Makes scripting harder)
>
> I hope this message can be a start of a big porting thread,
> even though it is only about windows at the moment.
Not to forget make install gets confused when there
is a file named INSTALL in the git directory.
Eric
^ permalink raw reply
* Re: Possible --boundary bug
From: Marco Costalba @ 2006-03-31 16:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Paul Mackerras
In-Reply-To: <7v64lvvyev.fsf@assigned-by-dhcp.cox.net>
On 3/31/06, Junio C Hamano <junkio@cox.net> wrote:
> "Marco Costalba" <mcostalba@gmail.com> writes:
>
> > Sorry, the good description is below, please ignore the wrong previous one.
>
> I think this patch should fix it.
>
Yes. It works for me.
Thanks
Marco
^ permalink raw reply
* Re: [PATCH] cvsimport: use git-update-ref when updating
From: Johannes Schindelin @ 2006-03-31 11:14 UTC (permalink / raw)
To: Eric Wong; +Cc: Junio C Hamano, git
In-Reply-To: <20060331103230.GB15159@hand.yhbt.net>
Hi,
On Fri, 31 Mar 2006, Eric Wong wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > Hi,
> >
> > On Thu, 30 Mar 2006, Junio C Hamano wrote:
> >
> > > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> > >
> > > > This simplifies code, and also fixes a subtle bug: when importing in a
> > > > shared repository, where another user last imported from CVS, cvsimport
> > > > used to complain that it could not open <branch> for update.
> > >
> > > The second hunk look sensible but I do not know about "use Fcntl"
> > > since I do not see anything you are adding that starts to use it...
> >
> > O_EXCL. Without "use Fcntl;" perl says I am not allowed to use bareword
> > things in strict mode or some such.
>
> Huh? I still don't see where O_EXCL is used.
Yes. I did not make that point clear enough, I guess. My first approach
was to reimplement git-update-ref in perl, which worked well enough, until
I remembered that you could just call programs from perl :-)
> > > > + system("git-update-ref refs/heads/$branch $cid") == 0
>
> Passing args to system() in list form is always preferable in case
> there's a shell-unfriendly variable:
>
> system("git-update-ref", "refs/heads/$branch", $cid) == 0
Old habit dies hard.
---
git-cvsimport.perl | 6 +-----
1 files changed, 1 insertions(+), 5 deletions(-)
diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 3728294..fe6298b 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -677,11 +677,7 @@ my $commit = sub {
waitpid($pid,0);
die "Error running git-commit-tree: $?\n" if $?;
- open(C,">$git_dir/refs/heads/$branch")
- or die "Cannot open branch $branch for update: $!\n";
- print C "$cid\n"
- or die "Cannot write branch $branch for update: $!\n";
- close(C)
+ system("git-update-ref", "refs/heads/$branch", $cid) == 0
or die "Cannot write branch $branch for update: $!\n";
if($tag) {
^ permalink raw reply related
* Re: [PATCH] cvsimport: use git-update-ref when updating
From: Eric Wong @ 2006-03-31 10:32 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.63.0603311207270.20122@wbgn013.biozentrum.uni-wuerzburg.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 30 Mar 2006, Junio C Hamano wrote:
>
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> > > This simplifies code, and also fixes a subtle bug: when importing in a
> > > shared repository, where another user last imported from CVS, cvsimport
> > > used to complain that it could not open <branch> for update.
> >
> > The second hunk look sensible but I do not know about "use Fcntl"
> > since I do not see anything you are adding that starts to use it...
>
> O_EXCL. Without "use Fcntl;" perl says I am not allowed to use bareword
> things in strict mode or some such.
Huh? I still don't see where O_EXCL is used.
> > > + system("git-update-ref refs/heads/$branch $cid") == 0
Passing args to system() in list form is always preferable in case
there's a shell-unfriendly variable:
system("git-update-ref", "refs/heads/$branch", $cid) == 0
--
Eric Wong
^ permalink raw reply
* Re: [PATCH] cvsimport: use git-update-ref when updating
From: Johannes Schindelin @ 2006-03-31 10:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vk6ab1iy2.fsf@assigned-by-dhcp.cox.net>
Hi,
On Thu, 30 Mar 2006, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > This simplifies code, and also fixes a subtle bug: when importing in a
> > shared repository, where another user last imported from CVS, cvsimport
> > used to complain that it could not open <branch> for update.
>
> The second hunk look sensible but I do not know about "use Fcntl"
> since I do not see anything you are adding that starts to use it...
O_EXCL. Without "use Fcntl;" perl says I am not allowed to use bareword
things in strict mode or some such.
Ciao,
Dscho
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Junio C Hamano @ 2006-03-31 8:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603301652531.27203@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> This is an absolutely huge deal for anything like "git log -- <pathname>",
> but also for some things that we don't do yet - like the "find where
> things changed" logic I've described elsewhere, where we want to find the
> previous revision that changed a file.
>...
> Btw, don't even bother testing this with the git archive. git itself is so
> small that parsing the whole revision history for it takes about a second
> even with path limiting.
By the way, I forgot to praise you ;-).
Even on a fast machine, the old one was not very useful, but
this one is _instantaneous_. Very good job.
$ PAGER=cat GIT_DIR=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ \
/usr/bin/time git log -1 --pretty=short -- drivers/
commit ce362c009250340358a7221f3cdb7954cbf19c01
Merge: 064c94f... cd7a920...
Author: Linus Torvalds <torvalds@g5.osdl.org>
Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6
15.44user 0.19system 0:25.11elapsed 62%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+18050minor)pagefaults 0swaps
$ PAGER=cat GIT_DIR=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ \
/usr/bin/time ./git.pu log -1 --pretty=short -- drivers/
commit ce362c009250340358a7221f3cdb7954cbf19c01
Merge: 064c94f... cd7a920...
Author: Linus Torvalds <torvalds@g5.osdl.org>
Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+388minor)pagefaults 0swaps
^ permalink raw reply
* Re: Possible --boundary bug
From: Junio C Hamano @ 2006-03-31 7:58 UTC (permalink / raw)
To: Marco Costalba; +Cc: git, Paul Mackerras
In-Reply-To: <e5bfff550603301255j52c68963v4b8eebea697eeecf@mail.gmail.com>
"Marco Costalba" <mcostalba@gmail.com> writes:
> Sorry, the good description is below, please ignore the wrong previous one.
I think this patch should fix it.
-- >8 --
rev-list --boundary: fix re-injecting boundary commits.
Marco reported that
$ git rev-list --boundary --topo-order --parents 5aa44d5..ab57c8d
misses these two boundary commits.
c649657501bada28794a30102d9c13cc28ca0e5e
eb38cc689e84a8fd01c1856e889fe8d3b4f1bfb4
Indeed, we can see that gitk shows these two commits at the
bottom, because the --boundary code failed to output them.
The code did not check to avoid pushing the same uninteresting
commit twice to the result list. I am not sure why this fixes
the reported problem, but this seems to fix it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/revision.c b/revision.c
index abc8745..c2a95aa 100644
--- a/revision.c
+++ b/revision.c
@@ -420,24 +420,33 @@ static void limit_list(struct rev_info *
p = &commit_list_insert(commit, p)->next;
}
if (revs->boundary) {
- list = newlist;
- while (list) {
+ /* mark the ones that are on the result list first */
+ for (list = newlist; list; list = list->next) {
struct commit *commit = list->item;
+ commit->object.flags |= TMP_MARK;
+ }
+ for (list = newlist; list; list = list->next) {
+ struct commit *commit = list->item;
struct object *obj = &commit->object;
- struct commit_list *parent = commit->parents;
- if (obj->flags & (UNINTERESTING|BOUNDARY)) {
- list = list->next;
- continue;
- }
- while (parent) {
+ struct commit_list *parent;
+ if (obj->flags & UNINTERESTING)
+ continue;
+ for (parent = commit->parents;
+ parent;
+ parent = parent->next) {
struct commit *pcommit = parent->item;
- parent = parent->next;
if (!(pcommit->object.flags & UNINTERESTING))
continue;
pcommit->object.flags |= BOUNDARY;
+ if (pcommit->object.flags & TMP_MARK)
+ continue;
+ pcommit->object.flags |= TMP_MARK;
p = &commit_list_insert(pcommit, p)->next;
}
- list = list->next;
+ }
+ for (list = newlist; list; list = list->next) {
+ struct commit *commit = list->item;
+ commit->object.flags &= ~TMP_MARK;
}
}
revs->commits = newlist;
^ permalink raw reply related
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Junio C Hamano @ 2006-03-31 7:02 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603302225160.27203@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> So the following is wrong:
>
>> diff --git a/revision.c b/revision.c
>> index 0e3f074..a55c0d1 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -404,10 +404,6 @@ static void limit_list(struct rev_info *
>> list = list->next;
>> free(entry);
>>
>> - if (revs->max_age != -1 && (commit->date < revs->max_age))
>> - obj->flags |= UNINTERESTING;
>> - if (revs->unpacked && has_sha1_pack(obj->sha1))
>> - obj->flags |= UNINTERESTING;
>> add_parents_to_list(revs, commit, &list);
>> if (obj->flags & UNINTERESTING) {
>> mark_parents_uninteresting(commit);
>...
> ..but the later part of the patch looks fine (modulo testing, of course):
>
>> @@ -786,7 +773,9 @@ struct commit *get_revision(struct rev_i
>> if (revs->min_age != -1 && (commit->date > revs->min_age))
>> continue;
>> if (revs->max_age != -1 && (commit->date < revs->max_age))
>> - return NULL;
>> + continue;
>> + if (revs->unpacked && has_sha1_pack(commit->object.sha1))
>> + continue;
OK, so let's say I agree with you that --unpacked and --since
are "stop early" rules. I fully agree with that from usability
and implementation point of view, but I now wonder if the
"output filter" in get_revision() and "stop early" in limit_list()
would do the same thing. That is, if we are _otherwise_
limited, limit_list() would use them for "stop early" rule, but
if we end up not calling limit_list() we would simply filter
them and keep going (which is what my patch did for both
timestamp and packedness), or we could also stop there.
I am not sure which one is correct, but I suspect whichever we
do in get_revision() we would get inconsistent results between a
case where we call limit_list() and we do not. That is, the set
of commits we are going to show when --(topo|date)-order is
given and not given can be different. Is this acceptable?
I would say, from the implementation point of view, it should be
tolerated, but...
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Junio C Hamano @ 2006-03-31 6:45 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603302153350.27203@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> Sadly, it seems to react really badly to Junio's new --boundary logic for
> some reason that I haven't quite figured out yet.
There already was a report that --boundary stuff is not quite
right, so what you are seeing might be that the new code exposes
its original breakage even more. I haven't looked into the
breakage of the original version yet either, so I cannot really
say how your change breaks it.
> That reaction is independent of the actual pathname restriction, and seems
> to be related to how the --boundary logic expected
> pop_most_recent_commit() to work. In particular:
>
> ...
> if (commit->object.flags & BOUNDARY) {
> /* this is already uninteresting,
> * so there is no point popping its
> * parents into the list.
> */
>
> that code is magic, and seems to depend on something subtle going on with
> the list, and the incremental thing already popped the parent earlier and
> screwed up whatever magic that the BOUNDARY code depends on.
This was not so magic, but the magic was actually in the code
added to limit_list(). Usually, "newlist" consists interesting
commits, and what are found interesting initially but marked as
uninteresting when a different ancestry chain coming from an
uninteresting head leading to it was later discovered. The
magic code looks at still-interesting commits, and re-injects
its parents that are uninteresting to the list (and I just
spotted a bug there -- it does not check if what is being "re-"
injected are already on the list -- should I check for SEEN flag
there perhaps?), while marking them as boundary. This was done
to make sure that all the open-circle (in gitk) commits are on
the resulting list.
The part of the code you quoted is just a short-cut for not
doing pop_most_recent_commit() -- we used to have
pop_most_recent_commit() there, which pushed the parents of the
commit being processed into the list. Because we are processing
a boundary commit, we know it is uninteresting -- and by
definition, its parents are uninteresting and that is why it
just advances the list without calling pop_most_recent_commit(),
bypassing its side effect to push its parents into the list.
Since the new code has already advanced the list immediately
after the beginning of do {} block, I think you can remove the
entire "if (revs->max_count) {}" block. As the code currently
stands, you are skipping what happens to be next to the boundary
commit on the result list.
> Junio? I think you did some funky special case with BOUNDARY commits, and
> I broke it for you, can you look at the patch and see if you can see it?
> I'd really like to have the incremental path-limiter, because it really
> makes a huge difference in the usability of "git log pathname".
Oh, there is no question about making it streamable in more
cases is a good thing.
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Linus Torvalds @ 2006-03-31 6:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr74jxhp3.fsf@assigned-by-dhcp.cox.net>
On Thu, 30 Mar 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > The reason I put "RFC" in the subject line is that while I've validated it
> > various ways, like doing
> >
> > git-rev-list HEAD -- drivers/char/ | md5sum
> >
> > before-and-after on the kernel archive, it's "git-rev-list" after all. In
> > other words, it's that really really subtle and complex central piece of
> > software. So while I think this is important and should go in asap, I also
> > think it should get lots of testing and eyeballs looking at the code.
>
> Let me make sure I understand what is going on.
>
> Currently, limit_list(), which is called when revs->limited is
> set, serves two different purposes:
>
> * traverse the ancestry chain and mark ancestors of
> uninteresting commits also uninteresting;
Right.
We _also_ traverse the ancestry chain in the "walking" stage later in
get_revision(), but if we have a "limit" case, we'll pre-traverse
everything early in limit_list.
> * "simplify" each commit that is still interesting, by
> comparing the tree object of it and its parents.
Through "add_parent_to_list()", yes.
> We used to call limit_list() when we are limiting the commits by
> paths, because that was what the latter does as a side effect of
> add_parents_to_list(). You made it streamable by moving the
> call to get_revision() and not calling limit_list() when you do
> not have other reasons to call it.
Yes, and by using "add_parent_to_list()" instead of the
non-pathname-limit-aware "pop_most_recent_commit()".
So we don't call limit-list any more, because we do the same thing in
get_revision() that it used to do at limit time.
> You currently do not do this streaming optimization when you
> have to show simplified parents, because the streaming version
> traverses ancestry chain one step at a time as it goes, and you
> cannot obviously see who the final "fake" parent is until you
> simplify the parents enouogh.
Yes. For exactly the same reason some other things ause us to do
limit_list(): some ops simply need the fully connected end result in order
to be able to work correctly.
> * get_commit_reference() sets "limited" when the user gives ^exclude
> commit. no question about this --- we would need to see the
> reachability in this case.
Right.
> * when we are going to sort the result we are going to show,
> we set "limited" -- we cannot sort without knowing the set we
> are sorting.
Right.
> * giving commit timestamp limits via --max-age, --min-age
> etc. sets "limited". I have doubts about this.
I agree. I looked at it when I did the "rev-list.c" vs "revision.c" split,
but I wanted to do as few changes as possible, and for some reason we've
always considered "time" to cause us to limit things.
However, I think your patch is wrong. Even if you don't limit things when
we have a date specifier, you still want to stop doing traversal in
limit_list when you hit that date. Why? Because it might be limited for
some _other_ reason, eg due to --topo-order or or some other issue.
> * As to handling of --unpacked, exactly the same comment as
> "max-age" applies, but it might be even worse. Marking an
> unpacked one "uninteresting" means if a packed commit refers
> to loose commit, the loose one will be also marked
> uninteresting, I think.
>
> "--objects --unpacked" is an idiom to repack objects that are
> still loose, but I suspect it would do interesting thing if the
> commit is in a pack but its trees and blobs are still loose. Am
> I confused, or have I just spotted a potential (but hard to
> trigger) bug?
Regardless of where you do the "unpacked" test, it will always really end
up stopping when it hits a packed commit. So you won't ever get away from
that.
So same logic applies. Once you hit a packed commit, you should stop, both
in limit_list _and_ in get_revision(). Otherwise you'll do too much work
when trying to limit things, for no gain.
The fact is, "--unpacked" means that we traverse the part of the chain
that hasn't been packed yet. Anything that is packed automatically cuts
off parsing, whether there is anything else unpacked below it or not. It's
not a bug, it's a feature, and if you want to pack everything, you should
just do
git repack -a -d
and not use --unpacked.
Same goes for dates, btw. We've always stopped when we reached a certain
date, even though there _could_ be commits before it that are from a more
recent date (due to clocks being set wrong). Neither --unpacked nor
--since=xyzzy are meant to be any kind of "guarantees" that you get all
commits that match some pattern. They are just "stop early" rules.
Linus
So the following is wrong:
> diff --git a/revision.c b/revision.c
> index 0e3f074..a55c0d1 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -404,10 +404,6 @@ static void limit_list(struct rev_info *
> list = list->next;
> free(entry);
>
> - if (revs->max_age != -1 && (commit->date < revs->max_age))
> - obj->flags |= UNINTERESTING;
> - if (revs->unpacked && has_sha1_pack(obj->sha1))
> - obj->flags |= UNINTERESTING;
> add_parents_to_list(revs, commit, &list);
> if (obj->flags & UNINTERESTING) {
> mark_parents_uninteresting(commit);
> @@ -415,8 +411,6 @@ static void limit_list(struct rev_info *
> break;
> continue;
> }
> - if (revs->min_age != -1 && (commit->date > revs->min_age))
> - continue;
> p = &commit_list_insert(commit, p)->next;
> }
> if (revs->boundary) {
..but the later part of the patch looks fine (modulo testing, of course):
> @@ -543,32 +537,26 @@ int setup_revisions(int argc, const char
> }
> if (!strncmp(arg, "--max-age=", 10)) {
> revs->max_age = atoi(arg + 10);
> - revs->limited = 1;
> continue;
> }
> if (!strncmp(arg, "--min-age=", 10)) {
> revs->min_age = atoi(arg + 10);
> - revs->limited = 1;
> continue;
> }
> if (!strncmp(arg, "--since=", 8)) {
> revs->max_age = approxidate(arg + 8);
> - revs->limited = 1;
> continue;
> }
> if (!strncmp(arg, "--after=", 8)) {
> revs->max_age = approxidate(arg + 8);
> - revs->limited = 1;
> continue;
> }
> if (!strncmp(arg, "--before=", 9)) {
> revs->min_age = approxidate(arg + 9);
> - revs->limited = 1;
> continue;
> }
> if (!strncmp(arg, "--until=", 8)) {
> revs->min_age = approxidate(arg + 8);
> - revs->limited = 1;
> continue;
> }
> if (!strcmp(arg, "--all")) {
> @@ -635,7 +623,6 @@ int setup_revisions(int argc, const char
> }
> if (!strcmp(arg, "--unpacked")) {
> revs->unpacked = 1;
> - revs->limited = 1;
> continue;
> }
> *unrecognized++ = arg;
> @@ -786,7 +773,9 @@ struct commit *get_revision(struct rev_i
> if (revs->min_age != -1 && (commit->date > revs->min_age))
> continue;
> if (revs->max_age != -1 && (commit->date < revs->max_age))
> - return NULL;
> + continue;
> + if (revs->unpacked && has_sha1_pack(commit->object.sha1))
> + continue;
> if (revs->no_merges &&
> commit->parents && commit->parents->next)
> continue;
>
^ permalink raw reply
* Re: Gitk strangeness..
From: Alex Riesen @ 2006-03-31 6:27 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Junio C Hamano, git, Linus Torvalds
In-Reply-To: <17452.28122.129442.49226@cargo.ozlabs.ibm.com>
On 3/31/06, Paul Mackerras <paulus@samba.org> wrote:
>
> > The old gitk produced a denser graph, which wasn't perfect too, but
> > had a higher count of visible commit titles (and this is two-monitor
> > setup, too).
>
> I just pushed a new version which does better on this.
>
This one looks reallly much better! Thanks!
^ permalink raw reply
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Junio C Hamano @ 2006-03-31 6:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603301652531.27203@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> The reason I put "RFC" in the subject line is that while I've validated it
> various ways, like doing
>
> git-rev-list HEAD -- drivers/char/ | md5sum
>
> before-and-after on the kernel archive, it's "git-rev-list" after all. In
> other words, it's that really really subtle and complex central piece of
> software. So while I think this is important and should go in asap, I also
> think it should get lots of testing and eyeballs looking at the code.
Let me make sure I understand what is going on.
Currently, limit_list(), which is called when revs->limited is
set, serves two different purposes:
* traverse the ancestry chain and mark ancestors of
uninteresting commits also uninteresting;
* "simplify" each commit that is still interesting, by
comparing the tree object of it and its parents.
We used to call limit_list() when we are limiting the commits by
paths, because that was what the latter does as a side effect of
add_parents_to_list(). You made it streamable by moving the
call to get_revision() and not calling limit_list() when you do
not have other reasons to call it.
You currently do not do this streaming optimization when you
have to show simplified parents, because the streaming version
traverses ancestry chain one step at a time as it goes, and you
cannot obviously see who the final "fake" parent is until you
simplify the parents enouogh.
Now, I have some observations, not necessarily limited to this
patch but also to the code from the "master" version.
* get_commit_reference() sets "limited" when the user gives ^exclude
commit. no question about this --- we would need to see the
reachability in this case.
* when we are going to sort the result we are going to show,
we set "limited" -- we cannot sort without knowing the set we
are sorting.
* giving commit timestamp limits via --max-age, --min-age
etc. sets "limited". I have doubts about this. We currently
look at the commit timestamp in limit_list() and mark a
commit old enough to be "uninteresting" -- which makes
ancestor of such commit also uninteresting. Similarly
commits that are too young are not pushed into the result.
There is a filter in get_revision() to omit ineligible
commits by time range already, so I wonder if we can remove
that code from limit_list() and not set "limited" when these
timestamp ranges are given. This would let us stream even
more.
Admittably, ancestors of an old commit had better be even
older, so marking them uninteresting to stop the traversal
early is a good way to optimize the full-traversal case, but
not having to call limit_list() would help streaming usage.
Also I have doubts about correctness issue about the max-age
handling. Is it correct to mark a commit that is old enough
to be uninteresting, implying that its ancestors are all
uninteresting? With clock skew among people, it is possible
to make a merge commit that has parents one of which is "in
the future".
* As to handling of --unpacked, exactly the same comment as
"max-age" applies, but it might be even worse. Marking an
unpacked one "uninteresting" means if a packed commit refers
to loose commit, the loose one will be also marked
uninteresting, I think.
"--objects --unpacked" is an idiom to repack objects that are
still loose, but I suspect it would do interesting thing if the
commit is in a pack but its trees and blobs are still loose. Am
I confused, or have I just spotted a potential (but hard to
trigger) bug?
--
diff --git a/revision.c b/revision.c
index 0e3f074..a55c0d1 100644
--- a/revision.c
+++ b/revision.c
@@ -404,10 +404,6 @@ static void limit_list(struct rev_info *
list = list->next;
free(entry);
- if (revs->max_age != -1 && (commit->date < revs->max_age))
- obj->flags |= UNINTERESTING;
- if (revs->unpacked && has_sha1_pack(obj->sha1))
- obj->flags |= UNINTERESTING;
add_parents_to_list(revs, commit, &list);
if (obj->flags & UNINTERESTING) {
mark_parents_uninteresting(commit);
@@ -415,8 +411,6 @@ static void limit_list(struct rev_info *
break;
continue;
}
- if (revs->min_age != -1 && (commit->date > revs->min_age))
- continue;
p = &commit_list_insert(commit, p)->next;
}
if (revs->boundary) {
@@ -543,32 +537,26 @@ int setup_revisions(int argc, const char
}
if (!strncmp(arg, "--max-age=", 10)) {
revs->max_age = atoi(arg + 10);
- revs->limited = 1;
continue;
}
if (!strncmp(arg, "--min-age=", 10)) {
revs->min_age = atoi(arg + 10);
- revs->limited = 1;
continue;
}
if (!strncmp(arg, "--since=", 8)) {
revs->max_age = approxidate(arg + 8);
- revs->limited = 1;
continue;
}
if (!strncmp(arg, "--after=", 8)) {
revs->max_age = approxidate(arg + 8);
- revs->limited = 1;
continue;
}
if (!strncmp(arg, "--before=", 9)) {
revs->min_age = approxidate(arg + 9);
- revs->limited = 1;
continue;
}
if (!strncmp(arg, "--until=", 8)) {
revs->min_age = approxidate(arg + 8);
- revs->limited = 1;
continue;
}
if (!strcmp(arg, "--all")) {
@@ -635,7 +623,6 @@ int setup_revisions(int argc, const char
}
if (!strcmp(arg, "--unpacked")) {
revs->unpacked = 1;
- revs->limited = 1;
continue;
}
*unrecognized++ = arg;
@@ -786,7 +773,9 @@ struct commit *get_revision(struct rev_i
if (revs->min_age != -1 && (commit->date > revs->min_age))
continue;
if (revs->max_age != -1 && (commit->date < revs->max_age))
- return NULL;
+ continue;
+ if (revs->unpacked && has_sha1_pack(commit->object.sha1))
+ continue;
if (revs->no_merges &&
commit->parents && commit->parents->next)
continue;
^ permalink raw reply related
* Re: [PATCH/RFC 2/2] Make path-limiting be incremental when possible.
From: Linus Torvalds @ 2006-03-31 6:05 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603301652531.27203@g5.osdl.org>
On Thu, 30 Mar 2006, Linus Torvalds wrote:
>
> This makes git-rev-list able to do path-limiting without having to parse
> all of history before it starts showing the results.
>
> This makes it things like "git log -- pathname" much more pleasant to use.
Sadly, it seems to react really badly to Junio's new --boundary logic for
some reason that I haven't quite figured out yet.
That reaction is independent of the actual pathname restriction, and seems
to be related to how the --boundary logic expected
pop_most_recent_commit() to work. In particular:
...
if (commit->object.flags & BOUNDARY) {
/* this is already uninteresting,
* so there is no point popping its
* parents into the list.
*/
that code is magic, and seems to depend on something subtle going on with
the list, and the incremental thing already popped the parent earlier and
screwed up whatever magic that the BOUNDARY code depends on.
Junio? I think you did some funky special case with BOUNDARY commits, and
I broke it for you, can you look at the patch and see if you can see it?
I'd really like to have the incremental path-limiter, because it really
makes a huge difference in the usability of "git log pathname".
Linus
^ permalink raw reply
* Re: Gitk strangeness..
From: Junio C Hamano @ 2006-03-31 1:50 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Alex Riesen, git, Linus Torvalds
In-Reply-To: <17452.28122.129442.49226@cargo.ozlabs.ibm.com>
Paul Mackerras <paulus@samba.org> writes:
> Alex Riesen writes:
>
>> The old gitk produced a denser graph, which wasn't perfect too, but
>> had a higher count of visible commit titles (and this is two-monitor
>> setup, too).
>
> I just pushed a new version which does better on this.
Thanks. Pulled, merged and pushed out..
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox