git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] revision limiter in git-rev-list
@ 2006-06-06  8:36 Marco Costalba
  2006-06-06 10:29 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Marco Costalba @ 2006-06-06  8:36 UTC (permalink / raw)
  To: git; +Cc: junkio

We currently can run something like

git-rev-list --topo-order --parents HEAD -- foo1.c  foo2.c foo3.c

And have in output a list of revisions that modify paths foo1.c,
foo2.c and foo.3

For each revision we have also the corresponding pseudo-parents,
i.e. the proper parents chosen among the revisions in output list.

The idea is to extend this behaviour to also commit objects sha.

As example, given the following revisions history:

a
b-
| c
| d
e
f

We could add a new option --filtered so that

git-rev-list --topo-order --filtered HEAD -- a d e

Gives the following

a
b-
| d
e

Note that the merge point b has been added implicitly as in path limiter case.


This is a powerful and quite general option that can be use for a large
and interesting number of cases.

1) Get the branch a given <sha> belongs

git-rev-list --topo-order --parents --filtered
    HEAD -- <sha> branchList[0] branchList[1]... branchList[k-1]


where branchList[] is the vector of branches of lenght k

Searched branch is the one where sha appears as one of his parents.



2) Get nearest previous tag of a given <sha>


git-rev-list --topo-order --parents -n1 --filtered
    <sha> -- <sha> tagList[0] tagList[1]... tagList[n-1]


where tagList[] is the vector of tags of lenght n

In output we have only one revision (see option -n1) that is <sha>,
his parent(s) are the nearest tag(s).


3) Get tag/branch history with a sensible graph (using a GUI frontend)

git-rev-list --topo-order --parents --filtered HEAD -- tagList[0] ...
branchList[n-1]

git-rev-list --topo-order --parents --filtered HEAD -- branchList[0]
... branchList[n-1]


4) Filter/find revisions according to a given search criteria (using a
GUI frontend)

git-rev-list --topo-order --parents --filtered HEAD -- matchList[0]
... matchList[n-1]

where matchList[] is the vector of sha's matching a filter criteria and could
be provided by a GUI frontend that already normally have filter capabilities.

The plus here is to let frontend to draw a sensible graph of the
resulting revisions.

Probably there are other uses of this option that is very powerful
because essentially
it adds topological information to a given set of revisions.

Of course I really ignore any implementation difficult/feasibility issues ;-)


Marco

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] revision limiter in git-rev-list
  2006-06-06  8:36 [RFC] revision limiter in git-rev-list Marco Costalba
@ 2006-06-06 10:29 ` Junio C Hamano
  2006-06-06 11:14   ` Marco Costalba
  2006-06-06 11:46   ` Marco Costalba
  0 siblings, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2006-06-06 10:29 UTC (permalink / raw)
  To: Marco Costalba; +Cc: git, junkio

"Marco Costalba" <mcostalba@gmail.com> writes:

> Of course I really ignore any implementation difficult/feasibility issues ;-)

I do not think your problem is ignoring difficulty/feasibility.
The bigger problem is ignoring to describe the semantics of what
you are proposing.

For example:

> As example, given the following revisions history:
>
> a
> b-
> | c
> | d
> e
> f
>
> We could add a new option --filtered so that
>
> git-rev-list --topo-order --filtered HEAD -- a d e
>
> Gives the following
>
> a
> b-
> | d
> e

Why does it give that?  Where is the HEAD in the example, and
how does it affect the result?  I presume when you wrote the
unfiltered graph, you meant time flows from top to bottom.  In
other words, I would have written the original graph like this,
time flowing from left to right:

          c---d
         /
    a---b---e---f

Does your rev-list given above give different results depending
on HEAD being d or f?  If so how, and for which case is the
example output produced?  If not, what significance does the
HEAD argument have in the algorithm that generates the result
you are proposing (maybe your example had time flowing from
bottom to top, then the HEAD might be at a, and you did not have
to think about this question when you wrote the message)?  These
are rhetorical questions, so do not waste time thinking about
answers to them before reading to the end of this message.

> Note that the merge point b has been added implicitly as in
> path limiter case.

I do not think path limiter case adds anything.  A merge is
shown if it touches the path in an nontrivial way, but otherwise
it isn't.  Also b is not a merge unless time is flowing from
bottom to top in your picture -- it is a fork point.

While I really do not think this belongs to rev-list, I suspect
what you want is a command that takes a set of commits you are
interested in and gives you an abbreviated topology across them.
I think it might be a good thing to have in our toolset (didn't
I say that already?).

So your example would become:

        Given this graph (and there may be other nodes before a or after
        d or f):

                  c---d
                 /
            a---b---e---f

	the user is interested in A, D, and E.  Show an
	abbreviated topology containing them.

which would give you

                  D
                 /
            A---B---E

BTW, interestingly enough if the time flowed from right to left,
the answer is the same here.

The command line would not involve HEAD unless it is also one of
the commits you are interested in.  So it would be something
like:

	git graph-reduce A D E

which would give you:

	E B
        D B
        B A

or even (since you are interested in reachability):

	E B A
        D B A
        B A

Unfortunately, your description is a bit too fuzzy to me, so I
am making guesses as to what you really want.  For example,
although you said "b is included because it is a merge", I
strongly suspect you have cases where you would want to and not
want to include a fork point or a merge point in the result,
depending on the commits you are interested in.  If you are
reducing this graph for A, H, and J:

                f---g
               /     \
              c---d---e---h
             /    
	a---b---i---j

I think you would want to see this as the result of graph
reduction:

              H
             /    
	A---B---J

instead of:

              C---E---H
             /   
        A---B---J

That is, although e is a merge and c is a fork point, they do
not bring anything interesting in the bird's eye view picture,
and are filtered out.  However, b is a point where lines of
development leading to H and J (two of the commits the user is
interested in) forks, and it is interesting.

Is this kind of graph reduction what you are after?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] revision limiter in git-rev-list
  2006-06-06 10:29 ` Junio C Hamano
@ 2006-06-06 11:14   ` Marco Costalba
  2006-06-06 11:46   ` Marco Costalba
  1 sibling, 0 replies; 5+ messages in thread
From: Marco Costalba @ 2006-06-06 11:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 6/6/06, Junio C Hamano <junkio@cox.net> wrote:
> "Marco Costalba" <mcostalba@gmail.com> writes:
>
> > Of course I really ignore any implementation difficult/feasibility issues ;-)
>
> I do not think your problem is ignoring difficulty/feasibility.
> The bigger problem is ignoring to describe the semantics of what
> you are proposing.
>

Sadly I have to agree, sorry.

> For example:
>
> > As example, given the following revisions history:
> >
> > a
> > b-
> > | c
> > | d
> > e
> > f
> >
> > We could add a new option --filtered so that
> >
> > git-rev-list --topo-order --filtered HEAD -- a d e
> >
> > Gives the following
> >
> > a
> > b-
> > | d
> > e
>
> Why does it give that?  Where is the HEAD in the example,

HEAD is "a", just to try to be more clear that's the graph you would see running
gitk HEAD: HEAD is at top ("a") initial revision is at bottom ("e").

>
> > Note that the merge point b has been added implicitly as in
> > path limiter case.
>
> I do not think path limiter case adds anything.  A merge is
> shown if it touches the path in an nontrivial way, but otherwise
> it isn't.

Yes.

>Also b is not a merge unless time is flowing from
> bottom to top in your picture -- it is a fork point.
>

I meant a graph as shown by gitk HEAD, so I really meant
"b" is a merge.


> While I really do not think this belongs to rev-list, I suspect
> what you want is a command that takes a set of commits you are
> interested in and gives you an abbreviated topology across them.
> I think it might be a good thing to have in our toolset (didn't
> I say that already?).
>
> So your example would become:
>
>         Given this graph (and there may be other nodes before a or after
>         d or f):
>
>                   c---d
>                  /
>             a---b---e---f
>
>         the user is interested in A, D, and E.  Show an
>         abbreviated topology containing them.
>
> which would give you
>
>                   D
>                  /
>             A---B---E
>

Yes.

>
> Unfortunately, your description is a bit too fuzzy to me, so I
> am making guesses as to what you really want.  For example,
> although you said "b is included because it is a merge", I
> strongly suspect you have cases where you would want to and not
> want to include a fork point or a merge point in the result,
> depending on the commits you are interested in.  If you are
> reducing this graph for A, H, and J:
>
>                 f---g
>                /     \
>               c---d---e---h
>              /
>         a---b---i---j
>
> I think you would want to see this as the result of graph
> reduction:
>
>               H
>              /
>         A---B---J
>
> instead of:
>
>               C---E---H
>              /
>         A---B---J
>
> That is, although e is a merge and c is a fork point, they do
> not bring anything interesting in the bird's eye view picture,
> and are filtered out.  However, b is a point where lines of
> development leading to H and J (two of the commits the user is
> interested in) forks, and it is interesting.
>
> Is this kind of graph reduction what you are after?
>
>

Practically speaking it's the kind of reduction for whom examples I
gave _do_ work.


   Marco

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] revision limiter in git-rev-list
  2006-06-06 10:29 ` Junio C Hamano
  2006-06-06 11:14   ` Marco Costalba
@ 2006-06-06 11:46   ` Marco Costalba
  2006-06-06 18:29     ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Marco Costalba @ 2006-06-06 11:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

>
> While I really do not think this belongs to rev-list, I suspect
> what you want is a command that takes a set of commits you are
> interested in and gives you an abbreviated topology across them.


I was thinking at an extension of git-rev-list because

1) Current git-rev-list options are quite orthogonal with rev limiter.
As example in previous given examples -n and --parents options are
used and I think more could be used with
interesting results.

Current git-rev-list options are a lot and are very powerful, it's a
pity if this new feature do not inherit them.

2) This feature could be seen as a generalization of path limiting.

>From today:
  git-rev-list  HEAD -- <path 1>  <path 2>  ...   <path n>

To possible:
git-rev-list  HEAD -- <obj 1>  <obj 2>  ...   <obj n>

Where obj == <path> || obj == <commit sha> ||  obj == <something else
I didn't think of>

Of course we need a (syntactic) way to disambiguate the arguments
after '--' but the results are very powerful and general, as example
we could mix commit objects _and_ paths in git-rev-list command line
(git-rev-list HEAD -- foo.c  tag1) and also be able to use the full
set of git-rev-list options before the '--'


     Marco

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] revision limiter in git-rev-list
  2006-06-06 11:46   ` Marco Costalba
@ 2006-06-06 18:29     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2006-06-06 18:29 UTC (permalink / raw)
  To: Marco Costalba; +Cc: git

"Marco Costalba" <mcostalba@gmail.com> writes:

> I was thinking at an extension of git-rev-list because
>
> 1) Current git-rev-list options are quite orthogonal with rev limiter.

Which means you have to think through what you want to happen
when the user uses various other existing features.

For example, let's say you want to make this work together with
path limiting.  I have not thought things through, especially
around merges and fork points, but I suspect that it could be
implemented by having a new postprocessing phase:

 - Get the list of commits we are interested in from the command
   line.

 - Teach revision.c::try_to_simplify_commit() not to drop the
   commit from the simplified ancestry chain when such a list
   exists.

   This would probably involve disabling the code to do merge
   simplification, which would make rev-list much less useful
   while working in this mode.

 - At the end of revision.c::limit_list(), before we compute
   boundary commits perhaps, look at the resulting list and drop
   single-parent commits from the ancestry chain that are not in
   the "interesting" set.

When viewed this way, this thing is not a "rev limiter", but
more like "I want you keep these even if they are usually
dropped otherwise" -- "rev keeper" perhaps.  That is why I said
I suspect what you want is a graph reduction and not necessarily
something that interacts with rev-list.

On the other hand, you may not mind (or you may even prefer) the
algorithm not to show commits in your "rev limiter" set if they
do not touch the specified paths.  You do not have to butcher
try_to_simplify_commit() but only need the postprocessing, if
that is the case.  I cannot tell which one you wanted.

By the way, I have become quite unsure about the extra inclusion
of merge and fork-point.  I said:

> I think you would want to see this as the result of graph
> reduction:
>
>               H
>              /
>         A---B---J
>
> That is, although e is a merge and c is a fork point, they do
> not bring anything interesting in the bird's eye view picture,
> and are filtered out.  However, b is a point where lines of
> development leading to H and J (two of the commits the user is
> interested in) forks, and it is interesting.

But I now think what is reasonable to give might be this instead:

          H
         /
	A---J

Often, the topic-branch workflow involves many merges between
topic and master.  I will depict just one criss-cross pair here,
but in practice there will be several, mostly merging "so far
this is good" part from topic to master:

          o---o---o---o---o---o---o---o "topic"      
         /     \         /
    o---o---o---o---o---o---o---o---o "master"

When I say I am interested in viewing a reduced topology around
"topic" and "master", I could end up getting:

          .---*-------*---* "topic"      
         /     \     /
     ---*-------*---*---* "master"

with many cross bridges.  I am not sure why I am interested in
these cross bridges more than I am interested in the commits
that do the real work to introduce changes to the topic and
master (i.e. single parent commits).  It is _interesting_ to
see, but I consider it is more of a curiosity than of a real
value in understanding what happened in the development
history.

On the other hand, if we say we are interested in seeing the
skeleton picture, I am not sure what is reasonable to draw when
your topic needed a merge from another topic because they are
somewhat related:


                o---o---o---o---o---o "another topic"
               /             \
              /   o---o---o---X---o---o---o---o "topic"
             /   /     \         /
        o---o---o---o---o---o---o---o---o---o "master"

Now, when you say "I am interested in seeing how topic and
master have evolved", what would you do with the merge X?
One possibility is to do this:

            .-------.
           /         \
          /   .---*---X---*---* "topic"      
         /   /     \     /
     ---Y---*-------*---*---* "master"

Between X and Y are all the interesting work that are omitted
from the bird's eye view, but nevertheless Y is a fake parent of
X.  Perhaps, this is useful, but I am not so convinced.

I am afraid this is going to be the last message from me on this
topic for now, since I'd like to concentrate on getting the
"master" branch in a good shape for 1.4.0 by this weekend and
would like to have everybody do the same, and I expect to be
mostly offline next week.  But I think this exchange is bringing
us closer to have a description of how the various bits in the
current rev-list are supposed to interact with this new feature,
making it a bit more concrete for us to visualize how it could
be made implementable.

There are other interactions with "existing bits" that need to
be thought through but I only gave an example and a half of path
limiter and merge/fork in this message.

Output filters (e.g. --max-count) would be easy, but I suspect
you would have even more interesting issues when you have
negative refs (^A ^B C).  I currently do not have any idea how
you would do --boundary for this either.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-06-06 18:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-06  8:36 [RFC] revision limiter in git-rev-list Marco Costalba
2006-06-06 10:29 ` Junio C Hamano
2006-06-06 11:14   ` Marco Costalba
2006-06-06 11:46   ` Marco Costalba
2006-06-06 18:29     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).