* Quickly searching for a note
From: Joshua Jensen @ 2012-09-21 14:41 UTC
To: git@vger.kernel.org
Background: To tie Perforce changelists to Git commits, I add a note to
a commit with the form "P4@123456". Later, I use the note to sync down
the closest Perforce changelist matching the Git commit.
I search for these notes by getting a list of revisions:
    git rev-list --max-count=1000 HEAD
I iterate over those revisions and run git show and grep on each:
    git show -s --format=%N%n%s --show-notes=p4notes COMMIT
For short runs, this isn't so bad. For longer runs of commits (I just
walked through approximately 100), it takes a long time. Running 'git
show' is costing me about 7/10 of a second, presumably because I am on
Windows.
Is there a faster way to do this?
Thanks.
Josh
* Re: Quickly searching for a note
From: Andreas Schwab @ 2012-09-21 15:10 UTC
To: Joshua Jensen; +Cc: git@vger.kernel.org
Joshua Jensen <jjensen@workspacewhiz.com> writes:
> Background: To tie Perforce changelists to Git commits, I add a note to a
> commit with the form "P4@123456". Later, I use the note to sync down the
> closest Perforce changelist matching the Git commit.
>
> I search for these notes by getting a list of revisions:
>
> git rev-list --max-count=1000
>
> I iterate those revisions and run git show and grep on each:
>
> git show -s --format=%N%n%s --show-notes=p4notes COMMIT
How about "git grep P4@123456 notes/p4notes"?
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-21 17:21 UTC
To: Joshua Jensen; +Cc: git@vger.kernel.org
Joshua Jensen <jjensen@workspacewhiz.com> writes:
> Background: To tie Perforce changelists to Git commits, I add a note
> to a commit with the form "P4@123456". Later, I use the note to sync
> down the closest Perforce changelist matching the Git commit.
>
> I search for these notes by getting a list of revisions:
>
> git rev-list --max-count=1000
>
> I iterate those revisions and run git show and grep on each:
>
> git show -s --format=%N%n%s --show-notes=p4notes COMMIT
>
> For short runs, this isn't so bad. For longer runs of commits (I just
> walked through approximately 100), it takes a long time. Running 'git
> show' is costing me about 7/10 of a second, presumably because I am on
> Windows.
Is there any particular reason you do that as two separate steps?
It would feel more natural, at least to me, to do something along
the lines of
    git log --show-notes=p4notes -1000
* Re: Quickly searching for a note
From: Joshua Jensen @ 2012-09-21 18:29 UTC
To: Junio C Hamano; +Cc: git@vger.kernel.org
----- Original Message -----
From: Junio C Hamano
Date: 9/21/2012 11:21 AM
> Joshua Jensen <jjensen@workspacewhiz.com> writes:
>
>> Background: To tie Perforce changelists to Git commits, I add a note
>> to a commit with the form "P4@123456". Later, I use the note to sync
>> down the closest Perforce changelist matching the Git commit.
>>
>> I search for these notes by getting a list of revisions:
>>
>> git rev-list --max-count=1000
>>
>> I iterate those revisions and run git show and grep on each:
>>
>> git show -s --format=%N%n%s --show-notes=p4notes COMMIT
>>
>> For short runs, this isn't so bad. For longer runs of commits (I just
>> walked through approximately 100), it takes a long time. Running 'git
>> show' is costing me about 7/10 of a second, presumably because I am on
>> Windows.
> Is there any particular reason you do that as two separate steps?
> It would feel more natural, at least to me, to do something along
> the lines of
>
> git log --show-notes=p4notes -1000
>
>
Thanks for the reply.
I did not make clear above that I want to stop looking when I find the
first commit that has the note.
In the case of 'git log --show-notes=p4notes -1000', Git will process
and hand me the log output for 1,000 commits. It is rare I need to walk
that deep. We saw 300 commits deep once on a long-lived branch that
hadn't been merged in yet, but I'd be surprised to see 1,000.
Still, it shows an arbitrary choice. Really, I want to say to Git: Walk
up the history as far as you need to go from HEAD and return to me the
first commit containing the text "P4@".
Any other thoughts?
-Josh
* Re: Quickly searching for a note
From: Joshua Jensen @ 2012-09-21 18:34 UTC
To: Andreas Schwab; +Cc: git@vger.kernel.org
----- Original Message -----
From: Andreas Schwab
Date: 9/21/2012 9:10 AM
> Joshua Jensen <jjensen@workspacewhiz.com> writes:
>
>> Background: To tie Perforce changelists to Git commits, I add a note to a
>> commit with the form "P4@123456". Later, I use the note to sync down the
>> closest Perforce changelist matching the Git commit.
>>
>> I search for these notes by getting a list of revisions:
>>
>> git rev-list --max-count=1000
>>
>> I iterate those revisions and run git show and grep on each:
>>
>> git show -s --format=%N%n%s --show-notes=p4notes COMMIT
> How about "git grep P4@123456 notes/p4notes"?
>
> Andreas.
>
Thanks for the reply.
I should have labeled the format above as "P4@#######". The numeric
part will change. The "P4@" will not.
So, I run "git grep P4@ notes/p4notes". I get a bunch of responses. I
need the closest commit to HEAD that contains the P4@ text.
Any ideas?
-Josh
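One way to combine the two ideas above, offered as a sketch rather than something proposed in the thread: collect the names of the annotated commits with git grep, then walk history from HEAD and stop at the first commit found in that set. The function name closest_noted_commit and the temporary file are illustrative assumptions; the sketch also assumes mktemp and awk are available and that the notes live under refs/notes/p4notes.

```shell
#!/bin/sh
# Sketch only: closest_noted_commit is a hypothetical helper name.
# Usage: closest_noted_commit <notes-ref> <pattern> [<start-commit>]
closest_noted_commit() {
    ref=$1               # notes ref, e.g. refs/notes/p4notes
    pattern=$2           # text to look for in the notes, e.g. P4@
    start=${3:-HEAD}     # where the history walk begins
    noted=$(mktemp) || return 1

    # "git grep -l" on a tree-ish prints "<ref>:<path>". Each path in a
    # notes tree is the annotated object's name, possibly split into
    # fan-out directories, so strip the prefix and the slashes.
    git grep -l -e "$pattern" "$ref" -- |
        sed -e "s|^$ref:||" -e 's|/||g' >"$noted"

    # awk loads the set from the file, then reads rev-list output on
    # stdin and exits at the first commit that is in the set.
    git rev-list "$start" |
        awk 'NR == FNR { noted[$0] = 1; next }
             ($0 in noted) { print; found = 1; exit }
             END { exit !found }' "$noted" -
    status=$?
    rm -f "$noted"
    return $status
}
```

Called as `closest_noted_commit refs/notes/p4notes 'P4@'`, it prints one commit name, or nothing with a non-zero status when no ancestor of the starting point carries a matching note. Because awk exits at the first hit, rev-list does not need to walk the rest of the history.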
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-21 20:04 UTC
To: Joshua Jensen; +Cc: git@vger.kernel.org
Joshua Jensen <jjensen@workspacewhiz.com> writes:
>> Is there any particular reason you do that as two separate steps?
>> It would feel more natural, at least to me, to do something along
>> the lines of
>>
>> git log --show-notes=p4notes -1000
>>
>>
> Thanks for the reply.
>
> I did not make clear above that I want to stop looking when I find the
> first commit that has the note.
>
> In the case of 'git log --show-notes=p4notes -1000', Git will process
> and hand me the log output for 1,000 commits. It is rare I need to
> walk that deep.
I simply matched it with your initial "rev-list --max-count=1000".
The "log" command pages and you can hit 'q' once you saw enough (in
other words, you do not have to say -1000).
* Re: Quickly searching for a note
From: Joshua Jensen @ 2012-09-21 20:25 UTC
To: Junio C Hamano; +Cc: git@vger.kernel.org
----- Original Message -----
From: Junio C Hamano
Date: 9/21/2012 2:04 PM
> Joshua Jensen <jjensen@workspacewhiz.com> writes:
>
>>> Is there any particular reason you do that as two separate steps?
>>> It would feel more natural, at least to me, to do something along
>>> the lines of
>>>
>>> git log --show-notes=p4notes -1000
>>>
>>>
>> Thanks for the reply.
>>
>> I did not make clear above that I want to stop looking when I find the
>> first commit that has the note.
>>
>> In the case of 'git log --show-notes=p4notes -1000', Git will process
>> and hand me the log output for 1,000 commits. It is rare I need to
>> walk that deep.
> I simply matched it with your initial "rev-list --max-count=1000".
> The "log" command pages and you can hit 'q' once you saw enough (in
> other words, you do not have to say -1000).
>
This is run via script without user intervention. Presumably, Git will
do 1,000 commits of work when it may only need to do 1 or 5 or 10?
-Josh
* Re: Quickly searching for a note
From: Johannes Sixt @ 2012-09-21 20:50 UTC
To: Joshua Jensen; +Cc: Junio C Hamano, git@vger.kernel.org
Am 21.09.2012 22:25, schrieb Joshua Jensen:
> ----- Original Message -----
> From: Junio C Hamano
> Date: 9/21/2012 2:04 PM
>> Joshua Jensen <jjensen@workspacewhiz.com> writes:
>>
>>>> Is there any particular reason you do that as two separate steps?
>>>> It would feel more natural, at least to me, to do something along
>>>> the lines of
>>>>
>>>> git log --show-notes=p4notes -1000
>>>>
>>>>
>>> Thanks for the reply.
>>>
>>> I did not make clear above that I want to stop looking when I find the
>>> first commit that has the note.
>>>
>>> In the case of 'git log --show-notes=p4notes -1000', Git will process
>>> and hand me the log output for 1,000 commits. It is rare I need to
>>> walk that deep.
>> I simply matched it with your initial "rev-list --max-count=1000".
>> The "log" command pages and you can hit 'q' once you saw enough (in
>> other words, you do not have to say -1000).
>>
> This is run via script without user intervention. Presumably, Git will
> do 1,000 commits of work when it may only need to do 1 or 5 or 10?
The trick is to pipe 'git log' output into another process that reads no
more than it needs and exits. Then 'git log' dies from SIGPIPE before it
has processed all 1000 commits because its downstream reader has gone away.
For example:
    git log --show-notes=p4notes -1000 |
    sed -n -e '/^commit /h' -e '/P4@/{H;g;p;q}'
(The pipeline keeps track of the most recent 'commit' line, and when it
finds the 'P4@' it prints the most recent 'commit' line followed by the
'P4@' line.)
-- Hannes
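A variation on the pipeline above, sketched under the assumption that each note is a single line beginning with "P4@". Asking git log for a custom format keeps commit messages out of the stream, so a commit whose message merely mentions "P4@" cannot produce a false match. The function name first_p4_commit is hypothetical. %H and %N are the format placeholders for the commit name and the note text, and --notes=p4notes is used in place of --show-notes=p4notes, which the git log documentation lists as deprecated.

```shell
#!/bin/sh
# Sketch only: first_p4_commit is a hypothetical helper name.
# Prints "<commit> <note>" for the first commit, starting at $1
# (default HEAD), whose p4notes note begins with "P4@".
first_p4_commit() {
    # One line per commit: the commit name, a space, then the note
    # (empty when the commit has no note in refs/notes/p4notes).
    git log --notes=p4notes --format='%H %N' "${1:-HEAD}" |
        # Exit at the first hit so that git log can stop early.
        awk '$2 ~ /^P4@/ { print $1, $2; exit }'
}
```

As in the sed version, git log stops as soon as its reader goes away, so no commit limit such as -1000 is needed.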
* Re: Quickly searching for a note
From: Joshua Jensen @ 2012-09-21 21:10 UTC
To: Johannes Sixt; +Cc: Junio C Hamano, git@vger.kernel.org
----- Original Message -----
From: Johannes Sixt
Date: 9/21/2012 2:50 PM
> The trick is to pipe 'git log' output into another process that reads no
> more than it needs and exits. Then 'git log' dies from SIGPIPE before it
> processed all 1000 commits because its down-stream has gone away.
>
> For example:
>
> git log --show-notes=p4notes -1000 |
> sed -n -e '/^commit /h' -e '/P4@/{H;g;p;q}'
>
> (The pipeline keeps track of the most recent 'commit' line, and when it
> finds the 'P4@' it prints the most recent 'commit' line followed by the
> 'P4@' line.)
>
Got it. I'll try that out now.
-Josh
* Re: Quickly searching for a note
From: Jeff King @ 2012-09-21 23:37 UTC
To: Joshua Jensen; +Cc: Johannes Sixt, Junio C Hamano, git@vger.kernel.org
On Fri, Sep 21, 2012 at 03:10:40PM -0600, Joshua Jensen wrote:
> ----- Original Message -----
> From: Johannes Sixt
> Date: 9/21/2012 2:50 PM
> >The trick is to pipe 'git log' output into another process that reads no
> >more than it needs and exits. Then 'git log' dies from SIGPIPE before it
> >processed all 1000 commits because its down-stream has gone away.
> >
> >For example:
> >
> > git log --show-notes=p4notes -1000 |
> > sed -n -e '/^commit /h' -e '/P4@/{H;g;p;q}'
> >
> >(The pipeline keeps track of the most recent 'commit' line, and when it
> >finds the 'P4@' it prints the most recent 'commit' line followed by the
> >'P4@' line.)
> >
> Got it. I'll try that out now.
I think people have provided sane techniques for doing this with a
pipeline. But there is really no reason not to have --grep-notes, just
as we have --grep. It's simply that nobody has implemented it yet (and
nobody is working on it as far as I know). It would actually be a fairly
simple feature to add if somebody wanted to get their feet wet with git.
-Peff
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-21 23:51 UTC
To: Jeff King; +Cc: Joshua Jensen, Johannes Sixt, git@vger.kernel.org
Jeff King <peff@peff.net> writes:
> I think people have provided sane techniques for doing this with a
> pipeline. But there is really no reason not to have --grep-notes, just
> as we have --grep. It's simply that nobody has implemented it yet (and
> nobody is working on it as far as I know). It would actually be a fairly
> simple feature to add if somebody wanted to get their feet wet with git.
I agree that the implementation will be simple once you figure out
what the sensible semantics and external interfaces are. The latter
is not that simple and certainly not something for newbies to solve
on their own. That is why I didn't mention it.
But now that you brought it up, here are a few thinking-points as a
starter:
 - Wouldn't it be more intuitive to just let the normal "--grep"
   also hit what "--show-notes" would add to the output? Does it
   really add value to the end user experience to add a separate
   "--grep-notes=P4[0-9]*" option, even though it would give you
   more flexibility?
   Not having thought things through thoroughly, I can still answer
   this question both ways myself, and I am not sure what the right
   user experience should look like.
 - Do we want to be limited to one notes tree? Would it make sense
   to show notes from the usual commit notes but use different notes
   tree for the sole purpose of restricting visibility? If we
   wanted to allow that for people who want flexibility, but still
   want to use only one and the same by default, what should the
   command line options look like?
 - Would it be common to say "I want commits with _any_ notes from
   this notes tree"? Having to say "--grep-notes=." for such a
   simple task, if it is common, feels a bit clunky.
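For readers on a newer Git: the git log documentation for --grep states that when --notes is in effect, the text of the notes is matched as if it were part of the log message, which corresponds to the first option discussed above. A minimal sketch under that assumption follows. The function name first_commit_matching_note is hypothetical, and because commit messages are searched as well, a message containing the same text would also match.

```shell
#!/bin/sh
# Sketch only: first_commit_matching_note is a hypothetical helper name.
# Assumes a Git whose "git log --grep" also searches displayed notes.
# Usage: first_commit_matching_note <notes-ref> <pattern> [<start-commit>]
first_commit_matching_note() {
    notes_ref=$1
    pattern=$2
    start=${3:-HEAD}
    # -1 stops after the first commit that passes the --grep filter;
    # --format=%H prints only its object name.
    git log --notes="$notes_ref" --grep="$pattern" -1 --format=%H "$start"
}
```

For the use case in this thread the call would be `first_commit_matching_note p4notes 'P4@'`, which walks only as far as the first match.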
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-22 4:51 UTC
To: Joshua Jensen; +Cc: git@vger.kernel.org, Johan Herland, Jeff King
Joshua Jensen <jjensen@workspacewhiz.com> writes:
> Background: To tie Perforce changelists to Git commits, I add a note
> to a commit with the form "P4@123456". Later, I use the note to sync
> down the closest Perforce changelist matching the Git commit.
I noticed that nobody brought this up, but probably it should not be
left unsaid, so...
For annotating commits with additional pieces of data, notes is a
reasonable mechanism, but the user should be aware that it is
heavily geared towards one-way mapping. When you have a commit and
want to know something about it, it will give you the associated
information reasonably efficiently.
But it is not a good mechanism for retrieval if the primary way you
use the stored information is to go from the associated information
to find the commit that has that note attached to it. Your usage
pattern that triggered this thread may fall into that category.
It may still be a reasonable mechanism to use notes to exchange the
information across repositories, but if your application relies
heavily on mapping the information in the opposite way, you may want
to maintain a local cache of the reverse mapping in a more efficient
fashion. For example, every time your notes tree is updated, you
can loop over "git notes list" output and register the contents of
the blob object that annotates each commit as the key and the commit
object name as the value to a repository-local sqlite database or
something (and depending on the nature of the frequent query, have an
efficient index on the key).
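A minimal sketch of such a local reverse-mapping cache, using a sorted tab-separated text file in place of the sqlite database mentioned above (a deliberate simplification, not what the message proposes). The function names and the index file location are hypothetical, and the first line of each note is assumed to be the key.

```shell
#!/bin/sh
# Sketch only: build_note_index and lookup_note_index are hypothetical names.

# Usage: build_note_index <notes-ref> <index-file>
# Writes one "<note text><TAB><annotated object>" line per note.
build_note_index() {
    ref=$1
    index=$2
    # "git notes list" prints "<note object> <annotated object>" per line.
    git notes --ref "$ref" list |
    while read -r note object; do
        # Use the first line of the note blob as the lookup key.
        key=$(git cat-file blob "$note" | sed -n 1p)
        printf '%s\t%s\n' "$key" "$object"
    done | sort >"$index"
}

# Usage: lookup_note_index <index-file> <key>
# Prints the annotated object name; fails if the key is not present.
lookup_note_index() {
    awk -F '\t' -v key="$2" '
        $1 == key { print $2; found = 1; exit }
        END { exit !found }' "$1"
}
```

Rebuilding the file whenever the notes ref changes keeps the lookup independent of history depth. The rebuild itself starts one git cat-file per note, so this is only a starting point for large notes trees.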
Having mentioned an external database as the most generic approach,
I suspect that one important way to use notes is to associate
commits with some other (presumably unique) ID to interface with the
external world. For example, I maintain "amlog" notes to record the
original message-ID for each commit that resulted from "git am".
The primary use of this is to find the message-ID for a commit that
was made some time ago and later found to be questionable, so that I
can find the relevant discussion thread, but the information could
be used to see if a given message I see in the mail archive has been
already applied, and this needs a fast reverse mapping.
It actually is fairly trivial to maintain both forward and reverse
mapping for this kind of use case. For example, your gateway that
syncs from Perforce may currently be doing something like this at
the end of it:
    git notes --ref p4notes add -m "P4@$p4_change_id" HEAD
to give a quick mapping from the commit object name of the resulting
commit (in HEAD) to "P4@123456".
This is stored as a mapping from the object name of HEAD to the
object name of a blob whose content is "P4@123456". You can see it
in action with
    $ git notes --ref p4notes list HEAD
that gives the blob object name that stores the note for the HEAD.
Now, there is _no_ reason why you cannot attach notes to these blob
objects. For example, your "Perforce to Git" gateway can end with
something like this instead:
    # Resolve HEAD once so that both notes refer to the same commit.
    HEAD=$(git rev-parse --verify HEAD)
    # Forward mapping: commit -> "P4@<change>".
    git notes --ref p4notes add -m "P4@$p4_change_id" "$HEAD"
    # Reverse mapping: annotate the note blob itself with the commit name.
    noteblob=$(git notes --ref p4notes list "$HEAD")
    git notes --ref p4notes add -m "$HEAD" "$noteblob"
Then when you want to map P4@123456 to a Git commit, you could run
    $ noteblob=$(echo P4@123456 | git hash-object --stdin)
    $ git notes --ref p4notes show $noteblob
to see the commit object name that is associated with that note.
Of course, the same notes tree holds the forward mapping as before,
so
    $ git notes --ref p4notes show HEAD
will give you the "P4@123456".
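The two-way mapping described above can be wrapped in a pair of helpers. This is a sketch only: the function names record_p4_change and commit_for_p4_change are made up for illustration. It relies on git notes add -m storing the text with a trailing newline, so that hashing the same text followed by a newline reproduces the note blob's name. If two commits ever carried the same change number, the second reverse note would be refused unless -f or some other policy were used.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Usage: record_p4_change <change-number> [<commit>]
# Stores commit -> "P4@<change>" and, on the note blob, the reverse.
record_p4_change() {
    change=$1
    commit=$(git rev-parse --verify "${2:-HEAD}^{commit}") || return 1
    # Forward mapping: annotate the commit with the changelist text.
    git notes --ref p4notes add -m "P4@$change" "$commit" || return 1
    # Reverse mapping: annotate the note blob with the commit name.
    noteblob=$(git notes --ref p4notes list "$commit") || return 1
    git notes --ref p4notes add -m "$commit" "$noteblob"
}

# Usage: commit_for_p4_change <change-number>
# Prints the commit recorded for that changelist, if any.
commit_for_p4_change() {
    # Recompute the blob name of "P4@<change>\n" without writing it.
    noteblob=$(printf 'P4@%s\n' "$1" | git hash-object --stdin) || return 1
    git notes --ref p4notes show "$noteblob"
}
```

The lookup never walks history: it computes one hash and reads one note, so its cost does not depend on how far back the annotated commit is.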
We may want to support such a reverse mapping natively so that
"notes rewrite" logic maintains the mapping in both directions.
I've CC'ed people who may want to be involved in further design
work.
* Re: Quickly searching for a note
From: Michael J Gruber @ 2012-09-22 16:10 UTC
To: Junio C Hamano
Cc: Jeff King, Joshua Jensen, Johannes Sixt, git@vger.kernel.org
Junio C Hamano venit, vidit, dixit 22.09.2012 01:51:
> Jeff King <peff@peff.net> writes:
>
>> I think people have provided sane techniques for doing this with a
>> pipeline. But there is really no reason not to have --grep-notes, just
>> as we have --grep. It's simply that nobody has implemented it yet (and
>> nobody is working on it as far as I know). It would actually be a fairly
>> simple feature to add if somebody wanted to get their feet wet with git.
>
> I agree that the implementation will be simple once you figure out
> what the sensible semantics and external interfaces are. The latter
> is not that simple and certainly not something for newbies to solve
> on their own. That is why I didn't mention it.
>
> But now you brought it up, here are a few thinking-points as a
> starter:
>
> - Wouldn't it be more intuitive to just let the normal "--grep" to
> also hit what "--show-notes" would add to the output? Does it
> really add value to the end user experience to add a separate
> "--grep-notes=P4[0-9]*" option, even though it would give you
> more flexibility?
>
> Not having thought things through thoroughly, I still answer this
> question both ways myself and what the right user experience
> should look like.
>
> - Do we want to be limited to one notes tree? Would it make sense
> to show notes from the usual commit notes but use different notes
> tree for the sole purpose of restricting visibility? If we
> wanted to allow that for people who want flexibility, but still
> want to use only one and the same by default, what should the
> command line options look like?
>
> - Would it be common to say "I want commits with _any_ notes from
> this notes tree"? Having to say "--grep-notes=." for such a
> simple task, if it is common, feels a bit clunky.
>
On my mental scratch pad (yeah, that's where the bald spots are) I have
the following more general idea to enhance the revision parser:
--limit-run=<script>::
--run=<script>:::
These options run the script `<script>` on each revision that is walked.
The script is run in an environment which has the variables
`GIT_<SPECIFIER>` exported, where `<SPECIFIER>` is any of the specifiers
for the `--format` option in the long format (the same as for 'git
for-each-ref').
In the case of `--limit-run`, the return code of `<script>` decides
whether the commit is processed further (i.e. shown using the format in
effect) or ignored.
So far the idea. We could also squash both the limiting and the
formatting option into one run option. A typical use case could be
    git log --limit-run='sh -c "test x$GIT_NOTE = xp@myid"'
or the like. We could also feed <script> to a shell directly. We could
also make the limit option stop traversal (optionally). Just a scratch
pad, really ;)
Michael
P.S.: option name bike shedders: it's named after bisect's "run"; we
could name it after rebase -i's "exec" instead...
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-22 20:23 UTC
To: Michael J Gruber
Cc: Jeff King, Joshua Jensen, Johannes Sixt, git@vger.kernel.org
Michael J Gruber <git@drmicha.warpmail.net> writes:
> On my mental scratch pad (yeah, that's where the bald spots are) I have
> the following more general idea to enhance the revision parser:
>
> --limit-run=<script>::
> --run=<script>:::
> These options run the script `<script>` on each revision that is walked.
> The script is run in an environment which has the variables
> `GIT_<SPECIFIER>` exported, where `<SPECIFIER>` is any of the specifiers
> for the `--format` option in the long format (the same as for 'git
> for-each-ref').
>
> In the case of `--limit-run`, the return code of `<script>` decides
> whether the commit is processed further (i.e. shown using the format in
> effect) or ignored.
You could argue that the above is not an impractical solution as
long as the user of --run, which spawns a new process every time we
need to check if a commit is worth showing in the log/rev-list
stream, knows what she is doing and promises not to complain that it
is no more performant than an external script that reads from
rev-list output and does the equivalent filtering.
I personally am not very enthused.
If we linked with an embeddable scripting language interpreter
(e.g. lua, tcl, guile, ...), it may be a more practical enhancement,
though.
* Re: Quickly searching for a note
From: Michael J Gruber @ 2012-09-23 15:07 UTC
To: Junio C Hamano
Cc: Jeff King, Joshua Jensen, Johannes Sixt, git@vger.kernel.org
Junio C Hamano venit, vidit, dixit 22.09.2012 22:23:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> On my mental scratch pad (yeah, that's where the bald spots are) I have
>> the following more general idea to enhance the revision parser:
>>
>> --limit-run=<script>::
>> --run=<script>:::
>> These options run the script `<script>` on each revision that is walked.
>> The script is run in an environment which has the variables
>> `GIT_<SPECIFIER>` exported, where `<SPECIFIER>` is any of the specifiers
>> for the `--format` option in the long format (the same as for 'git
>> for-each-ref').
>>
>> In the case of `--limit-run`, the return code of `<script>` decides
>> whether the commit is processed further (i.e. shown using the format in
>> effect) or ignored.
>
> You could argue that the above is not an impractical solution as
> long as the user of --run, which spawns a new process every time we
> need to check if a commit is worth showing in the log/rev-list
> stream, knows what she is doing and promises not to complain that it
> is no more performant than an external script that reads from
> rev-list output and does the equivalent filtering.
>
> I personally am not very enthused.
>
> If we linked with an embeddable scripting language interpreter
> (e.g. lua, tcl, guile, ...), it may be a more practical enhancement,
> though.
>
Yes, the idea is "extend, don't embed" the other way round, so to say. I
still think extending "git log" so that it can call a script with commit
info already in the environment gives a more convenient approach than
"embedding git rev-list" into your own script. It's not more performant,
of course.
I just see many more requests of the type "grep notes" coming, i.e.
limiting based on other commit info, or in a different way than already
possible. Just imagine you want to find out who's responsible for those
commits in git.git with subject lengths > 100 ;)
The point is also that when you pipe rev-list into your script you have
to do all the output formatting yourself, or call "git log -1"/"git
show" again to have git do the output formatting after your script
decided about the limiting.
Michael
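The external alternative discussed above, sketched for comparison: rev-list produces commit names, a small shell loop applies an arbitrary test to each one, and Git does the formatting afterwards in a single invocation. The names filter_commits and has_p4_note are hypothetical. The test command is started once per commit, which is exactly the per-commit process cost raised in this thread.

```shell
#!/bin/sh
# Sketch only: filter_commits and has_p4_note are hypothetical names.

# Reads commit names on stdin and prints those for which the given
# command (called with the commit name appended) succeeds.
filter_commits() {
    while read -r commit; do
        # Keep the predicate away from the commit list on stdin.
        if "$@" "$commit" </dev/null; then
            printf '%s\n' "$commit"
        fi
    done
}

# Example predicate: succeeds if the commit has a note in p4notes.
has_p4_note() {
    git notes --ref p4notes show "$1" >/dev/null 2>&1
}

# Possible use, leaving the output formatting to Git:
#   git rev-list HEAD | filter_commits has_p4_note |
#       xargs git show -s --format='%h %s'
```

If nothing matches, some xargs implementations still run git show once with no arguments, which shows HEAD; a script would need to guard against that.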
* Re: Quickly searching for a note
From: Jeff King @ 2012-09-25 0:38 UTC
To: Junio C Hamano
Cc: Michael J Gruber, Joshua Jensen, Johannes Sixt,
git@vger.kernel.org
On Sat, Sep 22, 2012 at 01:23:56PM -0700, Junio C Hamano wrote:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
> > On my mental scratch pad (yeah, that's where the bald spots are) I have
> > the following more general idea to enhance the revision parser:
> >
> > --limit-run=<script>::
> > --run=<script>:::
> > These options run the script `<script>` on each revision that is walked.
> > The script is run in an environment which has the variables
> > `GIT_<SPECIFIER>` exported, where `<SPECIFIER>` is any of the specifiers
> > for the `--format` option in the long format (the same as for 'git
> > for-each-ref').
> >
> > In the case of `--limit-run`, the return code of `<script>` decides
> > whether the commit is processed further (i.e. shown using the format in
> > effect) or ignored.
>
> You could argue that the above is not an impractical solution as
> long as the user of --run, which spawns a new process every time we
> need to check if a commit is worth showing in the log/rev-list
> stream, knows what she is doing and promises not to complain that it
> is no more performant than an external script that reads from
> rev-list output and does the equivalent filtering.
>
> I personally am not very enthused.
Nor me. I experimented long ago with a perl pipeline that would parse commit
messages and allow Turing-complete grepping. I recall it was noticeably
slow. I cannot imagine what forking for each commit would be like.
Actually, wait, I can imagine it. Git has ~33K commits. Doing 'sh -c
exit' takes on the order of .002s. That's a minute of processing to look
at each commit in "git log", assuming the filtering itself takes 0
seconds.
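The estimate above is simply the commit count multiplied by the per-process cost. As a quick check of the arithmetic (the function name is hypothetical; the figures are the ones quoted in the message):

```shell
#!/bin/sh
# Sketch only: estimate_fork_seconds is a hypothetical helper name.
# Usage: estimate_fork_seconds <number-of-commits> <seconds-per-fork>
estimate_fork_seconds() {
    awk -v commits="$1" -v cost="$2" \
        'BEGIN { printf "%.0f\n", commits * cost }'
}

estimate_fork_seconds 33000 0.002   # prints 66, i.e. about a minute
```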
> If we linked with an embeddable scripting language interpreter
> (e.g. lua, tcl, guile, ...), it may be a more practical enhancement,
> though.
Agreed. I just posted a patch series that gives you --pretty lua
support, though I haven't convinced myself it's all that exciting yet. I
think it would be nicer for grepping, where the conditionals read more
like regular code. Something like:
    git log --lua-filter='
      return
        author().name:match("Junio") and
        note("p4"):match("1234567")
    '
reads OK to me.
-Peff
* Re: Quickly searching for a note
From: Jeff King @ 2012-09-25 0:42 UTC
To: Michael J Gruber
Cc: Junio C Hamano, Joshua Jensen, Johannes Sixt, git@vger.kernel.org
On Sun, Sep 23, 2012 at 05:07:04PM +0200, Michael J Gruber wrote:
> > If we linked with an embeddable scripting language interpreter
> > (e.g. lua, tcl, guile, ...), it may be a more practical enhancement,
> > though.
> >
>
> Yes, the idea is "extend, don't embed" the other way round, so to say. I
> still think extending "git log" so that it can call a script with commit
> info already in the environment gives a more convenient approach than
> "embedding git rev-list" into your own script. It's not more performant,
> of course.
I think Junio is going the other way than you think. That is, you still
run rev-list, but rather than call a sub-program, you call a snippet of
an embeddable script. Which is the same idea as yours, but theoretically
way faster.
> I just see many more requests of the type "grep notes" coming, i.e.
> limiting based on other commit info, or in a different way than already
> possible. Just imagine you want to find out who's responsible for those
> commits in git.git with subject lengths > 100 ;)
Like this:
    git log --lua-filter='return subject():len() > 100'
? :)
-Peff
* Re: Quickly searching for a note
From: Michael J Gruber @ 2012-09-25 7:24 UTC
To: Jeff King
Cc: Junio C Hamano, Joshua Jensen, Johannes Sixt, git@vger.kernel.org
Jeff King venit, vidit, dixit 25.09.2012 02:42:
> On Sun, Sep 23, 2012 at 05:07:04PM +0200, Michael J Gruber wrote:
>
>>> If we linked with an embeddable scripting language interpreter
>>> (e.g. lua, tcl, guile, ...), it may be a more practical enhancement,
>>> though.
>>>
>>
>> Yes, the idea is "extend, don't embed" the other way round, so to say. I
>> still think extending "git log" so that it can call a script with commit
>> info already in the environment gives a more convenient approach than
>> "embedding git rev-list" into your own script. It's not more performant,
>> of course.
>
> I think Junio is going the other way than you think. That is, you still
> run rev-list, but rather than call a sub-program, you call a snippet of
> an embeddable script. Which is the same idea as yours, but theoretically
> way faster.
>
>> I just see many more requests of the type "grep notes" coming, i.e.
>> limiting based on other commit info, or in a different way than already
>> possible. Just imagine you want to find out who's responsible for those
>> commits in git.git with subject lengths > 100 ;)
>
> Like this:
>
> git log --lua-filter='return subject():len() > 100'
>
> ? :)
Like this! :)
Michael
* Re: Quickly searching for a note
From: Junio C Hamano @ 2012-09-25 16:19 UTC
To: Jeff King
Cc: Michael J Gruber, Joshua Jensen, Johannes Sixt,
git@vger.kernel.org
Jeff King <peff@peff.net> writes:
> Agreed. I just posted a patch series that gives you --pretty lua
> support, though I haven't convinced myself it's all that exciting yet. I
> think it would be nicer for grepping, where the conditionals read more
> like regular code. Something like:
>
> git log --lua-filter='
> return
> author().name:match("Junio") and
> note("p4"):match("1234567")
> '
>
> reads OK to me.
Yeah, except for "me and p4???" ;-)