From: Junio C Hamano <gitster@pobox.com>
To: David Kastrup <dak@gnu.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 3/3] builtin/blame.c: large-scale rewrite
Date: Mon, 27 Jan 2014 12:51:33 -0800 [thread overview]
Message-ID: <xmqqtxcpno6i.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <87a9eh8b0d.fsf@fencepost.gnu.org> (David Kastrup's message of "Mon, 27 Jan 2014 20:45:06 +0100")
David Kastrup <dak@gnu.org> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> David Kastrup <dak@gnu.org> writes:
>>
>>> The previous implementation uses a sorted linear list of struct
>>> blame_entry in a struct scoreboard for organizing all partial or
>>> completed work. Every task that is done requires going through the
>>> whole list where most entries are not relevant to the task at hand.
>>>
>>> This commit reorganizes the data structures in order to have each
>>> remaining subtask work with its own sorted linear list it can work off
>>> front to back. Subtasks are organized into "struct origin" chains
>>> hanging off particular commits. Commits are organized into a priority
>>> queue, processing them in commit date order in order to keep most of
>>> the work affecting a particular blob collated even in the presence of
>>> an extensive merge history. In that manner, linear searches can be
>>> basically avoided anywhere. They still are done for identifying one
>>> of multiple analyzed files in a given commit, but the degenerate case
>>> of a single large file being assembled from a multitude of smaller
>>> files in the past is not likely to occur often enough to warrant
>>> special treatment.
>>> ---
>>
>> Sign-off?
>
> Not while this is not fit for merging because of #if 0 etc and missing
> functionality. This is just for review.
That is not what Signing off a patch means, though ;-)
>> Actually, I'd like to take my previous suggestion to add this as
>> blame2 (to replace blame in the future) back. That was based on my
>> fear/hope to see an implementation based on a completely different
>> approach, but the basic premise of having one blame_entry record per
>> the lines of the final image in the scoreboard, using diff between
>> parents to the child to find common lines for passing the blame up,
>> etc. have not changed at all and the change is all about organizing
>> the pieces of data in a *much* *better* way to avoid needlessly
>> finding what we already have computed.
>
> Yes. Basically, the call graph outside of blame.c itself should be
> pretty much the same.
> ...
> Ok. I'm also certain to have a few "space between function name and
> arguments" left and will grep for those before submitting the next
> version.
>
> Next version will at least include -M option, possibly leaving -C for
> later.
OK. As we do not want to break the build in the middle of the
series, but we still want to keep the individual steps reviewable
incrementally, I would think the best structure would be have the
series that consists of multiple steps "the basic infrastructure is
there, but no rename following, and neither -M or -C works", "now
renames are followed", "now -M works", etc., with tests that are not
yet working (in the beginning, most of "git blame" test may no
longer work until the series is finished) marked with
s/test_expect_success/test_expect_failure/
and turn expect_failure into expect_success as the series advances.
That way, we may get a better picture of what each step achieves. I
dunno.
next prev parent reply other threads:[~2014-01-27 20:51 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-25 12:44 How to substructure rewrites? David Kastrup
2014-01-25 18:23 ` [PATCH 0/3] "Teaser" patch for rewriting blame for efficiency David Kastrup
2014-01-25 18:23 ` [PATCH 1/3] builtin/blame.c: struct blame_entry does not need a prev link David Kastrup
2014-01-25 18:23 ` [PATCH 2/3] Eliminate same_suspect function in builtin/blame.c David Kastrup
2014-01-25 18:23 ` [PATCH 3/3] builtin/blame.c: large-scale rewrite David Kastrup
2014-01-27 16:54 ` Junio C Hamano
2014-01-27 19:45 ` David Kastrup
2014-01-27 20:51 ` Junio C Hamano [this message]
2014-01-27 21:21 ` David Kastrup
2014-01-27 15:58 ` How to substructure rewrites? Junio C Hamano
2014-01-27 16:27 ` David Kastrup
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqtxcpno6i.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox.com \
--cc=dak@gnu.org \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.