From: Kirill Smelkov <kirr@navytux.spb.ru>
To: Junio C Hamano <gitster@pobox.com>
Cc: kirr@mns.spb.ru, git@vger.kernel.org
Subject: Re: [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion
Date: Tue, 25 Mar 2014 13:23:20 +0400 [thread overview]
Message-ID: <20140325092320.GC3777@mini.zxlink> (raw)
In-Reply-To: <xmqq61n3p913.fsf@gitster.dls.corp.google.com>
On Mon, Mar 24, 2014 at 02:43:36PM -0700, Junio C Hamano wrote:
> Kirill Smelkov <kirr@mns.spb.ru> writes:
>
> > instead of allocating it all the time for every subtree in
> > __diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all
> > callee just use it in stacking style, without memory allocations.
> >
> > This should be faster, and for me this change gives the following
> > slight speedups for
> >
> > git log --raw --no-abbrev --no-renames --format='%H'
> >
> > navy.git linux.git v3.10..v3.11
> >
> > before 0.618s 1.903s
> > after 0.611s 1.889s
> > speedup 1.1% 0.7%
> >
> > Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
> > ---
> >
> > Changes since v1:
> >
> > - don't need to touch diff.h, as the function we are changing became static.
> >
> > tree-diff.c | 36 ++++++++++++++++++------------------
> > 1 file changed, 18 insertions(+), 18 deletions(-)
> >
> > diff --git a/tree-diff.c b/tree-diff.c
> > index aea0297..c76821d 100644
> > --- a/tree-diff.c
> > +++ b/tree-diff.c
> > @@ -115,7 +115,7 @@ static void show_path(struct strbuf *base, struct diff_options *opt,
> > if (recurse) {
> > strbuf_addch(base, '/');
> > __diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
> > - t2 ? t2->entry.sha1 : NULL, base->buf, opt);
> > + t2 ? t2->entry.sha1 : NULL, base, opt);
> > }
> >
> > strbuf_setlen(base, old_baselen);
>
> I was scratching my head for a while, after seeing that there does
> not seem to be any *new* code added by this patch in order to
> store-away the original length and restore the singleton base buffer
> to the original length after using addch/addstr to extend it.
>
> But I see that the code has already been prepared to do this
> conversion. I wonder why we didn't do this earlier ;-)
The conversion to reusing memory started in 48932677 "diff-tree: convert
base+baselen to writable strbuf" which allowed to avoid "quite a bit of
malloc() and memcpy()", but for this to work allocation at diff_tree()
entry had to be there.
In particular it had to be there, because diff_tree() accepted base as C
string, not strbuf, and since diff_tree() was calling itself
recursively - oops - new allocation on every subtree.
I've opened the door for avoiding allocations via splitting diff_tree
into high-level and low-level parts. The high-level part still accepts
`char *base`, but low-level function operates on strbuf and recurses
into low-level self.
The high-level diff_tree_sha1() still allocates memory for every
diff(tree1,tree2), but that is significantly lower compared to
allocating memory on every subtree...
The lesson here is: better use strbuf for api unless there is a reason
not to.
> Looks good. Thanks.
Thanks.
> > @@ -138,12 +138,10 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base,
> > }
> >
> > static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > - const char *base_str, struct diff_options *opt)
> > + struct strbuf *base, struct diff_options *opt)
> > {
> > struct tree_desc t1, t2;
> > void *t1tree, *t2tree;
> > - struct strbuf base;
> > - int baselen = strlen(base_str);
> >
> > t1tree = fill_tree_descriptor(&t1, old);
> > t2tree = fill_tree_descriptor(&t2, new);
> > @@ -151,17 +149,14 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > /* Enable recursion indefinitely */
> > opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
> >
> > - strbuf_init(&base, PATH_MAX);
> > - strbuf_add(&base, base_str, baselen);
> > -
> > for (;;) {
> > int cmp;
> >
> > if (diff_can_quit_early(opt))
> > break;
> > if (opt->pathspec.nr) {
> > - skip_uninteresting(&t1, &base, opt);
> > - skip_uninteresting(&t2, &base, opt);
> > + skip_uninteresting(&t1, base, opt);
> > + skip_uninteresting(&t2, base, opt);
> > }
> > if (!t1.size && !t2.size)
> > break;
> > @@ -173,7 +168,7 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
> > hashcmp(t1.entry.sha1, t2.entry.sha1) ||
> > (t1.entry.mode != t2.entry.mode))
> > - show_path(&base, opt, &t1, &t2);
> > + show_path(base, opt, &t1, &t2);
> >
> > update_tree_entry(&t1);
> > update_tree_entry(&t2);
> > @@ -181,18 +176,17 @@ static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> >
> > /* t1 < t2 */
> > else if (cmp < 0) {
> > - show_path(&base, opt, &t1, /*t2=*/NULL);
> > + show_path(base, opt, &t1, /*t2=*/NULL);
> > update_tree_entry(&t1);
> > }
> >
> > /* t1 > t2 */
> > else {
> > - show_path(&base, opt, /*t1=*/NULL, &t2);
> > + show_path(base, opt, /*t1=*/NULL, &t2);
> > update_tree_entry(&t2);
> > }
> > }
> >
> > - strbuf_release(&base);
> > free(t2tree);
> > free(t1tree);
> > return 0;
> > @@ -209,7 +203,7 @@ static inline int diff_might_be_rename(void)
> > !DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
> > }
> >
> > -static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> > +static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, struct strbuf *base, struct diff_options *opt)
> > {
> > struct diff_options diff_opts;
> > struct diff_queue_struct *q = &diff_queued_diff;
> > @@ -306,13 +300,19 @@ static void try_to_follow_renames(const unsigned char *old, const unsigned char
> > q->nr = 1;
> > }
> >
> > -int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> > +int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base_str, struct diff_options *opt)
> > {
> > + struct strbuf base;
> > int retval;
> >
> > - retval = __diff_tree_sha1(old, new, base, opt);
> > - if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
> > - try_to_follow_renames(old, new, base, opt);
> > + strbuf_init(&base, PATH_MAX);
> > + strbuf_addstr(&base, base_str);
> > +
> > + retval = __diff_tree_sha1(old, new, &base, opt);
> > + if (!*base_str && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
> > + try_to_follow_renames(old, new, &base, opt);
> > +
> > + strbuf_release(&base);
> >
> > return retval;
> > }
next prev parent reply other threads:[~2014-03-25 9:27 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-24 16:21 [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Kirill Smelkov
2014-02-24 16:21 ` [PATCH 01/19] combine-diff: move show_log_first logic/action out of paths scanning Kirill Smelkov
2014-02-24 16:21 ` [PATCH 02/19] combine-diff: move changed-paths scanning logic into its own function Kirill Smelkov
2014-02-24 16:21 ` [PATCH 03/19] tree-diff: no need to manually verify that there is no mode change for a path Kirill Smelkov
2014-02-24 16:21 ` [PATCH 04/19] tree-diff: no need to pass match to skip_uninteresting() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 05/19] tree-diff: show_tree() is not needed Kirill Smelkov
2014-02-24 16:21 ` [PATCH 06/19] tree-diff: consolidate code for emitting diffs and recursion in one place Kirill Smelkov
2014-02-24 16:21 ` [PATCH 07/19] tree-diff: don't assume compare_tree_entry() returns -1,0,1 Kirill Smelkov
2014-02-24 16:21 ` [PATCH 08/19] tree-diff: move all action-taking code out of compare_tree_entry() Kirill Smelkov
2014-02-24 16:21 ` [PATCH 09/19] tree-diff: rename compare_tree_entry -> tree_entry_pathcmp Kirill Smelkov
2014-02-24 16:21 ` [PATCH 10/19] tree-diff: show_path prototype is not needed anymore Kirill Smelkov
2014-02-24 16:21 ` [PATCH 11/19] tree-diff: simplify tree_entry_pathcmp Kirill Smelkov
2014-03-24 21:25 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH 12/19] tree-diff: remove special-case diff-emitting code for empty-tree cases Kirill Smelkov
2014-03-24 21:18 ` Junio C Hamano
2014-03-25 9:20 ` Kirill Smelkov
2014-03-25 17:45 ` Junio C Hamano
2014-03-26 18:32 ` Kirill Smelkov
2014-03-25 22:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 13/19] tree-diff: diff_tree() should now be static Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based Kirill Smelkov
2014-03-24 21:36 ` Junio C Hamano
2014-03-25 9:22 ` Kirill Smelkov
2014-03-25 17:46 ` Junio C Hamano
2014-03-26 19:52 ` Kirill Smelkov
2014-03-26 21:34 ` Junio C Hamano
2014-03-27 14:24 ` Kirill Smelkov
2014-03-27 18:48 ` Junio C Hamano
2014-03-27 19:43 ` Kirill Smelkov
2014-03-28 6:52 ` Johannes Sixt
2014-03-28 17:06 ` Junio C Hamano
2014-03-28 17:46 ` Johannes Sixt
2014-03-28 18:36 ` Junio C Hamano
2014-03-28 19:08 ` Johannes Sixt
2014-03-28 19:27 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 15/19] tree-diff: no need to call "full" diff_tree_sha1 from show_path() Kirill Smelkov
2014-03-27 14:21 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH v2 16/19] tree-diff: reuse base str(buf) memory on sub-tree recursion Kirill Smelkov
2014-03-24 21:43 ` Junio C Hamano
2014-03-25 9:23 ` Kirill Smelkov [this message]
2014-03-27 14:22 ` Kirill Smelkov
2014-02-24 16:21 ` [PATCH 17/19] Portable alloca for Git Kirill Smelkov
2014-02-28 10:58 ` Thomas Schwinge
2014-02-28 13:44 ` Erik Faye-Lund
2014-02-28 13:50 ` Erik Faye-Lund
2014-02-28 17:00 ` Kirill Smelkov
2014-02-28 17:19 ` Erik Faye-Lund
2014-03-05 9:31 ` Kirill Smelkov
2014-03-24 21:47 ` Junio C Hamano
2014-03-27 14:22 ` Kirill Smelkov
2014-04-09 12:48 ` Kirill Smelkov
2014-04-09 13:01 ` Erik Faye-Lund
2014-04-10 17:30 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH v2 18/19] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Kirill Smelkov
2014-03-27 14:23 ` Kirill Smelkov
2014-04-04 18:42 ` Junio C Hamano
2014-04-06 21:46 ` Kirill Smelkov
2014-04-07 17:29 ` Junio C Hamano
2014-04-07 20:26 ` Kirill Smelkov
2014-04-07 18:07 ` Junio C Hamano
2014-02-24 16:21 ` [PATCH 19/19] combine-diff: speed it up, by using multiparent diff tree-walker directly Kirill Smelkov
2014-02-24 23:43 ` [PATCH v2 00/19] Multiparent diff tree-walker + combine-diff speedup Duy Nguyen
2014-02-25 10:38 ` Kirill Smelkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140325092320.GC3777@mini.zxlink \
--to=kirr@navytux.spb.ru \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=kirr@mns.spb.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.