git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Jeff King <peff@peff.net>
Cc: Martin Fick <mfick@codeaurora.org>,
	Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk()
Date: Tue, 03 Apr 2012 00:54:05 +0200	[thread overview]
Message-ID: <4F7A2E0D.9030402@lsrfire.ath.cx> (raw)
In-Reply-To: <20120402201432.GA26503@sigill.intra.peff.net>

Am 02.04.2012 22:14, schrieb Jeff King:
> On Sun, Apr 01, 2012 at 12:11:01AM +0200, René Scharfe wrote:
>
>> Speed up prepare_revision_walk() by adding commits without sorting
>> to the commit_list and at the end sort the list in one go.  Thanks
>> to mergesort() working behind the scenes, this is a lot faster for
>> large numbers of commits than the current insert sort.
>
> I think this is probably a sane thing to do, but I have two slight
> misgivings:
>
>    1. Is it worth the complexity of the linked-list mergesort? I was
>       planning to just build an array, qsort it, and then put the results
>       into a linked list. The patch for that is below for reference.
>
>       It's a lot less code and complexity for the same performance
>       (actually, I measured it at 1% faster, but that is probably
>       negligible). The downside is that it is not nicely encapsulated in
>       commit_list_sort_by_date(). We call the latter from two other
>       places; I don't know if they can be fed with enough commits to
>       actually benefit from the performance gain or not.

Using a temporary array here is just sad, because linked lists are 
already sortable, albeit not with qsort().  Your measurements seem to 
answer my question regarding the overhead of the callback functions of 
mergesort(), in any case. :)

Adding mergesort() only pays if we have other linked lists that we want 
to sort in-place.  I didn't search thoroughly for such a use case, but I 
think we tended to prefer arrays so far instead.

>       So I wonder if in the long term we would benefit from a better data
>       structure, which would make these problems just go away. That being
>       said, there is a lot of code to be updated with such a change, so
>       even if we do want to do that eventually, a quick fix like this is
>       probably still a good thing.

Using a more appropriate data structure sounds good in general. How 
about using a skip list?  (Or perhaps I need to lay the hammer of linked 
lists to rest for a while to stop seeing all data structures as the 
proverbial nails, or something. ;-)

> Here's the qsort-in-array patch, for reference.

It looks nice and to the point, but breaks several tests for me (t3508, 
t4013, t4041, t4202, t6003, t6009, t6016, t6018 and t7401).  Not sure why.

René

  reply	other threads:[~2012-04-02 22:54 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-30  0:18 Git push performance problems with ~100K refs Martin Fick
2012-03-30  2:12 ` Junio C Hamano
2012-03-30  2:43   ` Martin Fick
2012-03-30  9:32     ` Jeff King
2012-03-30  9:40       ` Jeff King
2012-03-30 14:22         ` Martin Fick
2012-03-31 22:10         ` [PATCH 1/3] add mergesort() for linked lists René Scharfe
2012-04-05 19:17           ` Junio C Hamano
2012-04-08 20:32             ` René Scharfe
2012-04-09 18:26               ` Junio C Hamano
2012-04-11  6:19           ` Stephen Boyd
2012-04-11 16:44             ` Junio C Hamano
2012-03-31 22:10         ` [PATCH 2/3] commit: use mergesort() in commit_list_sort_by_date() René Scharfe
2012-03-31 22:11         ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() René Scharfe
2012-03-31 22:36           ` Martin Fick
2012-03-31 23:45           ` Junio C Hamano
2012-04-02 16:24           ` Martin Fick
2012-04-02 16:39             ` Shawn Pearce
2012-04-02 16:49               ` Martin Fick
2012-04-02 16:51                 ` Shawn Pearce
2012-04-02 20:37                   ` Jeff King
2012-04-02 20:51                     ` Jeff King
2012-04-02 23:16                     ` Martin Fick
2012-04-03  3:49                     ` Nguyen Thai Ngoc Duy
2012-04-03  5:55                       ` Martin Fick
2012-04-03  6:55                         ` [PATCH 0/3] Commit cache Nguyễn Thái Ngọc Duy
2012-04-03  6:55                         ` [PATCH 1/3] parse_commit_buffer: rename a confusing variable name Nguyễn Thái Ngọc Duy
2012-04-03  6:55                         ` [PATCH 2/3] Add commit cache to help speed up commit traversal Nguyễn Thái Ngọc Duy
2012-04-03  6:55                         ` [PATCH 3/3] Add parse_commit_for_rev() to take advantage of sha1-cache Nguyễn Thái Ngọc Duy
2012-04-05 13:02                       ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() Nguyen Thai Ngoc Duy
2012-04-06 19:21                         ` Shawn Pearce
2012-04-07  4:20                           ` Nguyen Thai Ngoc Duy
2012-04-03  3:44                   ` Nguyen Thai Ngoc Duy
2012-04-02 20:14           ` Jeff King
2012-04-02 22:54             ` René Scharfe [this message]
2012-04-03  8:40               ` Jeff King
2012-04-03  9:19                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F7A2E0D.9030402@lsrfire.ath.cx \
    --to=rene.scharfe@lsrfire.ath.cx \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mfick@codeaurora.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).