* [PATCH] fetch-pack: binary search when storing wanted-refs
@ 2019-03-27 21:11 Jonathan Tan
2019-03-28 3:13 ` Jeff King
0 siblings, 1 reply; 2+ messages in thread
From: Jonathan Tan @ 2019-03-27 21:11 UTC (permalink / raw)
To: git; +Cc: Jonathan Tan
In do_fetch_pack_v2(), the "sought" array is sorted by name, and it is
not subsequently reordered (within the function). Therefore,
receive_wanted_refs() can assume that "sought" is sorted, and can thus
use a binary search when storing wanted-refs retrieved from the server.
Replace the existing linear search with a binary search. This improves
performance significantly when mirror cloning a repository with more
than 1 million refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
fetch-pack.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/fetch-pack.c b/fetch-pack.c
index e69993b2eb..e8266bd45a 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1298,6 +1298,11 @@ static void receive_shallow_info(struct fetch_pack_args *args,
}
}
+static int cmp_name_ref(const void *name, const void *ref)
+{
+ return strcmp(name, (*(struct ref **)ref)->name);
+}
+
static void receive_wanted_refs(struct packet_reader *reader,
struct ref **sought, int nr_sought)
{
@@ -1305,20 +1310,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
struct object_id oid;
const char *end;
- int i;
+ struct ref **found;
if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ')
die(_("expected wanted-ref, got '%s'"), reader->line);
- for (i = 0; i < nr_sought; i++) {
- if (!strcmp(end, sought[i]->name)) {
- oidcpy(&sought[i]->old_oid, &oid);
- break;
- }
- }
-
- if (i == nr_sought)
+ found = bsearch(end, sought, nr_sought, sizeof(*sought),
+ cmp_name_ref);
+ if (!found)
die(_("unexpected wanted-ref: '%s'"), reader->line);
+ oidcpy(&(*found)->old_oid, &oid);
}
if (reader->status != PACKET_READ_DELIM)
--
2.21.0.197.gd478713db0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] fetch-pack: binary search when storing wanted-refs
2019-03-27 21:11 [PATCH] fetch-pack: binary search when storing wanted-refs Jonathan Tan
@ 2019-03-28 3:13 ` Jeff King
0 siblings, 0 replies; 2+ messages in thread
From: Jeff King @ 2019-03-28 3:13 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git
On Wed, Mar 27, 2019 at 02:11:10PM -0700, Jonathan Tan wrote:
> In do_fetch_pack_v2(), the "sought" array is sorted by name, and it is
> not subsequently reordered (within the function). Therefore,
> receive_wanted_refs() can assume that "sought" is sorted, and can thus
> use a binary search when storing wanted-refs retrieved from the server.
>
> Replace the existing linear search with a binary search. This improves
> performance significantly when mirror cloning a repository with more
> than 1 million refs.
This looks good.
The flow in do_fetch_pack_v2() is a little funny. I wanted to
double-check that we always sorted the sought list, because it only
happens in the CHECK_LOCAL state of the state machine.
But as far as I can tell we always begin the function in CHECK_LOCAL, it
always transitions out to another state, and we never go back to it. So
all of that part of the state-machine switch could really just be done
once before entering the state-machine's while loop.
Not really relevant to your patch, but maybe worth tweaking separately
(or maybe not, if people find the all-in-a-state-machine style more
readable; I found it more confusing to reason about).
> +static int cmp_name_ref(const void *name, const void *ref)
> +{
> + return strcmp(name, (*(struct ref **)ref)->name);
> +}
After some errors with qsort comparison functions a while back[1], I
double-checked that this has the right number of asterisks. I believe it
does. :)
-Peff
[1] https://public-inbox.org/git/d1b58614-989f-5998-6c53-c19eee409a2f@web.de/
and the child thread it spawned are interesting reading, though I
don't think we ever followed up on it.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2019-03-28 3:13 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-27 21:11 [PATCH] fetch-pack: binary search when storing wanted-refs Jonathan Tan
2019-03-28 3:13 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).