All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Derrick Stolee <dstolee@microsoft.com>
Cc: git@vger.kernel.org, stolee@gmail.com
Subject: Re: [PATCH] sha1_name: use bsearch_hash() for abbreviations
Date: Tue, 20 Mar 2018 15:25:05 -0700	[thread overview]
Message-ID: <20180320152505.bd66f0deaecf6d92fa6d62de@google.com> (raw)
In-Reply-To: <20180320200325.168147-1-dstolee@microsoft.com>

On Tue, 20 Mar 2018 16:03:25 -0400
Derrick Stolee <dstolee@microsoft.com> wrote:

> This patch updates the abbreviation code to use bsearch_hash() as defined
> in [1]. It gets a nice speedup since the old implementation did not use
> the fanout table at all.

You can refer to the patch as:

  b4e00f7306a1 ("packfile: refactor hash search with fanout table",
  2018-02-15)

Also, might be worth noting that this patch builds on
jt/binsearch-with-fanout.

> One caveat about the patch: there is a place where I cast a sha1 hash
> into a struct object_id pointer. This is because the abbreviation code
> still uses 'const unsigned char *' instead of structs. I wanted to avoid
> a hashcpy() in these calls, but perhaps that is not too heavy a cost.

I recall a discussion that there were alignment issues with doing this,
but I might have be remembering wrongly - in my limited knowledge of C
alignment, both "unsigned char *" and "struct object_id *" have the same
constraints, but I'm not sure.

> +	const unsigned char *index_fanout = p->index_data;
[snip]
> +	return bsearch_hash(oid->hash, (const uint32_t*)index_fanout,
> +			    index_lookup, index_lookup_width, result);

This cast to "const uint32_t *" is safe, because p->index_data points to
a mmap-ed region (which has very good alignment, as far as I know). I
wonder if we should document alignment guarantees on p->index_data, and
if yes, what guarantees to declare.

  reply	other threads:[~2018-03-20 22:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20 20:03 [PATCH] sha1_name: use bsearch_hash() for abbreviations Derrick Stolee
2018-03-20 22:25 ` Jonathan Tan [this message]
2018-03-21 13:24   ` Derrick Stolee
2018-03-21 22:42     ` brian m. carlson
2018-03-22 17:40       ` [PATCH v2 0/3] Use " Derrick Stolee
2018-03-22 17:40         ` [PATCH v2 1/3] sha1_name: convert struct min_abbrev_data to object_id Derrick Stolee
2018-03-22 17:40         ` [PATCH v2 2/3] packfile: define and use bsearch_pack() Derrick Stolee
2018-03-22 17:40         ` [PATCH v2 3/3] sha1_name: use bsearch_pack() for abbreviations Derrick Stolee
2018-03-24 16:41         ` [PATCH 4/3] sha1_name: use bsearch_pack() in unique_in_pack() René Scharfe
2018-03-25 16:19           ` Junio C Hamano
2018-03-25 16:32             ` René Scharfe
2018-03-25 18:21           ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180320152505.bd66f0deaecf6d92fa6d62de@google.com \
    --to=jonathantanmy@google.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.