From: Eric Biggers <ebiggers3@gmail.com>
To: Ondrej Mosnacek <omosnacek@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
"David S. Miller" <davem@davemloft.net>,
linux-crypto@vger.kernel.org, Milan Broz <gmazyland@gmail.com>
Subject: Re: [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h
Date: Thu, 30 Mar 2017 12:55:46 -0700 [thread overview]
Message-ID: <20170330195546.GA60896@gmail.com> (raw)
In-Reply-To: <20170330192535.23123-1-omosnacek@gmail.com>
Hi Ondrej,
On Thu, Mar 30, 2017 at 09:25:35PM +0200, Ondrej Mosnacek wrote:
> The gf128mul_x_ble function is currently defined in gf128mul.c, because
> it depends on the gf128mul_table_be multiplication table.
>
> However, since the function is very small and only uses two values from
> the table, it is better for it to be defined as inline function in
> gf128mul.h. That way, the function can be inlined by the compiler for
> better performance.
>
> After this change, the speed of the generic 'xts(aes)' implementation
> increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
> benchmark' on an Intel system with CRYPTO_AES_X86_64 and
> CRYPTO_AES_NI_INTEL disabled).
>
> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
...
>
> -/* multiply by x in ble format, needed by XTS */
> -void gf128mul_x_ble(be128 *a, const be128 *b);
> +/* Multiply by x in ble format, needed by XTS.
> + * Defined here for performance. */
> +static inline void gf128mul_x_ble(be128 *r, const be128 *x)
> +{
> + u64 a = le64_to_cpu(x->a);
> + u64 b = le64_to_cpu(x->b);
> + /* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
> + u64 _tt = (b & ((u64)1 << 63)) ? 0x87 : 0x00;
> +
> + r->a = cpu_to_le64((a << 1) ^ _tt);
> + r->b = cpu_to_le64((b << 1) | (a >> 63));
> +}
>
> /* 4k table optimization */
>
> --
> 2.9.3
>
This is an improvement; I'm just thinking that maybe this should be done for all
the gf128mul_x_*() functions, if only so that they use a consistent style and
are all defined next to each other.
Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore makes the
new version more efficient than one might expect:
sar $0x3f,%rax
and $0x87,%eax
It could even be written the branchless way explicitly, but it shouldn't matter.
- Eric
next prev parent reply other threads:[~2017-03-30 19:55 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-30 19:25 [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h Ondrej Mosnacek
2017-03-30 19:55 ` Eric Biggers [this message]
2017-03-30 21:32 ` Ondrej Mosnáček
2017-03-31 6:05 ` Jeffrey Walton
2017-03-31 9:17 ` Ondrej Mosnáček
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170330195546.GA60896@gmail.com \
--to=ebiggers3@gmail.com \
--cc=davem@davemloft.net \
--cc=gmazyland@gmail.com \
--cc=herbert@gondor.apana.org.au \
--cc=linux-crypto@vger.kernel.org \
--cc=omosnacek@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.