All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers3@gmail.com>
To: Ondrej Mosnacek <omosnacek@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-crypto@vger.kernel.org, Milan Broz <gmazyland@gmail.com>
Subject: Re: [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h
Date: Thu, 30 Mar 2017 12:55:46 -0700	[thread overview]
Message-ID: <20170330195546.GA60896@gmail.com> (raw)
In-Reply-To: <20170330192535.23123-1-omosnacek@gmail.com>

Hi Ondrej,

On Thu, Mar 30, 2017 at 09:25:35PM +0200, Ondrej Mosnacek wrote:
> The gf128mul_x_ble function is currently defined in gf128mul.c, because
> it depends on the gf128mul_table_be multiplication table.
> 
> However, since the function is very small and only uses two values from
> the table, it is better for it to be defined as inline function in
> gf128mul.h. That way, the function can be inlined by the compiler for
> better performance.
> 
> After this change, the speed of the generic 'xts(aes)' implementation
> increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
> benchmark' on an Intel system with CRYPTO_AES_X86_64 and
> CRYPTO_AES_NI_INTEL disabled).
> 
> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
...
>  
> -/* multiply by x in ble format, needed by XTS */
> -void gf128mul_x_ble(be128 *a, const be128 *b);
> +/* Multiply by x in ble format, needed by XTS.
> + * Defined here for performance. */
> +static inline void gf128mul_x_ble(be128 *r, const be128 *x)
> +{
> +	u64 a = le64_to_cpu(x->a);
> +	u64 b = le64_to_cpu(x->b);
> +	/* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
> +	u64 _tt = (b & ((u64)1 << 63)) ? 0x87 : 0x00;
> +
> +	r->a = cpu_to_le64((a << 1) ^ _tt);
> +	r->b = cpu_to_le64((b << 1) | (a >> 63));
> +}
>  
>  /* 4k table optimization */
>  
> -- 
> 2.9.3
> 

This is an improvement; I'm just thinking that maybe this should be done for all
the gf128mul_x_*() functions, if only so that they use a consistent style and
are all defined next to each other.

Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore makes the
new version more efficient than one might expect:

	sar    $0x3f,%rax
	and    $0x87,%eax

It could even be written the branchless way explicitly, but it shouldn't matter.

- Eric

  reply	other threads:[~2017-03-30 19:55 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-30 19:25 [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h Ondrej Mosnacek
2017-03-30 19:55 ` Eric Biggers [this message]
2017-03-30 21:32   ` Ondrej Mosnáček
2017-03-31  6:05     ` Jeffrey Walton
2017-03-31  9:17       ` Ondrej Mosnáček

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170330195546.GA60896@gmail.com \
    --to=ebiggers3@gmail.com \
    --cc=davem@davemloft.net \
    --cc=gmazyland@gmail.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=omosnacek@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.