All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: linux-kernel@vger.kernel.org, patches@lists.linux.dev,
	tglx@linutronix.de, linux-crypto@vger.kernel.org,
	linux-api@vger.kernel.org, x86@kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	Carlos O'Donell <carlos@redhat.com>,
	Florian Weimer <fweimer@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>, Jann Horn <jannh@google.com>,
	Christian Brauner <brauner@kernel.org>,
	Samuel Neves <sneves@dei.uc.pt>
Subject: Re: [PATCH v13 7/7] x86: vdso: Wire up getrandom() vDSO implementation
Date: Wed, 21 Dec 2022 15:27:04 -0800	[thread overview]
Message-ID: <Y6OWSM18QL977nbC@sol.localdomain> (raw)
In-Reply-To: <20221221142327.126451-8-Jason@zx2c4.com>

On Wed, Dec 21, 2022 at 03:23:27PM +0100, Jason A. Donenfeld wrote:
> diff --git a/arch/x86/entry/vdso/vgetrandom-chacha.S b/arch/x86/entry/vdso/vgetrandom-chacha.S
> new file mode 100644
> index 000000000000..91fbb7ac7af4
> --- /dev/null
> +++ b/arch/x86/entry/vdso/vgetrandom-chacha.S
> @@ -0,0 +1,177 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2022 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/frame.h>
> +
> +.section	.rodata.cst16.CONSTANTS, "aM", @progbits, 16
> +.align 16
> +CONSTANTS:	.octa 0x6b20657479622d323320646e61707865
> +.text

For simplicity, maybe leave off the section mergeability stuff and just have
plain ".section .rodata"?

> +/*
> + * Very basic SSE2 implementation of ChaCha20. Produces a given positive number
> + * of blocks of output with a nonce of 0, taking an input key and 8-byte
> + * counter. Importantly does not spill to the stack. Its arguments are:
> + *
> + *	rdi: output bytes
> + *	rsi: 32-byte key input
> + *	rdx: 8-byte counter input/output
> + *	rcx: number of 64-byte blocks to write to output
> + */
> +SYM_FUNC_START(__arch_chacha20_blocks_nostack)
> +
> +#define output  %rdi
> +#define key     %rsi
> +#define counter %rdx
> +#define nblocks %rcx
> +#define i       %al
> +#define state0  %xmm0
> +#define state1  %xmm1
> +#define state2  %xmm2
> +#define state3  %xmm3
> +#define copy0   %xmm4
> +#define copy1   %xmm5
> +#define copy2   %xmm6
> +#define copy3   %xmm7
> +#define temp    %xmm8
> +#define one     %xmm9

It would be worth mentioning in the function comment that none of the xmm
registers are callee-save.  That was not obvious to me.  I know that on arm64,
*kernel* code doesn't need to save/restore NEON registers, so it's not something
that arch/arm64/crypto/ does.  But, it *is* needed in arm64 userspace code.  So
I was worried that something similar would apply to x86_64, but it seems not.

> +	/* state1[0,1,2,3] = state1[0,3,2,1] */
> +	pshufd		$0x39,state1,state1
> +	/* state2[0,1,2,3] = state2[1,0,3,2] */
> +	pshufd		$0x4e,state2,state2
> +	/* state3[0,1,2,3] = state3[2,1,0,3] */
> +	pshufd		$0x93,state3,state3

The comments don't match the pshufd constants.  The code is correct but the
comments are not.  They should be:

	/* state1[0,1,2,3] = state1[1,2,3,0] */
	pshufd		$0x39,state1,state1
	/* state2[0,1,2,3] = state2[2,3,0,1] */
	pshufd		$0x4e,state2,state2
	/* state3[0,1,2,3] = state3[3,0,1,2] */
	pshufd		$0x93,state3,state3

> +	/* state0 += state1, state3 = rotl32(state3 ^ state0, 16) */
> +	paddd		state1,state0
> +	pxor		state0,state3
> +	movdqa		state3,temp
> +	pslld		$16,temp
> +	psrld		$16,state3
> +	por		temp,state3
> +
> +	/* state2 += state3, state1 = rotl32(state1 ^ state2, 12) */
> +	paddd		state3,state2
> +	pxor		state2,state1
> +	movdqa		state1,temp
> +	pslld		$12,temp
> +	psrld		$20,state1
> +	por		temp,state1
> +
> +	/* state0 += state1, state3 = rotl32(state3 ^ state0, 8) */
> +	paddd		state1,state0
> +	pxor		state0,state3
> +	movdqa		state3,temp
> +	pslld		$8,temp
> +	psrld		$24,state3
> +	por		temp,state3
> +
> +	/* state2 += state3, state1 = rotl32(state1 ^ state2, 7) */
> +	paddd		state3,state2
> +	pxor		state2,state1
> +	movdqa		state1,temp
> +	pslld		$7,temp
> +	psrld		$25,state1
> +	por		temp,state1

The above sequence of 24 instructions is repeated twice, so maybe it should be a
macro (".chacha_round"?).

> +	/* state1[0,1,2,3] = state1[2,1,0,3] */
> +	pshufd		$0x93,state1,state1
> +	/* state2[0,1,2,3] = state2[1,0,3,2] */
> +	pshufd		$0x4e,state2,state2
> +	/* state3[0,1,2,3] = state3[0,3,2,1] */
> +	pshufd		$0x39,state3,state3

Similarly, the above comments are wrong.  They should be:

	/* state1[0,1,2,3] = state1[3,0,1,2] */
	pshufd		$0x93,state1,state1
	/* state2[0,1,2,3] = state2[2,3,0,1] */
	pshufd		$0x4e,state2,state2
	/* state3[0,1,2,3] = state3[1,2,3,0] */
	pshufd		$0x39,state3,state3

- Eric

  reply	other threads:[~2022-12-21 23:27 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-21 14:23 [PATCH v13 0/7] implement getrandom() in vDSO Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 1/7] x86: lib: Separate instruction decoder MMIO type from MMIO trace Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 2/7] mm: add VM_DROPPABLE for designating always lazily freeable mappings Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 3/7] x86: mm: Skip faulting instruction for VM_DROPPABLE faults Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 4/7] random: add vgetrandom_alloc() syscall Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 5/7] arch: allocate vgetrandom_alloc() syscall number Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 6/7] random: introduce generic vDSO getrandom() implementation Jason A. Donenfeld
2022-12-21 22:23   ` Eric Biggers
2023-01-01 15:53     ` Jason A. Donenfeld
2022-12-21 14:23 ` [PATCH v13 7/7] x86: vdso: Wire up getrandom() vDSO implementation Jason A. Donenfeld
2022-12-21 23:27   ` Eric Biggers [this message]
2023-01-01 16:21     ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y6OWSM18QL977nbC@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=Jason@zx2c4.com \
    --cc=adhemerval.zanella@linaro.org \
    --cc=arnd@arndb.de \
    --cc=brauner@kernel.org \
    --cc=carlos@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=sneves@dei.uc.pt \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.