From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 21 Nov 2023 14:52:16 +0100
In-Reply-To: <20231121135212.2235251-5-ardb@google.com>
References: <20231121135212.2235251-5-ardb@google.com>
Message-ID: <20231121135212.2235251-8-ardb@google.com>
Subject: [PATCH 3/3] arm64: crypto: remove conditional yield logic
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
 Catalin Marinas, Mark Brown
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

From: Ard Biesheuvel

Some classes of crypto algorithms (such as skciphers or aeads) have
natural yield points, but SIMD based shashes yield the NEON unit
manually to avoid causing scheduling blackouts when operating on large
inputs.

This is no longer necessary now that kernel mode NEON runs with
preemption enabled, so remove this logic from the crypto code.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-glue.c       | 21 +++++---------
 arch/arm64/crypto/aes-modes.S      |  2 --
 arch/arm64/crypto/sha1-ce-core.S   |  2 --
 arch/arm64/crypto/sha1-ce-glue.c   | 19 ++++---------
 arch/arm64/crypto/sha2-ce-core.S   |  2 --
 arch/arm64/crypto/sha2-ce-glue.c   | 19 ++++---------
 arch/arm64/crypto/sha3-ce-core.S   |  4 +--
 arch/arm64/crypto/sha3-ce-glue.c   | 14 ++++------
 arch/arm64/crypto/sha512-ce-core.S |  2 --
 arch/arm64/crypto/sha512-ce-glue.c | 16 ++++-------
 arch/arm64/include/asm/assembler.h | 29 --------------------
 arch/arm64/kernel/asm-offsets.c    |  4 ---
 12 files changed, 30 insertions(+), 104 deletions(-)

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 162787c7aa86..c42c903b7d60 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -109,9 +109,9 @@ asmlinkage void aes_essiv_cbc_decrypt(u8 out[], u8 const in[], u32 const rk1[],
 				      int rounds, int blocks, u8 iv[],
 				      u32 const rk2[]);
 
-asmlinkage int aes_mac_update(u8 const in[], u32 const rk[], int rounds,
-			      int blocks, u8 dg[], int enc_before,
-			      int enc_after);
+asmlinkage void aes_mac_update(u8 const in[], u32 const rk[], int rounds,
+			       int blocks, u8 dg[], int enc_before,
+			       int enc_after);
 
 struct crypto_aes_xts_ctx {
 	struct crypto_aes_ctx key1;
@@ -880,17 +880,10 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
 	int rounds = 6 + ctx->key_length / 4;
 
 	if (crypto_simd_usable()) {
-		int rem;
-
-		do {
-			kernel_neon_begin();
-			rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
-					     dg, enc_before, enc_after);
-			kernel_neon_end();
-			in += (blocks - rem) * AES_BLOCK_SIZE;
-			blocks = rem;
-			enc_before = 0;
-		} while (blocks);
+		kernel_neon_begin();
+		aes_mac_update(in, ctx->key_enc, rounds, blocks, dg,
+			       enc_before, enc_after);
+		kernel_neon_end();
 	} else {
 		if (enc_before)
 			aes_encrypt(ctx, dg, dg);
diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 0e834a2c062c..4d68853d0caf 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -842,7 +842,6 @@ AES_FUNC_START(aes_mac_update)
 	cbz	w5, .Lmacout
 	encrypt_block	v0, w2, x1, x7, w8
 	st1	{v0.16b}, [x4]			/* return dg */
-	cond_yield	.Lmacout, x7, x8
 	b	.Lmacloop4x
 .Lmac1x:
 	add	w3, w3, #4
@@ -861,6 +860,5 @@ AES_FUNC_START(aes_mac_update)
 
 .Lmacout:
 	st1	{v0.16b}, [x4]			/* return dg */
-	mov	w0, w3
 	ret
 AES_FUNC_END(aes_mac_update)
diff --git a/arch/arm64/crypto/sha1-ce-core.S b/arch/arm64/crypto/sha1-ce-core.S
index 9b1f2d82a6fe..113deb09dbf6 100644
--- a/arch/arm64/crypto/sha1-ce-core.S
+++ b/arch/arm64/crypto/sha1-ce-core.S
@@ -121,7 +121,6 @@ CPU_LE(	rev32	v11.16b, v11.16b	)
 	add	dgav.4s, dgav.4s, dg0v.4s
 
 	cbz	w2, 2f
-	cond_yield	3f, x5, x6
 	b	0b
 
 	/*
@@ -145,6 +144,5 @@ CPU_LE(	rev32	v11.16b, v11.16b	)
 	/* store new state */
 3:	st1	{dgav.4s}, [x0]
 	str	dgb, [x0, #16]
-	mov	w0, w2
 	ret
 SYM_FUNC_END(__sha1_ce_transform)
diff --git a/arch/arm64/crypto/sha1-ce-glue.c b/arch/arm64/crypto/sha1-ce-glue.c
index 1dd93e1fcb39..c1c5c5cb104b 100644
--- a/arch/arm64/crypto/sha1-ce-glue.c
+++ b/arch/arm64/crypto/sha1-ce-glue.c
@@ -29,23 +29,16 @@ struct sha1_ce_state {
 extern const u32 sha1_ce_offsetof_count;
 extern const u32 sha1_ce_offsetof_finalize;
 
-asmlinkage int __sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
-				   int blocks);
+asmlinkage void __sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
+				    int blocks);
 
 static void sha1_ce_transform(struct sha1_state *sst, u8 const *src,
 			      int blocks)
 {
-	while (blocks) {
-		int rem;
-
-		kernel_neon_begin();
-		rem = __sha1_ce_transform(container_of(sst,
-						       struct sha1_ce_state,
-						       sst), src, blocks);
-		kernel_neon_end();
-		src += (blocks - rem) * SHA1_BLOCK_SIZE;
-		blocks = rem;
-	}
+	kernel_neon_begin();
+	__sha1_ce_transform(container_of(sst, struct sha1_ce_state, sst), src,
+			    blocks);
+	kernel_neon_end();
 }
 
 const u32 sha1_ce_offsetof_count = offsetof(struct sha1_ce_state, sst.count);
diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S
index fce84d88ddb2..c9629992d3cf 100644
--- a/arch/arm64/crypto/sha2-ce-core.S
+++ b/arch/arm64/crypto/sha2-ce-core.S
@@ -129,7 +129,6 @@ CPU_LE(	rev32	v19.16b, v19.16b	)
 
 	/* handled all input blocks? */
 	cbz	w2, 2f
-	cond_yield	3f, x5, x6
 	b	0b
 
 	/*
@@ -152,6 +151,5 @@ CPU_LE(	rev32	v19.16b, v19.16b	)
 
 	/* store new state */
 3:	st1	{dgav.4s, dgbv.4s}, [x0]
-	mov	w0, w2
 	ret
 SYM_FUNC_END(__sha256_ce_transform)
diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
index 0a44d2e7ee1f..f785a66a1de4 100644
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ b/arch/arm64/crypto/sha2-ce-glue.c
@@ -30,23 +30,16 @@ struct sha256_ce_state {
 extern const u32 sha256_ce_offsetof_count;
 extern const u32 sha256_ce_offsetof_finalize;
 
-asmlinkage int __sha256_ce_transform(struct sha256_ce_state *sst, u8 const *src,
-				     int blocks);
+asmlinkage void __sha256_ce_transform(struct sha256_ce_state *sst, u8 const *src,
+				      int blocks);
 
 static void sha256_ce_transform(struct sha256_state *sst, u8 const *src,
 				int blocks)
 {
-	while (blocks) {
-		int rem;
-
-		kernel_neon_begin();
-		rem = __sha256_ce_transform(container_of(sst,
-							 struct sha256_ce_state,
-							 sst), src, blocks);
-		kernel_neon_end();
-		src += (blocks - rem) * SHA256_BLOCK_SIZE;
-		blocks = rem;
-	}
+	kernel_neon_begin();
+	__sha256_ce_transform(container_of(sst, struct sha256_ce_state, sst),
+			      src, blocks);
+	kernel_neon_end();
 }
 
 const u32 sha256_ce_offsetof_count = offsetof(struct sha256_ce_state,
diff --git a/arch/arm64/crypto/sha3-ce-core.S b/arch/arm64/crypto/sha3-ce-core.S
index 9c77313f5a60..10c74f19054d 100644
--- a/arch/arm64/crypto/sha3-ce-core.S
+++ b/arch/arm64/crypto/sha3-ce-core.S
@@ -184,18 +184,16 @@ SYM_FUNC_START(sha3_ce_transform)
 	eor	 v0.16b,  v0.16b, v31.16b
 
 	cbnz	w8, 3b
-	cond_yield	4f, x8, x9
 	cbnz	w2, 0b
 
 	/* save state */
-4:	st1	{ v0.1d- v3.1d}, [x0], #32
+	st1	{ v0.1d- v3.1d}, [x0], #32
 	st1	{ v4.1d- v7.1d}, [x0], #32
 	st1	{ v8.1d-v11.1d}, [x0], #32
 	st1	{v12.1d-v15.1d}, [x0], #32
 	st1	{v16.1d-v19.1d}, [x0], #32
 	st1	{v20.1d-v23.1d}, [x0], #32
 	st1	{v24.1d}, [x0]
-	mov	w0, w2
 	ret
 SYM_FUNC_END(sha3_ce_transform)
diff --git a/arch/arm64/crypto/sha3-ce-glue.c b/arch/arm64/crypto/sha3-ce-glue.c
index 250e1377c481..d689cd2bf4cf 100644
--- a/arch/arm64/crypto/sha3-ce-glue.c
+++ b/arch/arm64/crypto/sha3-ce-glue.c
@@ -28,8 +28,8 @@ MODULE_ALIAS_CRYPTO("sha3-256");
 MODULE_ALIAS_CRYPTO("sha3-384");
 MODULE_ALIAS_CRYPTO("sha3-512");
 
-asmlinkage int sha3_ce_transform(u64 *st, const u8 *data, int blocks,
-				 int md_len);
+asmlinkage void sha3_ce_transform(u64 *st, const u8 *data, int blocks,
+				  int md_len);
 
 static int sha3_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len)
@@ -59,15 +59,11 @@ static int sha3_update(struct shash_desc *desc, const u8 *data,
 		blocks = len / sctx->rsiz;
 		len %= sctx->rsiz;
 
-		while (blocks) {
-			int rem;
-
+		if (blocks) {
 			kernel_neon_begin();
-			rem = sha3_ce_transform(sctx->st, data, blocks,
-						digest_size);
+			sha3_ce_transform(sctx->st, data, blocks, digest_size);
 			kernel_neon_end();
-			data += (blocks - rem) * sctx->rsiz;
-			blocks = rem;
+			data += blocks * sctx->rsiz;
 		}
 	}
diff --git a/arch/arm64/crypto/sha512-ce-core.S b/arch/arm64/crypto/sha512-ce-core.S
index 91ef68b15fcc..bfaa4f591290 100644
--- a/arch/arm64/crypto/sha512-ce-core.S
+++ b/arch/arm64/crypto/sha512-ce-core.S
@@ -195,12 +195,10 @@ CPU_LE(	rev64	v19.16b, v19.16b	)
 	add	v10.2d, v10.2d, v2.2d
 	add	v11.2d, v11.2d, v3.2d
 
-	cond_yield	3f, x4, x5
 	/* handled all input blocks? */
 	cbnz	w2, 0b
 
 	/* store new state */
 3:	st1	{v8.2d-v11.2d}, [x0]
-	mov	w0, w2
 	ret
 SYM_FUNC_END(__sha512_ce_transform)
diff --git a/arch/arm64/crypto/sha512-ce-glue.c b/arch/arm64/crypto/sha512-ce-glue.c
index f3431fc62315..70eef74fe031 100644
--- a/arch/arm64/crypto/sha512-ce-glue.c
+++ b/arch/arm64/crypto/sha512-ce-glue.c
@@ -26,23 +26,17 @@ MODULE_LICENSE("GPL v2");
 MODULE_ALIAS_CRYPTO("sha384");
 MODULE_ALIAS_CRYPTO("sha512");
 
-asmlinkage int __sha512_ce_transform(struct sha512_state *sst, u8 const *src,
-				     int blocks);
+asmlinkage void __sha512_ce_transform(struct sha512_state *sst, u8 const *src,
+				      int blocks);
 
 asmlinkage void sha512_block_data_order(u64 *digest, u8 const *src, int blocks);
 
 static void sha512_ce_transform(struct sha512_state *sst, u8 const *src,
 				int blocks)
 {
-	while (blocks) {
-		int rem;
-
-		kernel_neon_begin();
-		rem = __sha512_ce_transform(sst, src, blocks);
-		kernel_neon_end();
-		src += (blocks - rem) * SHA512_BLOCK_SIZE;
-		blocks = rem;
-	}
+	kernel_neon_begin();
+	__sha512_ce_transform(sst, src, blocks);
+	kernel_neon_end();
 }
 
 static void sha512_arm64_transform(struct sha512_state *sst, u8 const *src,
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 376a980f2bad..f0da53a0388f 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -759,35 +759,6 @@ alternative_endif
 	set_sctlr	sctlr_el2, \reg
 .endm
 
-	/*
-	 * Check whether preempt/bh-disabled asm code should yield as soon as
-	 * it is able. This is the case if we are currently running in task
-	 * context, and either a softirq is pending, or the TIF_NEED_RESCHED
-	 * flag is set and re-enabling preemption a single time would result in
-	 * a preempt count of zero. (Note that the TIF_NEED_RESCHED flag is
-	 * stored negated in the top word of the thread_info::preempt_count
-	 * field)
-	 */
-	.macro	cond_yield, lbl:req, tmp:req, tmp2:req
-	get_current_task \tmp
-	ldr	\tmp, [\tmp, #TSK_TI_PREEMPT]
-	/*
-	 * If we are serving a softirq, there is no point in yielding: the
-	 * softirq will not be preempted no matter what we do, so we should
-	 * run to completion as quickly as we can.
-	 */
-	tbnz	\tmp, #SOFTIRQ_SHIFT, .Lnoyield_\@
-#ifdef CONFIG_PREEMPTION
-	sub	\tmp, \tmp, #PREEMPT_DISABLE_OFFSET
-	cbz	\tmp, \lbl
-#endif
-	adr_l	\tmp, irq_stat + IRQ_CPUSTAT_SOFTIRQ_PENDING
-	get_this_cpu_offset \tmp2
-	ldr	w\tmp, [\tmp, \tmp2]
-	cbnz	w\tmp, \lbl	// yield on pending softirq in task context
-.Lnoyield_\@:
-	.endm
-
 /*
  * Branch Target Identifier (BTI)
  */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5ff1942b04fc..fb9e9ef9b527 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -116,10 +116,6 @@ int main(void)
   DEFINE(DMA_TO_DEVICE,		DMA_TO_DEVICE);
   DEFINE(DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
   BLANK();
-  DEFINE(PREEMPT_DISABLE_OFFSET, PREEMPT_DISABLE_OFFSET);
-  DEFINE(SOFTIRQ_SHIFT,		SOFTIRQ_SHIFT);
-  DEFINE(IRQ_CPUSTAT_SOFTIRQ_PENDING, offsetof(irq_cpustat_t, __softirq_pending));
-  BLANK();
   DEFINE(CPU_BOOT_TASK,		offsetof(struct secondary_data, task));
   BLANK();
   DEFINE(FTR_OVR_VAL_OFFSET,	offsetof(struct arm64_ftr_override, val));
-- 
2.43.0.rc1.413.gea7ed67945-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel