From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
	Eric Biggers, Sebastian Andrzej Siewior
Date: Mon, 27 Nov 2023 13:23:05 +0100
Subject: [PATCH v3 5/5] arm64: crypto: Remove FPSIMD yield logic from glue code
Message-ID: <20231127122259.2265164-12-ardb@google.com>
In-Reply-To: <20231127122259.2265164-7-ardb@google.com>
References: <20231127122259.2265164-7-ardb@google.com>
X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog

From: Ard Biesheuvel

A previous patch already removed the assembler logic that was used to
check periodically whether a task has its TIF_NEED_RESCHED set, and to
yield the FPSIMD unit and the timeslice if this is the case. This is no
longer necessary now that we no longer disable preemption when using the
FPSIMD in kernel mode.

Let's also remove the remaining C logic that yields the FPSIMD unit
after every 4 KiB of input, which is arguably worse in terms of
overhead, given that it is unconditional and therefore mostly
unnecessary.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c      |  5 ----
 arch/arm64/crypto/chacha-neon-glue.c     | 14 ++-------
 arch/arm64/crypto/crct10dif-ce-glue.c    | 30 ++++----------------
 arch/arm64/crypto/nhpoly1305-neon-glue.c | 12 ++------
 arch/arm64/crypto/poly1305-glue.c        | 15 +++-------
 arch/arm64/crypto/polyval-ce-glue.c      |  5 ++--
 6 files changed, 18 insertions(+), 63 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 25cd3808ecbe..a92ca6de1f96 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -125,16 +125,11 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 			scatterwalk_start(&walk, sg_next(walk.sg));
 			n = scatterwalk_clamp(&walk, len);
 		}
-		n = min_t(u32, n, SZ_4K); /* yield NEON at least every 4k */
 		p = scatterwalk_map(&walk);
 
 		macp = ce_aes_ccm_auth_data(mac, p, n, macp, ctx->key_enc,
 					    num_rounds(ctx));
 
-		if (len / SZ_4K > (len - n) / SZ_4K) {
-			kernel_neon_end();
-			kernel_neon_begin();
-		}
 		len -= n;
 
 		scatterwalk_unmap(p);
diff --git a/arch/arm64/crypto/chacha-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c
index af2bbca38e70..37ca3e889848 100644
--- a/arch/arm64/crypto/chacha-neon-glue.c
+++ b/arch/arm64/crypto/chacha-neon-glue.c
@@ -87,17 +87,9 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 	    !crypto_simd_usable())
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);
 
-	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
-
-		kernel_neon_begin();
-		chacha_doneon(state, dst, src, todo, nrounds);
-		kernel_neon_end();
-
-		bytes -= todo;
-		src += todo;
-		dst += todo;
-	} while (bytes);
+	kernel_neon_begin();
+	chacha_doneon(state, dst, src, bytes, nrounds);
+	kernel_neon_end();
 }
 EXPORT_SYMBOL(chacha_crypt_arch);
 
diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
index 09eb1456aed4..ccc3f6067742 100644
--- a/arch/arm64/crypto/crct10dif-ce-glue.c
+++ b/arch/arm64/crypto/crct10dif-ce-glue.c
@@ -37,18 +37,9 @@ static int crct10dif_update_pmull_p8(struct shash_desc *desc, const u8 *data,
 	u16 *crc = shash_desc_ctx(desc);
 
 	if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && crypto_simd_usable()) {
-		do {
-			unsigned int chunk = length;
-
-			if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
-				chunk = SZ_4K;
-
-			kernel_neon_begin();
-			*crc = crc_t10dif_pmull_p8(*crc, data, chunk);
-			kernel_neon_end();
-			data += chunk;
-			length -= chunk;
-		} while (length);
+		kernel_neon_begin();
+		*crc = crc_t10dif_pmull_p8(*crc, data, length);
+		kernel_neon_end();
 	} else {
 		*crc = crc_t10dif_generic(*crc, data, length);
 	}
@@ -62,18 +53,9 @@ static int crct10dif_update_pmull_p64(struct shash_desc *desc, const u8 *data,
 	u16 *crc = shash_desc_ctx(desc);
 
 	if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && crypto_simd_usable()) {
-		do {
-			unsigned int chunk = length;
-
-			if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
-				chunk = SZ_4K;
-
-			kernel_neon_begin();
-			*crc = crc_t10dif_pmull_p64(*crc, data, chunk);
-			kernel_neon_end();
-			data += chunk;
-			length -= chunk;
-		} while (length);
+		kernel_neon_begin();
+		*crc = crc_t10dif_pmull_p64(*crc, data, length);
+		kernel_neon_end();
 	} else {
 		*crc = crc_t10dif_generic(*crc, data, length);
 	}
diff --git a/arch/arm64/crypto/nhpoly1305-neon-glue.c b/arch/arm64/crypto/nhpoly1305-neon-glue.c
index e4a0b463f080..7df0ab811c4e 100644
--- a/arch/arm64/crypto/nhpoly1305-neon-glue.c
+++ b/arch/arm64/crypto/nhpoly1305-neon-glue.c
@@ -22,15 +22,9 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
 	if (srclen < 64 || !crypto_simd_usable())
 		return crypto_nhpoly1305_update(desc, src, srclen);
 
-	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
-
-		kernel_neon_begin();
-		crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
-		kernel_neon_end();
-		src += n;
-		srclen -= n;
-	} while (srclen);
+	kernel_neon_begin();
+	crypto_nhpoly1305_update_helper(desc, src, srclen, nh_neon);
+	kernel_neon_end();
 	return 0;
 }
 
diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c
index 1fae18ba11ed..326871897d5d 100644
--- a/arch/arm64/crypto/poly1305-glue.c
+++ b/arch/arm64/crypto/poly1305-glue.c
@@ -143,20 +143,13 @@ void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src,
 		unsigned int len = round_down(nbytes, POLY1305_BLOCK_SIZE);
 
 		if (static_branch_likely(&have_neon) && crypto_simd_usable()) {
-			do {
-				unsigned int todo = min_t(unsigned int, len, SZ_4K);
-
-				kernel_neon_begin();
-				poly1305_blocks_neon(&dctx->h, src, todo, 1);
-				kernel_neon_end();
-
-				len -= todo;
-				src += todo;
-			} while (len);
+			kernel_neon_begin();
+			poly1305_blocks_neon(&dctx->h, src, len, 1);
+			kernel_neon_end();
 		} else {
 			poly1305_blocks(&dctx->h, src, len, 1);
-			src += len;
 		}
+		src += len;
 		nbytes %= POLY1305_BLOCK_SIZE;
 	}
 
diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
index 0a3b5718df85..8c83e5f44e51 100644
--- a/arch/arm64/crypto/polyval-ce-glue.c
+++ b/arch/arm64/crypto/polyval-ce-glue.c
@@ -122,9 +122,8 @@ static int polyval_arm64_update(struct shash_desc *desc,
 					    tctx->key_powers[NUM_KEY_POWERS-1]);
 	}
 
-	while (srclen >= POLYVAL_BLOCK_SIZE) {
-		/* allow rescheduling every 4K bytes */
-		nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
+	if (srclen >= POLYVAL_BLOCK_SIZE) {
+		nblocks = srclen / POLYVAL_BLOCK_SIZE;
 		internal_polyval_update(tctx, src, nblocks, dctx->buffer);
 		srclen -= nblocks * POLYVAL_BLOCK_SIZE;
 		src += nblocks * POLYVAL_BLOCK_SIZE;
-- 
2.43.0.rc1.413.gea7ed67945-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
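
For readers who want to see the before/after shape of the glue code outside the kernel tree, the following is a minimal user-space sketch and not part of the patch. It assumes stub versions of kernel_neon_begin() and kernel_neon_end() that only count how many FPSIMD sections are entered, and a hypothetical process() helper (a plain byte sum) standing in for a NEON routine such as chacha_doneon(). The names update_chunked(), update_single() and check_equivalence() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SZ_4K 4096u

/*
 * Hypothetical user-space stand-ins for the kernel's kernel_neon_begin()
 * and kernel_neon_end(). They only count how many sections are entered.
 */
static unsigned int neon_sections;

static void kernel_neon_begin(void) { neon_sections++; }
static void kernel_neon_end(void) { }

/* Placeholder for a NEON routine: a plain byte sum (assumption). */
static unsigned int process(unsigned int acc, const unsigned char *src,
			    size_t len)
{
	while (len--)
		acc += *src++;
	return acc;
}

/*
 * Old shape: leave and re-enter the FPSIMD section after every 4 KiB.
 * As in the removed do/while loops, the body runs once even for len == 0,
 * so callers are assumed to pass a non-zero length.
 */
unsigned int update_chunked(const unsigned char *src, size_t len)
{
	unsigned int acc = 0;

	do {
		size_t todo = len < SZ_4K ? len : SZ_4K;

		kernel_neon_begin();
		acc = process(acc, src, todo);
		kernel_neon_end();

		src += todo;
		len -= todo;
	} while (len);

	return acc;
}

/* New shape: one section covering the whole input. */
unsigned int update_single(const unsigned char *src, size_t len)
{
	unsigned int acc;

	kernel_neon_begin();
	acc = process(0, src, len);
	kernel_neon_end();

	return acc;
}

/* Both shapes must agree on the result; only the section count differs. */
void check_equivalence(const unsigned char *src, size_t len)
{
	assert(update_chunked(src, len) == update_single(src, len));
}
```

In this sketch a 10000-byte input makes update_chunked() enter three sections (4096 + 4096 + 1808 bytes), while update_single() enters one. The counter only illustrates the unconditional begin/end pairs that the commit message describes as overhead; it says nothing about actual scheduling behaviour in the kernel.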