Date: Mon, 27 Nov 2023 13:23:00 +0100
Message-ID: <20231127122259.2265164-7-ardb@google.com>
Subject: [PATCH v3 0/5] arm64: Run kernel mode NEON with preemption enabled
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
    Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
    Eric Biggers, Sebastian Andrzej Siewior

From: Ard Biesheuvel

Currently, kernel mode NEON (SIMD) support is implemented in a way that
requires preemption to be disabled while the SIMD registers are live.
The reason for this is that those registers are not in the set that is
preserved/restored on exception entry/exit and context switch, as this
would impact performance generally, even for workloads where kernel mode
SIMD is not the bottleneck.

However, doing substantial work with preemption disabled is not great,
as it affects scheduling latency, which is especially problematic for
real-time use cases. So ideally, we should keep preemption enabled when
we can, and find another way to ensure that this does not corrupt the
NEON register state of in-kernel SIMD users.

This series implements a suggestion by Mark Rutland, and introduces a
thread_info flag TIF_USING_KMODE_FPSIMD, which indicates to the thread
switch machinery that the task in question has live kernel mode SIMD
state which needs to be preserved and restored. The space needed for
this is allocated in thread_struct. (*)

Given that currently, we run kernel mode NEON with softirqs disabled (to
avoid the need for preserving kernel mode NEON context belonging to task
context while the SIMD unit is being used by code running in softirq
context), just removing the preempt_disable/enable calls is not
sufficient, and we also need to leave softirqs enabled. This means that
we may need to preserve kernel mode NEON state not only on a context
switch, but also when code running in softirq context takes ownership of
the SIMD unit, but this is straight-forward once we add the scratch
space to thread_struct.

(On PREEMPT_RT, softirqs execute with preemption enabled, making kernel
mode FPSIMD in softirq context preemptible as well. We rely on the fact
that the task that hosts the softirq dispatch logic does not itself use
kernel mode FPSIMD in task context to ensure that there is only a single
kernel mode FPSIMD state that may need to be preserved and restored.)

(*) We might decide to allocate this space (~512 bytes) dynamically, if
the thread_struct memory footprint causes issues. However, we should
also explore doing the same for the user space FPSIMD state, as kernel
threads never return to user space and have no need for this allocation.

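For readers who want a feel for the scheme described above, here is a
user space toy model, not the code from this series. It assumes a single
CPU with one shared register file. All names (struct task,
using_kmode_simd, kmode_state, context_switch, softirq_simd_begin/end)
are made up for illustration; the bool stands in for
TIF_USING_KMODE_FPSIMD and the embedded buffer for the scratch space in
thread_struct. The model restores eagerly, which is a simplification.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_VREGS 32

/* Toy stand-in for the SIMD register file: 32 x 128-bit (512 bytes). */
struct simd_regs {
	unsigned char v[NUM_VREGS][16];
};

/* Hypothetical task model; these are not the kernel's field names. */
struct task {
	bool using_kmode_simd;        /* models TIF_USING_KMODE_FPSIMD */
	struct simd_regs kmode_state; /* models scratch space in thread_struct */
};

/* The single "hardware" register file shared by everything on the CPU. */
static struct simd_regs cpu_regs;

static void kmode_simd_begin(struct task *t) { t->using_kmode_simd = true; }
static void kmode_simd_end(struct task *t)   { t->using_kmode_simd = false; }

/* Save the outgoing task's live state, restore the incoming task's. */
static void context_switch(struct task *prev, struct task *next)
{
	if (prev->using_kmode_simd)
		prev->kmode_state = cpu_regs;
	if (next->using_kmode_simd)
		cpu_regs = next->kmode_state;
}

/* Softirq code takes over the SIMD unit from the interrupted task. */
static void softirq_simd_begin(struct task *cur)
{
	if (cur->using_kmode_simd)
		cur->kmode_state = cpu_regs;
}

/* Eager restore on exit; a simplification made for this model only. */
static void softirq_simd_end(struct task *cur)
{
	if (cur->using_kmode_simd)
		cpu_regs = cur->kmode_state;
}

/* Task A is preempted mid-use by task B, which clobbers v0. */
static int demo_context_switch(void)
{
	struct task a = { 0 }, b = { 0 };

	kmode_simd_begin(&a);
	memset(cpu_regs.v[0], 0xAA, 16);  /* A has live data in v0 */

	context_switch(&a, &b);           /* A is preempted */
	kmode_simd_begin(&b);
	memset(cpu_regs.v[0], 0xBB, 16);  /* B overwrites v0 */
	kmode_simd_end(&b);

	context_switch(&b, &a);           /* A resumes */
	kmode_simd_end(&a);
	return cpu_regs.v[0][0];
}

/* A softirq borrows the SIMD unit while task A has live state. */
static int demo_softirq(void)
{
	struct task a = { 0 };

	kmode_simd_begin(&a);
	memset(cpu_regs.v[0], 0xAA, 16);

	softirq_simd_begin(&a);
	memset(cpu_regs.v[0], 0xCC, 16);  /* softirq overwrites v0 */
	softirq_simd_end(&a);

	kmode_simd_end(&a);
	return cpu_regs.v[0][0];
}
```

Both demo functions return 0xAA, i.e. the interrupted task sees its own
v0 contents again. Dropping the save in context_switch() or
softirq_simd_begin() makes them return 0xBB or 0xCC instead, which is
the corruption that disabling preemption and softirqs guards against
today.
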
v3:
- add patch to drop yield logic from crypto C glue code
- add R-b from Mark

v2:
- tweak some commit logs for clarity
- integrate with the existing lazy restore logic
- add Mark's R-b to patch #1

Cc: Marc Zyngier
Cc: Will Deacon
Cc: Mark Rutland
Cc: Kees Cook
Cc: Catalin Marinas
Cc: Mark Brown
Cc: Eric Biggers
Cc: Sebastian Andrzej Siewior

Ard Biesheuvel (5):
  arm64: fpsimd: Drop unneeded 'busy' flag
  arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
  arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
  arm64: crypto: Remove conditional yield logic
  arm64: crypto: Remove FPSIMD yield logic from glue code

 arch/arm64/crypto/aes-ce-ccm-glue.c      |   5 -
 arch/arm64/crypto/aes-glue.c             |  21 +--
 arch/arm64/crypto/aes-modes.S            |   2 -
 arch/arm64/crypto/chacha-neon-glue.c     |  14 +-
 arch/arm64/crypto/crct10dif-ce-glue.c    |  30 +---
 arch/arm64/crypto/nhpoly1305-neon-glue.c |  12 +-
 arch/arm64/crypto/poly1305-glue.c        |  15 +-
 arch/arm64/crypto/polyval-ce-glue.c      |   5 +-
 arch/arm64/crypto/sha1-ce-core.S         |   6 +-
 arch/arm64/crypto/sha1-ce-glue.c         |  19 +--
 arch/arm64/crypto/sha2-ce-core.S         |   6 +-
 arch/arm64/crypto/sha2-ce-glue.c         |  19 +--
 arch/arm64/crypto/sha3-ce-core.S         |   6 +-
 arch/arm64/crypto/sha3-ce-glue.c         |  14 +-
 arch/arm64/crypto/sha512-ce-core.S       |   8 +-
 arch/arm64/crypto/sha512-ce-glue.c       |  16 +-
 arch/arm64/include/asm/assembler.h       |  29 ----
 arch/arm64/include/asm/processor.h       |   3 +
 arch/arm64/include/asm/simd.h            |  11 +-
 arch/arm64/include/asm/thread_info.h     |   1 +
 arch/arm64/kernel/asm-offsets.c          |   4 -
 arch/arm64/kernel/fpsimd.c               | 163 +++++++++++++-------
 22 files changed, 165 insertions(+), 244 deletions(-)

-- 
2.43.0.rc1.413.gea7ed67945-goog