From mboxrd@z Thu Jan 1 00:00:00 1970
From: Saravanakrishnan Krishnamoorthy
To: Albert Ou, Alex Ousherovitch, Conor Dooley, "David S. Miller",
 Herbert Xu, Jonathan Corbet, Krzysztof Kozlowski, Palmer Dabbelt,
 Paul Walmsley, Rob Herring, Saravanakrishnan Krishnamoorthy, Shuah Khan
Cc: Alexandre Ghiti, devicetree@vger.kernel.org, Joel Wittenauer,
 linux-api@vger.kernel.org, linux-crypto@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-riscv@lists.infradead.org,
 Shuah Khan, sipsupport@rambus.com, Thi Nguyen
Subject: [PATCH 02/19] crypto: cmh - add core platform driver
Date: Thu, 25 Jun 2026 10:33:10 -0700
Message-ID: <20260625173328.1140487-3-skrishnamoorthy@rambus.com>
X-Mailer: git-send-email 2.43.7
In-Reply-To: <20260625173328.1140487-1-skrishnamoorthy@rambus.com>
References: <20260625173328.1140487-1-skrishnamoorthy@rambus.com>
X-Mailing-List: linux-api@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain

From: Alex Ousherovitch

Add the CRI CryptoManager Hub (CMH) hardware crypto accelerator core
platform driver. This patch provides:

- Platform driver registration and probe/remove lifecycle
- Hardware configuration and core discovery
- Mailbox Queue Interface (MQI) for VCQ command submission
- Transaction manager with async completion and backlog support
- Result handler with threaded-IRQ completion
- DMA buffer management
- Debugfs instrumentation (when CONFIG_CRYPTO_DEV_CMH_DEBUG=y)
- Sysfs attributes (fw_version, hw_version, boot_status,
  mbx_available, mbx_count)
- Kconfig and Makefile integration

No crypto algorithms are registered yet -- those follow in subsequent
patches.

The driver communicates with the hardware via a mailbox-based VCQ
(Virtual Command Queue) interface. Each crypto operation is packed into
VCQ command entries, submitted to a mailbox, and completed
asynchronously via interrupt.

MODULE_IMPORT_NS(CRYPTO_INTERNAL) imports two symbols:

- crypto_cipher_setkey() [crypto/cipher.c, EXPORT_SYMBOL_NS_GPL]
- crypto_cipher_encrypt_one() [crypto/cipher.c, EXPORT_SYMBOL_NS_GPL]

Together these form the single-block cipher API used for the
software-fallback paths: CCM empty-input tag computation (2 ECB
encryptions + XOR) and the XCBC(SM4) empty-message workaround (3 ECB
encryptions + XOR). No public wrapper exists; this is the same pattern
used by in-tree crypto/ccm.c, crypto/cmac.c, and crypto/xcbc.c.

Co-developed-by: Saravanakrishnan Krishnamoorthy
Signed-off-by: Saravanakrishnan Krishnamoorthy
Signed-off-by: Alex Ousherovitch
Reviewed-by: Joel Wittenauer
Reviewed-by: Thi Nguyen
---
 Documentation/ABI/testing/debugfs-driver-cmh | 155 ++
 Documentation/ABI/testing/sysfs-driver-cmh | 66 +
 Documentation/crypto/device_drivers/cmh.rst | 356 +++
 Documentation/crypto/device_drivers/index.rst | 1 +
 drivers/crypto/Kconfig | 1 +
 drivers/crypto/Makefile | 1 +
 drivers/crypto/cmh/Kconfig | 46 +
 drivers/crypto/cmh/Makefile | 25 +
 drivers/crypto/cmh/cmh_config.c | 476 ++++
 drivers/crypto/cmh/cmh_debugfs.c | 286 +++
 drivers/crypto/cmh/cmh_dma.c | 373 ++++
 drivers/crypto/cmh/cmh_main.c | 365 +++
 drivers/crypto/cmh/cmh_mqi.c | 355 +++
 drivers/crypto/cmh/cmh_rh.c | 1145 ++++++++++
 drivers/crypto/cmh/cmh_sysfs.c | 108 +
 drivers/crypto/cmh/cmh_txn.c | 1978 +++++++++++++++++
 drivers/crypto/cmh/include/cmh.h | 27 +
 drivers/crypto/cmh/include/cmh_aes_abi.h | 97 +
 drivers/crypto/cmh/include/cmh_ccp_abi.h | 108 +
 drivers/crypto/cmh/include/cmh_config.h | 91 +
 drivers/crypto/cmh/include/cmh_debugfs.h | 90 +
 drivers/crypto/cmh/include/cmh_dma.h | 219 ++
 drivers/crypto/cmh/include/cmh_drbg_abi.h | 67 +
 drivers/crypto/cmh/include/cmh_eac_abi.h | 44 +
 drivers/crypto/cmh/include/cmh_hc_abi.h | 162 ++
 drivers/crypto/cmh/include/cmh_hcq_abi.h | 221 ++
 drivers/crypto/cmh/include/cmh_kic_abi.h | 77 +
 drivers/crypto/cmh/include/cmh_mqi.h | 36 +
 drivers/crypto/cmh/include/cmh_pke_abi.h | 272 +++
 drivers/crypto/cmh/include/cmh_qse_abi.h | 181 ++
 drivers/crypto/cmh/include/cmh_registers.h | 145 ++
 drivers/crypto/cmh/include/cmh_rh.h | 93 +
 drivers/crypto/cmh/include/cmh_rng.h | 31 +
 drivers/crypto/cmh/include/cmh_sm3_abi.h | 79 +
 drivers/crypto/cmh/include/cmh_sm4_abi.h | 101 +
 drivers/crypto/cmh/include/cmh_sys_abi.h | 148 ++
 drivers/crypto/cmh/include/cmh_sysfs.h | 14 +
 drivers/crypto/cmh/include/cmh_txn.h | 463 ++++
 drivers/crypto/cmh/include/cmh_vcq.h | 283 +++
 39 files changed, 8786 insertions(+)
 create mode 100644 Documentation/ABI/testing/debugfs-driver-cmh
 create mode 100644 Documentation/ABI/testing/sysfs-driver-cmh
 create mode 100644 Documentation/crypto/device_drivers/cmh.rst
 create mode 100644 drivers/crypto/cmh/Kconfig
 create mode 100644 drivers/crypto/cmh/Makefile
 create mode 100644 drivers/crypto/cmh/cmh_config.c
 create mode 100644 drivers/crypto/cmh/cmh_debugfs.c
 create mode 100644 drivers/crypto/cmh/cmh_dma.c
 create mode 100644 drivers/crypto/cmh/cmh_main.c
 create mode 100644 drivers/crypto/cmh/cmh_mqi.c
 create mode 100644 drivers/crypto/cmh/cmh_rh.c
 create mode 100644 drivers/crypto/cmh/cmh_sysfs.c
 create mode 100644 drivers/crypto/cmh/cmh_txn.c
 create mode 100644 drivers/crypto/cmh/include/cmh.h
 create mode 100644 drivers/crypto/cmh/include/cmh_aes_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_ccp_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_config.h
 create mode 100644 drivers/crypto/cmh/include/cmh_debugfs.h
 create mode 100644 drivers/crypto/cmh/include/cmh_dma.h
 create mode 100644 drivers/crypto/cmh/include/cmh_drbg_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_eac_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_hc_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_hcq_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_kic_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_mqi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_pke_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_qse_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_registers.h
 create mode 100644 drivers/crypto/cmh/include/cmh_rh.h
 create mode 100644 drivers/crypto/cmh/include/cmh_rng.h
 create mode 100644 drivers/crypto/cmh/include/cmh_sm3_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_sm4_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_sys_abi.h
 create mode 100644 drivers/crypto/cmh/include/cmh_sysfs.h
 create mode 100644 drivers/crypto/cmh/include/cmh_txn.h
 create mode 100644 drivers/crypto/cmh/include/cmh_vcq.h

diff --git a/Documentation/ABI/testing/debugfs-driver-cmh b/Documentation/ABI/testing/debugfs-driver-cmh
new file mode 100644
index 000000000000..3bbf903a4511
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-driver-cmh
@@ -0,0 +1,155 @@
+What:		/sys/kernel/debug/cmh/mbx/vcqs_submitted
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Total number of VCQ command entries submitted to
+		mailbox N since the driver was loaded.
+
+What:		/sys/kernel/debug/cmh/mbx/vcqs_completed
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Total number of VCQ command completions received
+		from mailbox N.
+
+What:		/sys/kernel/debug/cmh/mbx/vcqs_errors
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Total number of error completions received from
+		mailbox N.
+
+What:		/sys/kernel/debug/cmh/mbx/queue_full_count
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Number of times the transaction manager skipped
+		mailbox N because its in-flight queue was full.
+
+What:		/sys/kernel/debug/cmh/mbx/max_queue_depth
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) High-water mark of in-flight transactions on
+		mailbox N.
+
+What:		/sys/kernel/debug/cmh/mbx/inject_abort
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(WO) Write any value to inject an MBX_COMMAND_ABORT on
+		mailbox N. The abort triggers error-IRQ handling that
+		completes all in-flight transactions with -EIO and then
+		issues MBX_COMMAND_RESTART to resume the mailbox.
+		Only available when CONFIG_CRYPTO_DEV_CMH_DEBUG is enabled.
+
+What:		/sys/kernel/debug/cmh/mbx/force_drain
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(WO) Write any value to unconditionally FLUSH and drain
+		all pending transactions on mailbox N, completing each
+		with -ECANCELED, and reset all recovery bookkeeping
+		(including the wedged flag). The mailbox is re-enabled
+		for new work immediately; no hardware health verification
+		is performed. Use as a last-resort recovery when the eSW
+		is unresponsive and normal ABORT/RESTART escalation has
+		not recovered the mailbox.
+		Only available when CONFIG_CRYPTO_DEV_CMH_DEBUG is enabled.
+
+What:		/sys/kernel/debug/cmh/tm/cmq_posts
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Total number of cmh_tm_post_command() calls (one
+		per crypto request submitted to the transaction manager).
+
+What:		/sys/kernel/debug/cmh/tm/cmq_depth_max
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) High-water mark of the command queue length.
+
+What:		/sys/kernel/debug/cmh/tm/cmq_eagain_count
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Number of times the command queue was full and
+		returned -EAGAIN to the caller.
+
+What:		/sys/kernel/debug/cmh/tm/backoff_count
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Number of times the transaction manager backed off
+		because all mailbox queues were full.
+
+What:		/sys/kernel/debug/cmh/tm/async_timeout_count
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RO) Number of async crypto requests that timed out
+		waiting for hardware completion.
+
+What:		/sys/kernel/debug/cmh/config/async_timeout_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) Async request timeout in milliseconds. On timeout
+		the driver issues MBX_COMMAND_ABORT; if the eSW is
+		unresponsive, the watchdog escalates through RESTART,
+		FLUSH, and force-drain to bound D-state duration.
+
+What:		/sys/kernel/debug/cmh/config/vcq_timeout_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) VCQ command timeout in milliseconds.
+
+What:		/sys/kernel/debug/cmh/config/slow_op_timeout_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) Slow-operation timeout in milliseconds. Used for
+		operations known to take longer (e.g. RSA key generation,
+		PQC key generation).
+
+What:		/sys/kernel/debug/cmh/config/drain_timeout_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) Drain timeout in milliseconds. Maximum time to wait
+		for all in-flight transactions to complete during driver
+		removal or suspend.
+
+What:		/sys/kernel/debug/cmh/config/watchdog_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) Result-handler watchdog interval in milliseconds.
+		Detects missed IRQs, stuck mailboxes, and abort-stall
+		conditions. Clamped to a 10 ms minimum.
+
+What:		/sys/kernel/debug/cmh/config/drbg_timeout_ms
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		(RW) DRBG self-seed timeout in milliseconds.
diff --git a/Documentation/ABI/testing/sysfs-driver-cmh b/Documentation/ABI/testing/sysfs-driver-cmh
new file mode 100644
index 000000000000..62e593fac6fe
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-cmh
@@ -0,0 +1,66 @@
+What:		/sys/devices/platform//fw_version
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Reports the CryptoManager Hub embedded software (eSW) firmware
+		version as a 32-bit hexadecimal value read from the SIC
+		SW_VERSION register.
+
+		Example: "0x00010002"
+
+		Read-only.
+
+What:		/sys/devices/platform//hw_version
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Reports the CryptoManager Hub hardware version as a 32-bit
+		hexadecimal value read from the SIC HW_VERSION0 register.
+
+		Example: "0x00000000"
+
+		Read-only.
+
+What:		/sys/devices/platform//boot_status
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Reports the CryptoManager Hub boot status register as a 32-bit
+		hexadecimal value. This reflects the firmware boot
+		progress and final state:
+
+		  0x00000066 - firmware booted (post-self-test)
+		  other      - firmware boot in progress or failed
+
+		Read-only.
+
+What:		/sys/devices/platform//mbx_available
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Reports the mailbox availability bitmap as a 32-bit
+		hexadecimal value read from the SIC MBX_AVAILABILITY
+		register. Each set bit indicates a hardware mailbox
+		instance that the firmware has made available.
+
+		Example: "0x00000003" (mailboxes 0 and 1 available)
+
+		Read-only.
+
+What:		/sys/devices/platform//mbx_count
+Date:		June 2026
+KernelVersion:	7.1
+Contact:	linux-crypto@vger.kernel.org
+Description:
+		Reports the number of mailboxes the driver has configured,
+		as a decimal integer. This reflects the driver's active
+		configuration (from DT properties or module parameters),
+		which may be fewer than indicated by mbx_available.
+
+		Example: "2"
+
+		Read-only.
diff --git a/Documentation/crypto/device_drivers/cmh.rst b/Documentation/crypto/device_drivers/cmh.rst
new file mode 100644
index 000000000000..4319b9ff1ab1
--- /dev/null
+++ b/Documentation/crypto/device_drivers/cmh.rst
@@ -0,0 +1,356 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+CRI CryptoManager Hub (CMH) Driver
+====================================
+
+Overview
+========
+
+The ``cmh`` driver supports the CRI CryptoManager Hub hardware cryptographic
+accelerator. The hardware is accessed through a mailbox-based VCQ
+(Virtual Command Queue) interface: the driver writes command sequences
+into per-mailbox DMA queue buffers and rings a doorbell register; the
+CryptoManager Hub embedded software (eSW) processes the commands and signals
+completion via a per-mailbox interrupt.
+
+The driver registers algorithms with the Linux kernel crypto subsystem
+and exposes a management character device (``/dev/cmh_mgmt``) for
+operations that have no standard crypto API binding.
+
+Hardware Interface
+==================
+
+The CryptoManager Hub is presented as a platform device matched via Device Tree
+(compatible ``"cri,cmh"``). The driver maps a single MMIO region
+(the SIC -- System Interface Controller) whose sub-regions contain
+per-mailbox doorbell, status, and command queue registers.
+
+The driver manages a configurable number of mailboxes (default 2).
+Each mailbox has a configurable number of slots (default 32) and a
+configurable stride (default 128 bytes per slot). The driver allocates
+DMA-coherent memory for each mailbox queue during probe.
+
+Interrupts are per-mailbox completion/error interrupts. The driver
+registers a threaded IRQ handler for each configured mailbox.
+
+The eSW is loaded independently of this driver -- typically by the
+boot firmware or a platform-specific loader -- so the driver does not
+use ``request_firmware()``. Instead it waits for the eSW to reach
+mission mode during probe, bounded by ``fw_ready_timeout_ms``.
+
+Supported Algorithms
+====================
+
+The driver registers the following algorithm families:
+
+Hash (ahash)
+  SHA-224, SHA-256, SHA-384, SHA-512, SHA3-224, SHA3-256, SHA3-384,
+  SHA3-512, SHAKE-128, SHAKE-256, cSHAKE-128, cSHAKE-256, KMAC-128,
+  KMAC-256, SM3 (10 hash + 2 cSHAKE + 2 KMAC + 1 SM3 = 15 algorithms)
+
+HMAC (ahash)
+  HMAC-SHA-224, HMAC-SHA-256, HMAC-SHA-384, HMAC-SHA-512,
+  HMAC-SHA3-224, HMAC-SHA3-256, HMAC-SHA3-384, HMAC-SHA3-512
+  (8 algorithms)
+
+Symmetric Ciphers (skcipher)
+  AES: ECB, CBC, CTR, CFB, XTS (5 algorithms)
+  SM4: ECB, CBC, CTR, CFB, XTS (5 algorithms)
+  ChaCha20 (1 algorithm)
+
+AEAD
+  AES-GCM, AES-CCM (2 algorithms)
+  SM4-GCM, SM4-CCM (2 algorithms)
+  ``rfc7539(chacha20,poly1305)``, ``rfc7539esp(chacha20,poly1305)``
+  (2 algorithms)
+
+MAC (ahash)
+  CMAC(AES) (1 algorithm)
+  CMAC(SM4), XCBC(SM4) (2 algorithms)
+  Poly1305 (1 algorithm)
+
+Public-Key, Key Agreement, and PQC Signatures
+  RSA (akcipher, 1 algorithm)
+  ECDSA P-256, P-384, P-521 (sig, 3 algorithms)
+  SM2 (sig, verify-only, 1 algorithm)
+  ECDH P-256, P-384, X25519 (kpp, 3 algorithms)
+  ML-DSA-44, ML-DSA-65, ML-DSA-87 (sig, 3 algorithms)
+  SLH-DSA: all 12 parameter sets (sig, 12 algorithms)
+  LMS, LMS-HSS (sig, verify-only, 2 algorithms)
+  XMSS, XMSS-MT (sig, verify-only, 2 algorithms)
+  (ML-KEM keygen/encaps/decaps is available via ``/dev/cmh_mgmt``
+  only -- see `Limitations`_.)
+
+Hardware RNG
+  DRBG-backed hwrng (``/dev/hwrng``, 1 algorithm)
+
+All algorithm driver names use the ``cri-cmh-`` prefix (e.g.
+``cri-cmh-sha256``, ``cri-cmh-ecb-aes``, ``cri-cmh-gcm-aes``,
+``cri-cmh-mldsa44``). Names generally follow the kernel's hyphenated
+template name; families that have no kernel template (e.g. ML-DSA) use
+the concatenated upstream algorithm name (``mldsa44``).
+
+Most algorithms register at priority 300 (301 for AES-CCM).
+The ML-DSA ``sig`` algorithms register at priority 5001 to
+outrank the kernel's generic software ML-DSA (priority 5000, which is
+verify-only); the CMH driver provides full hardware sign and verify.
+
+Request model
+-------------
+
+All crypto API operations are asynchronous: the driver queues each
+request to its transaction-manager kthread and returns
+``-EINPROGRESS``, invoking the caller's completion callback when the
+hardware finishes. Requests that set ``CRYPTO_TFM_REQ_MAY_BACKLOG``
+are queued on a backlog of up to ``backlog_max_depth`` entries when the
+command queue is full; without that flag a full queue is reported as
+``-EBUSY``. Hardware or eSW failures surface as ``-EIO``, malformed
+requests as ``-EINVAL``, oversized requests as ``-EMSGSIZE`` or
+``-EINVAL`` (see `Data-Size Limits`_), and unresponsive hardware as
+``-ETIMEDOUT``. The ``/dev/cmh_mgmt`` ioctls are, by contrast,
+synchronous -- each ioctl blocks until the hardware completes.
+
+Driver Architecture
+===================
+
+The driver is structured as follows:
+
+Platform Driver
+  Matches DT compatible ``"cri,cmh"``. Probe initializes all
+  subsystems in order; remove tears them down in reverse.
+
+Configuration
+  Parses DT properties and module parameter overrides. Validates
+  mailbox counts, slot sizes, and stride values.
+
+MQI (Mailbox Queue Interface)
+  Allocates DMA-coherent queue memory per mailbox. Manages slot
+  allocation, VCQ command writing, and doorbell ringing.
+
+Transaction Manager
+  A dedicated kthread dequeues crypto requests from a central command
+  queue, builds VCQ command sequences, and submits them to mailbox
+  slots. Completion is signaled via wait queues.
+
+Result Handler
+  Per-mailbox threaded IRQ handlers walk completed slots, parse
+  results, and fire request completions. A configurable watchdog
+  timer (the ``watchdog_ms`` debugfs knob, default 200 ms) detects
+  stuck requests and escalates through ABORT, RESTART, and FLUSH
+  recovery.
+
+Key Management (``/dev/cmh_mgmt``)
+  A misc character device providing ioctl-based access to datastore
+  key CRUD, key derivation (KIC), PKE operations (EdDSA, SM2),
+  PQC operations (ML-KEM, ML-DSA, SLH-DSA),
+  EAC error register readback, and DRBG runtime configuration.
+  See ``Documentation/ABI/testing/cmh-mgmt`` for the full ioctl list.
+
+Power Management
+  The driver implements suspend/resume via ``DEFINE_SIMPLE_DEV_PM_OPS``.
+  On suspend, the transaction-manager kthread is stopped and pending
+  transactions are drained, waiting up to ``drain_timeout_ms``
+  (default 10000 ms); resume restarts the kthread.
+
+Module Parameters
+=================
+
+The driver defines four production module parameters and five
+debug-only parameters (compiled only with
+``CONFIG_CRYPTO_DEV_CMH_DEBUG``). In production, all mailbox topology,
+per-core affinity, slot counts, strides, and timeout tuning are taken
+from Device Tree properties, not module parameters. The debug-only
+parameters exist solely to force alternate geometries at ``insmod``
+time during bringup and validation (for example, to drive the
+mailbox-contention and cross-mailbox dispatch paths without
+rebuilding the Device Tree); they default to "use the DT value"
+and have no effect in a production build.
+
+Production:
+
+``fw_ready_timeout_ms`` (uint, default 5000, RO)
+  Timeout in milliseconds to wait for CMH eSW to reach mission mode
+  during probe.
+
+``cmq_max_depth`` (uint, default 256, RO)
+  Maximum number of pending commands in the central Command Message
+  Queue.
+
+``backlog_max_depth`` (uint, default 1024, RO)
+  Maximum depth of the backlog queue for ``CRYPTO_TFM_REQ_MAY_BACKLOG``
+  requests. Set to 0 to disable backlogs.
+
+``hwrng_quality`` (int, default 0, RO)
+  Quality value passed to ``hwrng_register()``. 0 disables kernel CRNG
+  seeding; 1-1024 sets the quality directly.
+
+Debug-only (``CONFIG_CRYPTO_DEV_CMH_DEBUG``):
+
+``mbx_count_override`` (uint, default 0, RO)
+  Override the DT mailbox count (0 = use DT) to force fewer
+  mailboxes than the hardware provides.
+
+``mbx_slots_override`` (uint, default 0, RO)
+  Override all MBX slots_log2 values (0 = use DT).
+
+``mbx_round_robin`` (bool, default false, RO)
+  Ignore DT ``cri,mbx`` affinity pins and round-robin all cores
+  across the configured mailboxes (0 = use DT affinity). Restores
+  the unpinned dispatch that exercises cross-mailbox distribution.
+
+``drbg_config`` (charp, default "auto", RO)
+  DRBG configuration at probe: ``"auto"`` (normal) or ``"skip"``
+  (skip initial DRBG configuration).
+
+``skip_fw_check`` (bool, default false, RO)
+  Skip the SIC boot status and eSW mission-mode checks at probe.
+  Allows the module to load before the eSW has booted.
+
+Runtime-tunable timeout knobs are exposed via debugfs rather than
+module parameters; see `debugfs Counters`_ below.
+
+sysfs Attributes
+================
+
+The driver exposes five read-only attributes under the platform
+device sysfs directory: ``fw_version``, ``hw_version``,
+``boot_status``, ``mbx_available``, and ``mbx_count``. See
+``Documentation/ABI/testing/sysfs-driver-cmh`` for the authoritative
+per-attribute description.
+
+debugfs Counters
+================
+
+When built with ``CONFIG_CRYPTO_DEV_CMH_DEBUG``, the driver creates
+``/sys/kernel/debug/cmh/`` with three groups: per-mailbox counters
+(``mbxN/``), transaction-manager statistics (``tm/``), and
+runtime-tunable timeout knobs (``config/``, including
+``drain_timeout_ms`` and ``watchdog_ms``). See
+``Documentation/ABI/testing/debugfs-driver-cmh`` for the authoritative
+per-file description.
+ +Device Tree Binding +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +See ``Documentation/devicetree/bindings/crypto/cri,cmh.yaml`` for the +full DT binding schema and complete, schema-validated examples +(including the per-mailbox topology properties ``cri,mbx-instances``, +``cri,mbx-slots-log2``, and ``cri,mbx-strides-log2``). When those +properties are omitted the driver falls back to two mailboxes +(instances 0 and 1) with the slot/stride defaults described above. + +User-Space Interfaces +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +``/dev/cmh_mgmt`` + Management character device. Opening it requires ``CAP_SYS_ADMIN``. + See ``Documentation/ABI/testing/cmh-mgmt`` for ioctl documentation. + The UAPI header is ````. + +In-kernel crypto API + All algorithms register with the standard kernel crypto API and are + consumed by in-kernel users (dm-crypt, fscrypt, IPsec, kTLS, etc.). + + Keys provisioned inside the hardware via ``/dev/cmh_mgmt`` are + referenced by an opaque hardware key identifier and are operated on + through the ``/dev/cmh_mgmt`` ioctl interface, without ever exposing + plaintext key material to user space. See + ``Documentation/ABI/testing/cmh-mgmt`` for key provisioning. + +``/dev/hwrng`` + The DRBG-backed hardware RNG is available as a standard hwrng device. + +Limitations +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +- LMS and XMSS support verify-only (no sign/keygen in hardware for + stateful hash-based signatures). +- SM2 sig registration is verify-only (sign via ``/dev/cmh_mgmt`` ioctl). +- EdDSA (Ed25519/Ed448) is available only through ``/dev/cmh_mgmt`` + ioctls; no kernel ``sig`` registration. +- ML-KEM operations (encapsulate/decapsulate/keygen) are available only + through ``/dev/cmh_mgmt`` ioctls; no standard kernel crypto API + binding exists for KEM. + +Data-Size Limits +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The driver imposes data-size limits on several APIs. 
These are +driver-level safety caps for kernel memory allocation unless noted +otherwise. + +Symmetric / AEAD / MAC linearization caps: + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D +Scope Limit Origin +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D +AES skcipher 32 MiB Driver-imposed DMA linearization = cap +SM4 skcipher 32 MiB Driver-imposed DMA linearization = cap +All AEAD + ChaCha20 skcipher 1 MiB Driver-imposed DMA linearization = cap +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D + +MAC and keyed-hash algorithms that buffer all input in kernel memory +(hardware lacks context save/restore): + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D= =3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Algorithm Limit Reason +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D= =3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +``cmac(aes)`` 64 KiB AES core has no external save/restore +``cmac(sm4)`` 64 KiB SM4 core has no external save/restore +``xcbc(sm4)`` 64 KiB SM4 core has no external save/restore +``poly1305`` 64 KiB CCP core has no external save/restore +``hmac(sha*)`` 64 KiB HMAC save/restore not supported (see below) +``hmac(sha3-*)`` 64 KiB HMAC 
save/restore not supported (see below) +``kmac128`` 64 KiB eSW rejects save when outlen !=3D 0 +``kmac256`` 64 KiB eSW rejects save when outlen !=3D 0 +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D= =3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +HMAC save/restore is unsupported by the eSW firmware. For HMAC-SHA3, +exposing the Keccak sponge state would allow key recovery because the +sponge permutation is invertible; HMAC-SHA2 save/restore is likewise +not exposed by the eSW. + +HMAC ``.export()``/``.import()`` (used for request cloning) is limited +to a single-page accumulated-data window of 4092 bytes (one page minus +a 4-byte length header), since the crypto subsystem pre-allocates the +state buffer per request. Cloning a request that has accumulated more +input than this window fails. + +Requests exceeding the limit are rejected with ``-EINVAL``. Pure hash +algorithms (SHA-2, SHA-3, SHAKE, cSHAKE, SM3) have no data limit because +the hardware supports incremental save/restore. + +cSHAKE uses save/restore for ``.export()``/``.import()`` but accumulates +data in ``.update()`` by design (the Keccak sponge has no block-alignment +boundary to trigger per-update HW submission, and HC_CMD_GATHER amortizes +the cost into a single finalize-time submission). 
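The two buffering limits above can be sketched as simple bounds checks. This is an illustration only, not the driver's code: the macro and function names are hypothetical, the page size is assumed to be 4096 bytes (which gives the 4092-byte window quoted above), and the text does not say which error code a failed export returns, so the export check reports a plain yes/no.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical names; the values come from the limits described above. */
#define SKETCH_PAGE_SIZE      4096u	/* assumption: 4 KiB pages */
#define SKETCH_EXPORT_HDR     4u	/* 4-byte length header */
#define SKETCH_EXPORT_WINDOW  (SKETCH_PAGE_SIZE - SKETCH_EXPORT_HDR)
#define SKETCH_MAC_BUF_LIMIT  (64u * 1024u)	/* 64 KiB buffering cap */

/*
 * Would appending @incoming bytes to @buffered bytes stay within the
 * 64 KiB cap? Returns 0 if so, -EINVAL otherwise. The comparison is
 * written to avoid overflow in buffered + incoming.
 */
static int sketch_mac_update_check(size_t buffered, size_t incoming)
{
	if (buffered > SKETCH_MAC_BUF_LIMIT ||
	    incoming > SKETCH_MAC_BUF_LIMIT - buffered)
		return -EINVAL;
	return 0;
}

/* Does the accumulated input fit the single-page export window? */
static int sketch_hmac_export_fits(size_t buffered)
{
	return buffered <= SKETCH_EXPORT_WINDOW;
}
```

With these assumptions a request that has accumulated 4092 bytes can still be exported, while one holding 4093 bytes cannot, even though both are far below the 64 KiB buffering cap.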
+ +Asymmetric / PQC algorithm limits: + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D +Scope Limit Origin +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D +RSA key size 4096 bit HW-imposed +ML-DSA message 10 KiB eSW-imposed (QSE ABI) +SLH-DSA message 128 B eSW-imposed (HCQ ABI) +SLH-DSA context 255 B Spec-imposed (FIPS 205) +LMS public key 60 B eSW-imposed (HCQ ABI) +LMS message 256 B eSW-imposed (HCQ ABI) +LMS signature 13,364 B eSW-imposed (HCQ ABI) +XMSS public key 136 B eSW-imposed (HCQ ABI) +XMSS message 64 B eSW-imposed (HCQ ABI) +XMSS signature 27,688 B eSW-imposed (HCQ ABI) +SM2 encrypt message 32 B eSW KDF (single SM3 block) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D + +Miscellaneous limits: + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D +Scope Limit Origin +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D +cSHAKE/KMAC customization 256 B VCQ slot layout constraint +KIC HKDF key 64 B Partially eSW-derived +KIC HKDF label 56 B VCQ slot layout constraint +Key/blob mgmt ioctls 256 KiB Driver-imposed sanity cap 
+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D diff --git a/Documentation/crypto/device_drivers/index.rst b/Documentation/= crypto/device_drivers/index.rst index c81d311ac61b..c0247fc97bf8 100644 --- a/Documentation/crypto/device_drivers/index.rst +++ b/Documentation/crypto/device_drivers/index.rst @@ -6,4 +6,5 @@ Hardware Device Driver Specific Documentation .. toctree:: :maxdepth: 1 + cmh octeontx2 diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index 216a00bad5d7..777c12496bdf 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -839,4 +839,5 @@ source "drivers/crypto/starfive/Kconfig" source "drivers/crypto/inside-secure/eip93/Kconfig" source "drivers/crypto/ti/Kconfig" +source "drivers/crypto/cmh/Kconfig" endif # CRYPTO_HW diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index 5a950c7abc39..7ad69348f0f0 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -47,3 +47,4 @@ obj-y +=3D intel/ obj-y +=3D starfive/ obj-y +=3D cavium/ obj-y +=3D ti/ +obj-$(CONFIG_CRYPTO_DEV_CMH) +=3D cmh/ diff --git a/drivers/crypto/cmh/Kconfig b/drivers/crypto/cmh/Kconfig new file mode 100644 index 000000000000..fa5adeca2512 --- /dev/null +++ b/drivers/crypto/cmh/Kconfig @@ -0,0 +1,46 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# CRI CryptoManager Hub (CMH) hardware crypto accelerator +# + +config CRYPTO_DEV_CMH + tristate "CRI CryptoManager Hub (CMH) hardware crypto accelerator" + depends on CRYPTO && OF && HAS_IOMEM && (64BIT || COMPILE_TEST) + select CRYPTO_HASH + select CRYPTO_SKCIPHER + select CRYPTO_AEAD + select CRYPTO_AKCIPHER + select CRYPTO_SIG + select CRYPTO_KPP + select CRYPTO_ECC + select CRYPTO_RSA + select CRYPTO_AES + select CRYPTO_CCM + select CRYPTO_SM4_GENERIC + select HW_RANDOM + help + Driver for the CRI CryptoManager 
Hub (CMH) hardware crypto accele= rator. + Accesses the hardware via a mailbox-based VCQ (Virtual Command + Queue) interface and registers algorithms with the kernel + crypto subsystem. + + Supported algorithm families: AES (ECB/CBC/CTR/XTS/CFB), + SM4 (ECB/CBC/CTR/XTS/CFB), ChaCha20-Poly1305, AES-GCM, AES-CCM, + SHA-2, SHA-3, SHAKE, CSHAKE, KMAC, SM3, HMAC, AES-CMAC, + SM4-CMAC, SM4-XCBC, RSA, ECDSA, ECDH, SM2, and DRBG (hwrng). + Ioctl-only algorithms: EdDSA, ML-KEM. + + To compile this driver as a module, choose M here. + +config CRYPTO_DEV_CMH_DEBUG + bool "CMH debug instrumentation (debugfs counters)" + depends on CRYPTO_DEV_CMH && DEBUG_FS + help + Enable per-mailbox debugfs counters under + /sys/kernel/debug/cmh/ for the CMH driver. + Exposes VCQ submit/complete/error counts, queue depth + high-water marks, and transaction manager backoff statistics. + + Useful for bringup, validation, and performance analysis. + Not recommended for production. + diff --git a/drivers/crypto/cmh/Makefile b/drivers/crypto/cmh/Makefile new file mode 100644 index 000000000000..0a4591c9fd86 --- /dev/null +++ b/drivers/crypto/cmh/Makefile @@ -0,0 +1,25 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the CRI CryptoManager Hub (CMH) hardware crypto accelerator= driver. +# + +obj-$(CONFIG_CRYPTO_DEV_CMH) +=3D cmh.o + +cmh-y :=3D \ + cmh_main.o \ + cmh_config.o \ + cmh_mqi.o \ + cmh_txn.o \ + cmh_rh.o \ + cmh_dma.o \ + cmh_sysfs.o + +ccflags-y +=3D -I$(src)/include + +# Suppress -Woverride-init for the [0 ... N] =3D -1 range-initializer patt= ern +# (standard kernel idiom for sparse lookup tables with a default value). +CFLAGS_cmh_config.o +=3D -Wno-override-init + +# Debug instrumentation: per-mailbox debugfs counters. +# cmh_debugfs.o is linked into the composite cmh.o (same tristate). 
+cmh-$(CONFIG_CRYPTO_DEV_CMH_DEBUG) +=3D cmh_debugfs.o diff --git a/drivers/crypto/cmh/cmh_config.c b/drivers/crypto/cmh/cmh_confi= g.c new file mode 100644 index 000000000000..4631eebb1556 --- /dev/null +++ b/drivers/crypto/cmh/cmh_config.c @@ -0,0 +1,476 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Configuration from Device Tree + * + * The CMH device tree node provides: + * - reg: SIC base + size (mandatory) + * - interrupts: per-MBX IRQs (mandatory for IRQ mode) + * - cri,mbx-instances: array of MBX instance IDs + * - cri,mbx-slots-log2: per-MBX slot count as log2 + * - cri,mbx-strides-log2: per-MBX stride as log2 + * + * Per-core-type child nodes (e.g. aes@3, pke@a): + * - reg: hardware core ID (CORE_ID_* from cmh_vcq.h) + * - cri,mbx: (optional) pin to a specific MBX index + * + * Module parameters (non-topology): + * - fw_ready_timeout_ms: CMH eSW mission-mode boot timeout + * (hwrng_quality, cmq_max_depth, backlog_max_depth live in other files) + */ + +#include +#include +#include +#include + +#include "cmh_config.h" +#include "cmh_dma.h" + +/* -- Module parameters ------------------------------------------------- = */ + +static unsigned int fw_ready_timeout_ms =3D CMH_DEFAULT_FW_READY_TIMEOUT_M= S; +module_param(fw_ready_timeout_ms, uint, 0444); +MODULE_PARM_DESC(fw_ready_timeout_ms, + "Timeout in ms to wait for CMH eSW mission mode (default 5= 000)"); + +/* + * Debug-only MBX overrides for stress testing. + * When non-zero, these override the corresponding DT values, enabling + * contention stress tests to force a minimal MBX config + * (e.g. mbx_count_override=3D1 mbx_slots_override=3D1 for 1 MBX, 2 slots)= . 
+ */ +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +static unsigned int mbx_count_override; +module_param(mbx_count_override, uint, 0444); +MODULE_PARM_DESC(mbx_count_override, + "[debug] Override DT MBX count (0 =3D use DT, default: 0)"= ); + +static unsigned int mbx_slots_override; +module_param(mbx_slots_override, uint, 0444); +MODULE_PARM_DESC(mbx_slots_override, + "[debug] Override all MBX slots_log2 (0 =3D use DT, defaul= t: 0)"); + +static bool mbx_round_robin; +module_param(mbx_round_robin, bool, 0444); +MODULE_PARM_DESC(mbx_round_robin, + "[debug] Ignore DT cri,mbx pins and round-robin all cores = across MBXes (0 =3D use DT affinity, default: 0)"); +#endif + +/* -- Core ID -> core_type lookup --------------------------------------- = */ + +/* + * Map hardware core IDs (from DT child "reg") to enum cmh_core_type. + * + * Entries set to -1 are not dispatchable crypto cores: system cores + * (SYS, DMA, KIC, TIC, MPU, EMC, EAC) and the DRBG singleton + * (handled separately in cmh_rng.c). + */ +static const int core_id_to_type[CORE_ID_NUM] =3D { + [0 ... 
CORE_ID_NUM - 1] =3D -1, + [CORE_ID_HC] =3D CMH_CORE_HC, + [CORE_ID_AES] =3D CMH_CORE_AES, + [CORE_ID_SM4] =3D CMH_CORE_SM4, + [CORE_ID_SM3] =3D CMH_CORE_SM3, + [CORE_ID_CCP] =3D CMH_CORE_CCP, + [CORE_ID_PKE] =3D CMH_CORE_PKE, + [CORE_ID_QSE] =3D CMH_CORE_QSE, + [CORE_ID_HCQ] =3D CMH_CORE_HCQ, +}; + +/* Human-readable names for error messages */ +static const char * const core_type_names[CMH_NUM_CORE_TYPES] =3D { + [CMH_CORE_HC] =3D "hc", + [CMH_CORE_AES] =3D "aes", + [CMH_CORE_SM4] =3D "sm4", + [CMH_CORE_SM3] =3D "sm3", + [CMH_CORE_CCP] =3D "ccp", + [CMH_CORE_PKE] =3D "pke", + [CMH_CORE_QSE] =3D "qse", + [CMH_CORE_HCQ] =3D "hcq", +}; + +/* -- DT child node enumeration ----------------------------------------- = */ + +static int cmh_config_populate_cores(struct cmh_config *cfg, + struct device_node *np) +{ + struct device_node *child; + u32 core_id, mbx_val; + int type, ret; + + for_each_child_of_node(np, child) { + ret =3D of_property_read_u32(child, "reg", &core_id); + if (ret) { + dev_warn(cmh_dev(), + "DT child %pOFn: missing 'reg', skipping\n= ", + child); + continue; + } + + if (core_id >=3D CORE_ID_NUM) { + dev_info(cmh_dev(), + "DT child %pOFn: core_id 0x%02x unknown, s= kipping\n", + child, core_id); + continue; + } + + type =3D core_id_to_type[core_id]; + if (type < 0) { + /* Not a dispatchable crypto core (DRBG, SYS, etc.)= */ + dev_dbg(cmh_dev(), + "DT child %pOFn: core_id 0x%02x not dispatc= hable\n", + child, core_id); + continue; + } + + if (cfg->core_types[type].num_instances >=3D + CMH_MAX_CORE_INSTANCES) { + dev_err(cmh_dev(), + "DT: too many instances for %s (max %u)\n", + core_type_names[type], + CMH_MAX_CORE_INSTANCES); + of_node_put(child); + return -EINVAL; + } + + { + struct cmh_core_type_cfg *ct =3D &cfg->core_types[t= ype]; + u32 idx =3D ct->num_instances; + + ct->core_ids[idx] =3D core_id; + ret =3D of_property_read_u32(child, "cri,mbx", &mbx= _val); + ct->mbx[idx] =3D ret ? 
-1 : (s32)mbx_val; +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG + /* + * Debug knob for the cross-core stress test: drop = the + * DT MBX pin so cmh_core_select_instance() round-r= obins + * this core across all configured MBXes (the unpin= ned + * dispatch behaviour exercised before cri,mbx affi= nity + * was added to the baseline device tree). + */ + if (mbx_round_robin) + ct->mbx[idx] =3D -1; +#endif + ct->num_instances++; + } + } + + return 0; +} + +/* -- Validation -------------------------------------------------------- = */ + +static int cmh_config_validate_core_types(struct cmh_config *cfg) +{ + unsigned int i, j, k; + + for (i =3D 0; i < CMH_NUM_CORE_TYPES; i++) { + struct cmh_core_type_cfg *ct =3D &cfg->core_types[i]; + const char *name =3D core_type_names[i]; + + /* Zero instances is valid -- core absent from DT */ + if (ct->num_instances =3D=3D 0) + continue; + + if (ct->num_instances > CMH_MAX_CORE_INSTANCES) { + dev_err(cmh_dev(), "%s: num_instances %u > max %u\n= ", + name, ct->num_instances, + CMH_MAX_CORE_INSTANCES); + return -EINVAL; + } + + /* Validate MBX indices */ + for (j =3D 0; j < ct->num_instances; j++) { + if (ct->mbx[j] >=3D 0 && + (u32)ct->mbx[j] >=3D cfg->mbx_count) { +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG + if (mbx_count_override > 0) { + dev_info(cmh_dev(), + "%s: mbx[%u]=3D%d >=3D ove= rridden mbx_count %u, auto-assigning\n", + name, j, ct->mbx[j], + cfg->mbx_count); + ct->mbx[j] =3D -1; + continue; + } +#endif + dev_err(cmh_dev(), "%s: mbx[%u]=3D%d >=3D m= bx_count %u\n", + name, j, ct->mbx[j], + cfg->mbx_count); + return -EINVAL; + } + } + + /* No duplicate core IDs within this type */ + for (j =3D 1; j < ct->num_instances; j++) { + for (k =3D 0; k < j; k++) { + if (ct->core_ids[j] =3D=3D ct->core_ids[k])= { + dev_err(cmh_dev(), + "%s: duplicate core_id 0x%0= 2x at [%u] and [%u]\n", + name, ct->core_ids[j], + k, j); + return -EINVAL; + } + } + } + + /* No duplicate MBX within this type (if explicit) */ + for (j =3D 1; j < ct->num_instances; 
j++) { + if (ct->mbx[j] < 0) + continue; + for (k =3D 0; k < j; k++) { + if (ct->mbx[k] =3D=3D ct->mbx[j]) { + dev_err(cmh_dev(), + "%s: duplicate mbx %d at [%= u] and [%u]\n", + name, ct->mbx[j], k, j); + return -EINVAL; + } + } + } + + /* All core IDs must fit in VCQ 8-bit field */ + for (j =3D 0; j < ct->num_instances; j++) { + if (ct->core_ids[j] > CORE_ID_MAX) { + dev_err(cmh_dev(), + "%s: core_ids[%u]=3D0x%02x > CORE_I= D_MAX\n", + name, j, ct->core_ids[j]); + return -EINVAL; + } + } + } + + /* Cross-type: no core ID used by more than one type */ + for (i =3D 0; i < CMH_NUM_CORE_TYPES; i++) { + struct cmh_core_type_cfg *ct_i =3D &cfg->core_types[i]; + + for (j =3D i + 1; j < CMH_NUM_CORE_TYPES; j++) { + struct cmh_core_type_cfg *ct_j =3D &cfg->core_types= [j]; + + for (k =3D 0; k < ct_i->num_instances; k++) { + unsigned int m; + + for (m =3D 0; m < ct_j->num_instances; m++)= { + if (ct_i->core_ids[k] !=3D + ct_j->core_ids[m]) + continue; + dev_err(cmh_dev(), + "core_id 0x%02x conflict: %= s[%u] and %s[%u]\n", + ct_i->core_ids[k], + core_type_names[i], k, + core_type_names[j], m); + return -EINVAL; + } + } + } + } + + return 0; +} + +static int cmh_config_validate(struct cmh_config *cfg) +{ + unsigned int i, j; + unsigned long max_instance_end; + + if (cfg->mbx_count =3D=3D 0 || cfg->mbx_count > CMH_MAX_CONFIGURED_= MBX) { + dev_err(cmh_dev(), "mbx_count %u out of range (1..%u)\n", + cfg->mbx_count, CMH_MAX_CONFIGURED_MBX); + return -EINVAL; + } + + for (i =3D 0; i < cfg->mbx_count; i++) { + struct cmh_mbx_config *m =3D &cfg->mailboxes[i]; + + if (m->instance >=3D CMH_MAX_MBX_INSTANCES) { + dev_err(cmh_dev(), "mbx_instances[%u]=3D%u >=3D %u\= n", + i, m->instance, CMH_MAX_MBX_INSTANCES); + return -EINVAL; + } + + if (m->slots_log2 < CMH_MBX_SLOTS_LOG2_MIN || + m->slots_log2 > CMH_MBX_SLOTS_LOG2_MAX) { + dev_err(cmh_dev(), "mbx_slots[%u]=3D%u out of range= (%u..%u)\n", + i, m->slots_log2, + CMH_MBX_SLOTS_LOG2_MIN, CMH_MBX_SLOTS_LOG2_M= AX); + return -EINVAL; + 
} + + if (m->stride_log2 < CMH_MBX_STRIDE_LOG2_MIN || + m->stride_log2 > CMH_MBX_STRIDE_LOG2_MAX) { + dev_err(cmh_dev(), "mbx_strides[%u]=3D%u out of ran= ge (%u..%u)\n", + i, m->stride_log2, + CMH_MBX_STRIDE_LOG2_MIN, CMH_MBX_STRIDE_LOG2= _MAX); + return -EINVAL; + } + + /* Check for duplicate instance indices */ + for (j =3D 0; j < i; j++) { + if (cfg->mailboxes[j].instance =3D=3D m->instance) = { + dev_err(cmh_dev(), "duplicate instance %u a= t indices %u and %u\n", + m->instance, j, i); + return -EINVAL; + } + } + } + + /* Ensure SIC region is large enough for all requested instances */ + max_instance_end =3D 0; + for (i =3D 0; i < cfg->mbx_count; i++) { + unsigned long end =3D ((unsigned long)cfg->mailboxes[i].ins= tance + 1) + << CMH_MBX_INSTANCE_SHIFT; + if (end > max_instance_end) + max_instance_end =3D end; + } + + if (max_instance_end > cfg->sic_size) { + dev_err(cmh_dev(), "sic_size 0x%zx too small for instance r= equiring 0x%lx\n", + cfg->sic_size, max_instance_end); + return -EINVAL; + } + + return 0; +} + +/* -- Public Interface -------------------------------------------------- = */ + +/** + * cmh_config_init() - Initialize device configuration from platform/DT da= ta + * @cfg: Configuration structure to populate + * @pdev: Platform device providing DT node and resources + * + * Parse the "cri,cmh" device tree node for MMIO base address, interrupt + * specifiers, and per-mailbox properties (instance indices, slot counts, + * strides). "cri,mbx-instances" is mandatory; absent slot or stride + * entries take built-in defaults. Populate per-core-type instance + * configuration from DT child nodes, then validate the configuration. + * + * Return: 0 on success, negative errno on failure. 
+ */ +int cmh_config_init(struct cmh_config *cfg, struct platform_device *pdev) +{ + struct device_node *np =3D pdev->dev.of_node; + struct resource *res; + unsigned int i; + int ret, irq, nr; + + if (!np) { + dev_err(&pdev->dev, "device tree node required\n"); + return -ENODEV; + } + + cfg->of_node =3D np; + + /* SIC base + size from DT "reg" property (mandatory) */ + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!res) { + dev_err(cmh_dev(), "missing DT reg resource\n"); + return -EINVAL; + } + cfg->sic_base =3D res->start; + cfg->sic_size =3D resource_size(res); + + /* + * IRQ resolution order: + * 1. Platform-level IRQ from the first DT "interrupts" entry. + * 2. If absent (cfg->irq =3D=3D -1), cmh_rh_resolve_irqs() tries + * per-MBX of_irq_get() for per-mailbox interrupt routing. + * 3. If no IRQs are available at all, the response handler + * falls back to watchdog-timer polling (200 ms default). + */ + irq =3D platform_get_irq_optional(pdev, 0); + cfg->irq =3D (irq >=3D 0) ? 
irq : -1; + + cfg->sic_mapped =3D NULL; + cfg->fw_ready_timeout_ms =3D fw_ready_timeout_ms; + + /* -- MBX configuration from DT --------------------------------- *= / + + nr =3D of_property_count_u32_elems(np, "cri,mbx-instances"); + if (nr <=3D 0) { + dev_err(cmh_dev(), "missing or empty cri,mbx-instances in D= T\n"); + return -EINVAL; + } + if ((unsigned int)nr > CMH_MAX_CONFIGURED_MBX) { + dev_err(cmh_dev(), "too many MBX instances in DT (%d > %u)\= n", + nr, CMH_MAX_CONFIGURED_MBX); + return -EINVAL; + } + cfg->mbx_count =3D nr; + +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG + if (mbx_count_override > 0) { + if (mbx_count_override > cfg->mbx_count) { + dev_err(cmh_dev(), + "mbx_count_override %u > DT count %u\n", + mbx_count_override, cfg->mbx_count); + return -EINVAL; + } + dev_info(cmh_dev(), "[debug] overriding mbx_count: %u -> %u= \n", + cfg->mbx_count, mbx_count_override); + cfg->mbx_count =3D mbx_count_override; + } +#endif + + for (i =3D 0; i < cfg->mbx_count; i++) { + struct cmh_mbx_config *m =3D &cfg->mailboxes[i]; + u32 val; + + ret =3D of_property_read_u32_index(np, "cri,mbx-instances", + i, &val); + if (ret) { + dev_err(cmh_dev(), "missing cri,mbx-instances[%u] i= n DT\n", + i); + return ret; + } + m->instance =3D val; + + ret =3D of_property_read_u32_index(np, "cri,mbx-slots-log2"= , + i, &val); + if (ret) { + m->slots_log2 =3D CMH_DEFAULT_SLOTS_LOG2; + dev_info(cmh_dev(), + "MBX[%u]: cri,mbx-slots-log2 absent, using= default %u\n", + i, CMH_DEFAULT_SLOTS_LOG2); + } else { + m->slots_log2 =3D val; + } + + ret =3D of_property_read_u32_index(np, "cri,mbx-strides-log= 2", + i, &val); + if (ret) { + m->stride_log2 =3D CMH_DEFAULT_STRIDE_LOG2; + dev_info(cmh_dev(), + "MBX[%u]: cri,mbx-strides-log2 absent, usi= ng default %u\n", + i, CMH_DEFAULT_STRIDE_LOG2); + } else { + m->stride_log2 =3D val; + } + +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG + if (mbx_slots_override > 0) { + m->slots_log2 =3D mbx_slots_override; + if (i =3D=3D 0) + dev_info(cmh_dev(), + "[debug] 
overriding slots_log2 -> = %u\n", + mbx_slots_override); + } +#endif + + m->queue_size =3D (1UL << m->slots_log2) << m->stride_log2; + m->dma_handle =3D 0; + m->virt_addr =3D NULL; + m->reg_base =3D NULL; + } + + /* -- Core-type enumeration from DT child nodes ----------------- *= / + + ret =3D cmh_config_populate_cores(cfg, np); + if (ret) + return ret; + + ret =3D cmh_config_validate(cfg); + if (ret) + return ret; + + return cmh_config_validate_core_types(cfg); +} diff --git a/drivers/crypto/cmh/cmh_debugfs.c b/drivers/crypto/cmh/cmh_debu= gfs.c new file mode 100644 index 000000000000..bd7b083b9ef1 --- /dev/null +++ b/drivers/crypto/cmh/cmh_debugfs.c @@ -0,0 +1,286 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- debugfs Per-MBX Counters and Fault Injection + * + * Creates the /sys/kernel/debug/cmh/ tree with: + * mbxN/vcqs_submitted (ro) Total VCQs sent to MBX N + * mbxN/vcqs_completed (ro) Total completions received + * mbxN/vcqs_errors (ro) Total error completions + * mbxN/queue_full_count (ro) Times select_mailbox() skipped this MBX + * mbxN/max_queue_depth (ro) High-water mark of in-flight transactions + * mbxN/inject_abort (wo) Write any value to inject MBX_COMMAND_ABOR= T + * mbxN/force_drain (wo) Write any value to force-drain all pending= txns + * tm/cmq_posts (ro) Total cmh_tm_post_command() calls + * tm/cmq_depth_max (ro) High-water mark of CMQ length + * tm/cmq_eagain_count (ro) Times CMQ was full (-EAGAIN) + * tm/backoff_count (ro) Times TM backed off (all MBX queues full) + * tm/async_timeout_count (ro) Async requests that timed out + * + * This file is only compiled when CONFIG_CRYPTO_DEV_CMH_DEBUG=3Dy (see Kb= uild). + * Requires CONFIG_DEBUG_FS=3Dy in the kernel (standard for dev builds). 
+ */ + +#include +#include +#include +#include + +#include "cmh_debugfs.h" +#include "cmh_config.h" +#include "cmh_registers.h" +#include "cmh_dma.h" +#include "cmh_txn.h" +#include "cmh_rh.h" +#include "cmh_rng.h" + +/* -- Module State -------------------------------------------------------= --- */ + +static struct { + struct dentry *root; /* /sys/kernel/debug/cmh/ *= / + struct cmh_mbx_stats *mbx; /* array[mbx_count] */ + struct cmh_tm_stats tm; + struct cmh_config *cfg; /* for inject_abort registe= r access */ + u32 mbx_count; +} dbgfs; + +/* -- debugfs file ops for atomic64_t ------------------------------------= --- */ + +static int cmh_dbgfs_u64_get(void *data, u64 *val) +{ + *val =3D (u64)atomic64_read((atomic64_t *)data); + return 0; +} + +DEFINE_DEBUGFS_ATTRIBUTE(cmh_dbgfs_u64_ro_fops, + cmh_dbgfs_u64_get, NULL, "%llu\n"); + +/* -- Per-MBX directory --------------------------------------------------= --- */ + +/* + * inject_abort -- write-only debugfs file for fault injection. + * + * Writing any value triggers MBX_COMMAND_ABORT on this mailbox. + * The eSW calls mbx_abort() -> mbx_cmd_error(mbx, -EPIPE), fires the + * error IRQ, and the LKM RH completes in-flight transactions with -EIO + * then issues MBX_COMMAND_RESTART to resume the mailbox. + * + * Private data points to the MBX index (cast to void *). 
+ */ +static ssize_t inject_abort_write(struct file *file, + const char __user *ubuf, + size_t count, loff_t *ppos) +{ + u32 idx =3D (u32)(unsigned long)file->private_data; + void __iomem *base; + + if (!dbgfs.cfg || idx >=3D dbgfs.cfg->mbx_count) + return -EINVAL; + + base =3D dbgfs.cfg->mailboxes[idx].reg_base; + dev_warn(cmh_dev(), "debugfs: injecting ABORT on mbx[%u]\n", idx); + cmh_reg_write32(MBX_COMMAND_ABORT, base, R_MBX_COMMAND); + + return count; +} + +static const struct file_operations inject_abort_fops =3D { + .owner =3D THIS_MODULE, + .open =3D simple_open, + .write =3D inject_abort_write, + .llseek =3D noop_llseek, +}; + +/* + * force_drain -- write-only debugfs file for administrative recovery. + * + * Writing any value issues MBX_COMMAND_FLUSH, drains all pending + * transactions on this mailbox (completing each with -ECANCELED), + * and resets all recovery bookkeeping (abort_stall_ticks, + * restart_pending, restart_retries, flush_count, wedged). + * + * Use this to recover D-state processes when the eSW is dead and + * normal ABORT/RESTART escalation has not recovered the mailbox. 
+ */ +static ssize_t force_drain_write(struct file *file, + const char __user *ubuf, + size_t count, loff_t *ppos) +{ + u32 idx =3D (u32)(unsigned long)file->private_data; + + if (!dbgfs.cfg || idx >=3D dbgfs.cfg->mbx_count) + return -EINVAL; + + cmh_rh_force_drain_mbx(idx); + return count; +} + +static const struct file_operations force_drain_fops =3D { + .owner =3D THIS_MODULE, + .open =3D simple_open, + .write =3D force_drain_write, + .llseek =3D noop_llseek, +}; + +static void create_mbx_dir(u32 idx, struct dentry *parent) +{ + struct cmh_mbx_stats *s =3D &dbgfs.mbx[idx]; + struct dentry *d; + char name[16]; + + snprintf(name, sizeof(name), "mbx%u", idx); + d =3D debugfs_create_dir(name, parent); + + debugfs_create_file("vcqs_submitted", 0444, d, + &s->vcqs_submitted, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("vcqs_completed", 0444, d, + &s->vcqs_completed, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("vcqs_errors", 0444, d, + &s->vcqs_errors, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("queue_full_count", 0444, d, + &s->queue_full_count, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("max_queue_depth", 0444, d, + &s->max_queue_depth, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("inject_abort", 0200, d, + (void *)(uintptr_t)idx, &inject_abort_fops); + debugfs_create_file("force_drain", 0200, d, + (void *)(uintptr_t)idx, &force_drain_fops); +} + +/* -- TM directory -------------------------------------------------------= --- */ + +static void create_tm_dir(struct dentry *parent) +{ + struct cmh_tm_stats *s =3D &dbgfs.tm; + struct dentry *d; + + d =3D debugfs_create_dir("tm", parent); + + debugfs_create_file("cmq_posts", 0444, d, + &s->cmq_posts, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("cmq_depth_max", 0444, d, + &s->cmq_depth_max, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("cmq_eagain_count", 0444, d, + &s->cmq_eagain_count, &cmh_dbgfs_u64_ro_fops); + debugfs_create_file("backoff_count", 0444, d, + &s->backoff_count, 
&cmh_dbgfs_u64_ro_fops); + debugfs_create_file("async_timeout_count", 0444, d, + &s->async_timeout_count, &cmh_dbgfs_u64_ro_fops= ); +} + +/* -- Config directory: timeout tuning ---------------------------------- = */ + +static void create_config_dir(struct dentry *parent) +{ + struct dentry *d; + + d =3D debugfs_create_dir("config", parent); + + /* TM timeouts */ + debugfs_create_u32("async_timeout_ms", 0644, d, + cmh_tm_timeout_async_ptr()); + debugfs_create_u32("vcq_timeout_ms", 0644, d, + cmh_tm_timeout_vcq_ptr()); + debugfs_create_u32("slow_op_timeout_ms", 0644, d, + cmh_tm_timeout_slow_op_ptr()); + debugfs_create_u32("drain_timeout_ms", 0644, d, + cmh_tm_timeout_drain_ptr()); + + /* RH watchdog */ + debugfs_create_u32("watchdog_ms", 0644, d, + cmh_rh_timeout_watchdog_ptr()); + + /* DRBG timeout */ + debugfs_create_u32("drbg_timeout_ms", 0644, d, + cmh_rng_timeout_drbg_ptr()); +} + +/* -- Public Interface ---------------------------------------------------= --- */ + +/** + * cmh_debugfs_init() - Create debugfs directory hierarchy for CMH + * @cfg: Platform configuration containing mailbox count and register base= s. + * + * Allocates per-mailbox statistics and creates the /sys/kernel/debug/cmh/ + * tree with per-mailbox counters, fault-injection files, and transaction + * manager statistics. debugfs is optional; failure to create entries doe= s + * not prevent module initialisation. + * + * Return: 0 on success (always returns 0 -- debugfs is best-effort). 
+ */ +int cmh_debugfs_init(struct cmh_config *cfg) +{ + u32 mbx_count =3D cfg->mbx_count; + u32 i; + + dbgfs.root =3D debugfs_create_dir("cmh", NULL); + if (IS_ERR_OR_NULL(dbgfs.root)) { + if (!IS_ERR(dbgfs.root)) + dev_warn(cmh_dev(), "debugfs: creation returned NUL= L -- counters disabled\n"); + else + dev_warn(cmh_dev(), "debugfs: creation failed (rc= =3D%ld) -- counters disabled\n", + PTR_ERR(dbgfs.root)); + dbgfs.root =3D NULL; + return 0; /* debugfs is optional -- never fail module init= */ + } + + dbgfs.mbx_count =3D mbx_count; + dbgfs.cfg =3D cfg; + dbgfs.mbx =3D kcalloc(mbx_count, sizeof(*dbgfs.mbx), GFP_KERNEL); + if (!dbgfs.mbx) { + debugfs_remove_recursive(dbgfs.root); + dbgfs.root =3D NULL; + return 0; + } + + for (i =3D 0; i < mbx_count; i++) + create_mbx_dir(i, dbgfs.root); + + create_tm_dir(dbgfs.root); + + create_config_dir(dbgfs.root); + + dev_dbg(cmh_dev(), "debugfs: initialized (%u mailboxes)\n", mbx_cou= nt); + return 0; +} + +/** + * cmh_debugfs_cleanup() - Remove all CMH debugfs entries + * + * Tears down the /sys/kernel/debug/cmh/ tree and frees per-mailbox + * statistics memory. Safe to call even if cmh_debugfs_init() was never + * called or failed. + */ +void cmh_debugfs_cleanup(void) +{ + debugfs_remove_recursive(dbgfs.root); + dbgfs.root =3D NULL; + kfree(dbgfs.mbx); + dbgfs.mbx =3D NULL; + dev_dbg(cmh_dev(), "debugfs: cleaned up\n"); +} + +/** + * cmh_debugfs_mbx_stats() - Return per-mailbox statistics pointer + * @mbx_idx: Zero-based mailbox index. + * + * Return: Pointer to the statistics structure for @mbx_idx, or NULL if + * debugfs is disabled or @mbx_idx is out of range. + */ +struct cmh_mbx_stats *cmh_debugfs_mbx_stats(u32 mbx_idx) +{ + if (!dbgfs.mbx || mbx_idx >=3D dbgfs.mbx_count) + return NULL; + return &dbgfs.mbx[mbx_idx]; +} + +/** + * cmh_debugfs_tm_stats() - Return transaction manager statistics pointer + * + * Return: Pointer to the singleton TM statistics structure. 
The pointer + * is always valid (points to static storage). + */ +struct cmh_tm_stats *cmh_debugfs_tm_stats(void) +{ + return &dbgfs.tm; +} diff --git a/drivers/crypto/cmh/cmh_dma.c b/drivers/crypto/cmh/cmh_dma.c new file mode 100644 index 000000000000..36ea277420cf --- /dev/null +++ b/drivers/crypto/cmh/cmh_dma.c @@ -0,0 +1,373 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- DMA Operations + * + * Implements the cmh_dma.h interface using the kernel DMA API + * (dma_map_single, dma_alloc_coherent, etc.). + * + * Scatterlist linearization rationale + * ------------------------------------ + * The eSW firmware supports SCATTERGATHER commands for all core + * types (AES_CMD_SCATTERGATHER, SM4_CMD_SCATTERGATHER, + * CCP_CMD_SCATTERGATHER, HC_CMD_GATHER), using a proprietary + * linked-list-item (LLI) descriptor chain format. The hash driver + * already uses this via cmh_dma_build_sg() + HC_CMD_GATHER. + * + * For symmetric cipher and AEAD commands, the LKM currently + * linearizes scatterlist input into contiguous bounce buffers via + * scatterwalk_map_and_copy() rather than building LLI chains from + * kernel scatterlists. This is a deliberate first-submission + * simplification with a concrete technical justification: + * + * - The hash SG path is unidirectional (DMA_TO_DEVICE gather only). + * Skcipher and AEAD require bidirectional handling: separate src + * and dst scatterlists (which may alias for in-place operations), + * plus AAD and authentication tag regions with distinct DMA + * directions and alignment constraints. + * - The CMH LLI format requires 64-byte aligned descriptor chain + * pointers (the .lli field) with 32-bit length fields. This + * alignment is automatically satisfied by dma_alloc_coherent() + * for the descriptor array; data buffer addresses have no + * hardware alignment requirement. 
Kernel SG entries have no + * alignment guarantee for data, so direct SG-to-LLI translation + * requires per-segment validation, potential splitting at + * descriptor boundaries, and separate chains for src/dst/AAD -- + * substantially more complex than the unidirectional hash + * gather case. + * - Each skcipher/AEAD driver caps linearization at + * CMH_AES_MAX_CRYPTLEN / CMH_SM4_MAX_CRYPTLEN (32 MiB). + * Requests exceeding this cap are rejected with -EINVAL. + * In practice, crypto API callers (dm-crypt, IPsec, kernel TLS) + * send page-sized or smaller buffers, so the bounce allocation + * is typically <= PAGE_SIZE and succeeds even under GFP_ATOMIC. + * + * A shared SG-to-LLI adapter handling bidirectional mappings, + * alignment splitting, and in-place src==dst detection for the + * skcipher/AEAD/MAC paths is planned as a follow-up series once the + * core driver is accepted. + * + * This linearization pattern is consistent with other upstream HW + * crypto drivers that use bounce buffers in their initial + * submissions (e.g. ccree, sa2ul, omap-aes). + */ + +#include +#include +#include +#include +#include +#include + +#include "cmh_dma.h" + +/* Module-global device pointer, set in cmh_dma_init() */ +static struct device *cmh_device; + +/** + * cmh_dma_init() - Initialize the standard DMA backend + * @pdev: Platform device providing the struct device for DMA ops + * + * Stores the device pointer for use by all DMA wrapper functions. + * + * Return: 0 (always succeeds for the standard backend). + */ +int cmh_dma_init(struct platform_device *pdev) +{ + cmh_device = &pdev->dev; + return 0; +} + +/** + * cmh_dma_cleanup() - Tear down the standard DMA backend + * + * Clears the stored device pointer. + */ +void cmh_dma_cleanup(void) +{ + cmh_device = NULL; +} + +/** + * cmh_dev() - Return the platform device pointer + * + * Return: struct device pointer, or NULL outside probe/remove lifecycle.
+ */ +struct device *cmh_dev(void) +{ + return cmh_device; +} + +/* -- Streaming DMA -------------------------------------------------------- */ + +/** + * cmh_dma_map_single() - Map a kernel buffer for streaming DMA + * @buf: Kernel virtual address + * @size: Buffer length in bytes + * @dir: DMA direction + * + * Return: DMA address, or a DMA_MAPPING_ERROR value on failure. + */ +dma_addr_t cmh_dma_map_single(void *buf, size_t size, + enum dma_data_direction dir) +{ + return dma_map_single(cmh_device, buf, size, dir); +} + +/** + * cmh_dma_unmap_single() - Unmap a streaming DMA buffer + * @addr: DMA address returned by cmh_dma_map_single() + * @size: Buffer length in bytes + * @dir: DMA direction (must match the map call) + */ +void cmh_dma_unmap_single(dma_addr_t addr, size_t size, + enum dma_data_direction dir) +{ + dma_unmap_single(cmh_device, addr, size, dir); +} + +/** + * cmh_dma_sync_for_cpu() - Sync a DMA buffer for CPU access + * @addr: DMA address of the mapped buffer + * @size: Region length in bytes + * @dir: DMA direction + */ +void cmh_dma_sync_for_cpu(dma_addr_t addr, size_t size, + enum dma_data_direction dir) +{ + dma_sync_single_for_cpu(cmh_device, addr, size, dir); +} + +/** + * cmh_dma_sync_for_device() - Sync a DMA buffer for device access + * @addr: DMA address of the mapped buffer + * @size: Region length in bytes + * @dir: DMA direction + */ +void cmh_dma_sync_for_device(dma_addr_t addr, size_t size, + enum dma_data_direction dir) +{ + dma_sync_single_for_device(cmh_device, addr, size, dir); +} + +/** + * cmh_dma_map_error() - Check whether a DMA mapping failed + * @addr: DMA address to check + * + * Return: Non-zero if @addr indicates a mapping error.
+ */ +int cmh_dma_map_error(dma_addr_t addr) +{ + return dma_mapping_error(cmh_device, addr); +} + +/* -- Coherent DMA --------------------------------------------------------- */ + +/** + * cmh_dma_alloc() - Allocate coherent DMA memory + * @size: Allocation size in bytes + * @handle: Output DMA address + * @gfp: GFP allocation flags + * + * Return: Kernel virtual address, or NULL on failure. + */ +void *cmh_dma_alloc(size_t size, dma_addr_t *handle, gfp_t gfp) +{ + return dma_alloc_coherent(cmh_device, size, handle, gfp); +} + +/** + * cmh_dma_free() - Free coherent DMA memory + * @size: Allocation size (must match cmh_dma_alloc) + * @virt: Kernel virtual address + * @handle: DMA address + */ +void cmh_dma_free(size_t size, void *virt, dma_addr_t handle) +{ + dma_free_coherent(cmh_device, size, virt, handle); +} + +/* -- Buffer write helpers ------------------------------------------------- */ + +/** + * cmh_dma_write() - Copy data into a DMA buffer + * @dst: Destination (from cmh_dma_alloc) + * @src: Source kernel buffer + * @len: Byte count + */ +void cmh_dma_write(void *dst, const void *src, size_t len) +{ + memcpy(dst, src, len); +} + +/** + * cmh_dma_fence() - No-op on standard DMA API platforms (coherent) + * @ptr: Unused -- present for interface compatibility + */ +void cmh_dma_fence(void *ptr) +{ + /* Standard DMA API: coherent memory, no cross-slave fence needed */ +} + +/** + * cmh_dma_zero() - Zero a DMA buffer + * @dst: Destination (from cmh_dma_alloc) + * @len: Byte count + */ +void cmh_dma_zero(void *dst, size_t len) +{ + memset(dst, 0, len); +} + +/** + * cmh_dma_build_sg() - Build a scatter-gather DMA mapping + * @bufs: Array of buffer descriptors to map + * @count: Number of entries in @bufs + * @gfp: GFP flags for memory allocation + * + * Allocates a streaming-DMA descriptor array and maps each buffer in @bufs + * for DMA-to-device transfer, filling CMH eSW-format scatter-gather + * descriptors with linked-list pointers.
+ * + * The descriptor array uses streaming DMA (kmalloc + dma_map_single) rather + * than dma_alloc_coherent so that cmh_dma_free_sg() -- which calls + * dma_unmap_single + kfree -- is safe from any context including BH-disabled + * completion callbacks. + * + * Return: Pointer to the allocated cmh_sg_map on success, NULL on failure. + */ +struct cmh_sg_map *cmh_dma_build_sg(const struct cmh_dma_buf *bufs, u32 count, + gfp_t gfp) +{ + struct cmh_sg_map *sgm; + u32 i; + + if (!count) + return NULL; + + sgm = kzalloc(struct_size(sgm, bufs, count), gfp); + if (!sgm) + return NULL; + + sgm->count = count; + sgm->items_size = array_size(count, sizeof(*sgm->items)); + if (sgm->items_size == SIZE_MAX) + goto err_free_sgm; + + /* + * Allocate descriptor array with kmalloc and map for streaming DMA. + * We map first to obtain items_dma (needed for .lli pointers), + * then sync-for-cpu, fill descriptors, and sync-for-device. + */ + sgm->items = kzalloc(sgm->items_size, gfp); + if (!sgm->items) + goto err_free_sgm; + + sgm->items_dma = cmh_dma_map_single(sgm->items, sgm->items_size, + DMA_TO_DEVICE); + if (cmh_dma_map_error(sgm->items_dma)) + goto err_free_items; + + /* Map each source buffer for device read */ + for (i = 0; i < count; i++) { + dma_addr_t dma; + + if (!bufs[i].len) + goto err_unmap; + sgm->bufs[i].len = bufs[i].len; + dma = cmh_dma_map_single(bufs[i].data, bufs[i].len, + DMA_TO_DEVICE); + if (cmh_dma_map_error(dma)) + goto err_unmap; + sgm->bufs[i].dma = dma; + } + + /* + * Reclaim CPU ownership of the descriptor buffer. After + * dma_map_single the device owns the mapping; we must call + * sync_for_cpu before writing regardless of direction. The + * direction matches the original mapping (DMA_TO_DEVICE) -- + * this tells the DMA layer which cache operations apply: + * invalidate so the CPU sees coherent data before we fill + * the SG descriptors and later sync_for_device.
+ */ + cmh_dma_sync_for_cpu(sgm->items_dma, sgm->items_size, + DMA_TO_DEVICE); + + /* Fill CMH eSW SG descriptors */ + for (i = 0; i < count; i++) { + u64 lli_val; + + if (i + 1 < count) + lli_val = (u64)(sgm->items_dma + + (i + 1) * sizeof(*sgm->items)); + else + lli_val = 0; + + sgm->items[i].lli = lli_val; + sgm->items[i].src = (u64)sgm->bufs[i].dma; + sgm->items[i].dst = 0; + sgm->items[i].len = (u64)bufs[i].len; + } + + /* Flush descriptor writes to device */ + cmh_dma_sync_for_device(sgm->items_dma, sgm->items_size, + DMA_TO_DEVICE); + + return sgm; + +err_unmap: + while (i--) + cmh_dma_unmap_single(sgm->bufs[i].dma, + sgm->bufs[i].len, DMA_TO_DEVICE); + cmh_dma_unmap_single(sgm->items_dma, sgm->items_size, + DMA_TO_DEVICE); +err_free_items: + kfree(sgm->items); +err_free_sgm: + kfree(sgm); + return NULL; +} + +/** + * cmh_dma_free_sg() - Unmap and free a scatter-gather mapping + * @sgm: Scatter-gather mapping created by cmh_dma_build_sg(), or NULL + * + * Unmaps all DMA-mapped buffers, unmaps and frees the descriptor array, + * and releases the cmh_sg_map structure. Safe to call from any context + * (including BH-disabled completion callbacks) because it uses only + * dma_unmap_single + kfree -- no vunmap/dma_free_coherent. + */ +void cmh_dma_free_sg(struct cmh_sg_map *sgm) +{ + u32 i; + + if (!sgm) + return; + + for (i = 0; i < sgm->count; i++) + cmh_dma_unmap_single(sgm->bufs[i].dma, + sgm->bufs[i].len, DMA_TO_DEVICE); + + cmh_dma_unmap_single(sgm->items_dma, sgm->items_size, + DMA_TO_DEVICE); + kfree(sgm->items); + kfree(sgm); +} + +/** + * cmh_dma_orphan_free() - Orphan cleanup callback for abandoned DMA buffers + * @data: Pointer to a struct cmh_dma_orphan describing the orphaned mapping + * + * Called by the transaction manager when a synchronous operation times out + * and the caller has already returned. Unmaps the DMA buffer and frees + * the backing memory and the orphan descriptor itself.
+ */ +void cmh_dma_orphan_free(void *data) +{ + struct cmh_dma_orphan *o = data; + + cmh_dma_unmap_single(o->addr, o->len, o->dir); + kfree_sensitive(o->buf); + kfree(o); +} diff --git a/drivers/crypto/cmh/cmh_main.c b/drivers/crypto/cmh/cmh_main.c new file mode 100644 index 000000000000..452b8272908f --- /dev/null +++ b/drivers/crypto/cmh/cmh_main.c @@ -0,0 +1,365 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Platform Driver Entry and Exit + * + * Responsibilities: + * - Match "cri,cmh" DT node via platform_driver + * - Parse device-tree properties via cmh_config_init() + * - ioremap the SIC region + * - Verify CMH boot status (sanity check) + * - Compute per-instance register bases + * - Initialize MBX queues (MQI) + * - Start Transaction Manager kthread + * - Register Response Handler IRQ + * - Register Kernel Crypto API hash algorithms + * - Clean up in reverse order on exit or error + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "cmh.h" +#include "cmh_dma.h" +#include "cmh_mqi.h" +#include "cmh_txn.h" +#include "cmh_rh.h" +#include "cmh_registers.h" +#include "cmh_debugfs.h" +#include "cmh_sysfs.h" + +#include + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Alex Ousherovitch "); +MODULE_AUTHOR("Saravanakrishnan Krishnamoorthy "); +MODULE_AUTHOR("Joel Wittenauer "); +MODULE_DESCRIPTION("CRI CryptoManager Hub (CMH) hardware crypto accelerator"); +MODULE_ALIAS("platform:cmh"); +MODULE_IMPORT_NS("CRYPTO_INTERNAL"); + +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +static bool skip_fw_check; +module_param(skip_fw_check, bool, 0444); +MODULE_PARM_DESC(skip_fw_check, + "[debug] Skip eSW boot status check at probe (default: false)"); +#else +#define skip_fw_check false +#endif + +/* Global device state (single-instance module) */ + +static struct cmh_device *g_cmh_dev; + +/* SIC Sanity Check */ + +static int cmh_check_sic(struct cmh_config *cfg) +{ + u32 boot_status;
+ u32 hw_version; + u32 sw_boot; + int ret; + + boot_status = cmh_reg_read32(cfg->sic_mapped, R_SIC_BOOT_STATUS); + hw_version = cmh_reg_read32(cfg->sic_mapped, R_SIC_HW_VERSION0); + + dev_info(cmh_dev(), "SIC boot_status=0x%08x hw_version=0x%08x\n", + boot_status, hw_version); + + if ((boot_status & SIC_BOOT_STATUS_MASK) != SIC_BOOT_STATUS_PASS) { + dev_err(cmh_dev(), "SIC boot status check failed (0x%02x != 0x%02x)\n", + boot_status & SIC_BOOT_STATUS_MASK, SIC_BOOT_STATUS_PASS); + return -EIO; + } + + /* Wait for CMH eSW to reach mission mode */ + ret = read_poll_timeout(ioread32, sw_boot, + sw_boot & SIC_SW_BOOT_STATUS_MISSION, + 1000, + (unsigned long)cfg->fw_ready_timeout_ms * 1000UL, + false, + cfg->sic_mapped + R_SIC_SW_BOOT_STATUS); + if (ret) { + sw_boot = cmh_reg_read32(cfg->sic_mapped, R_SIC_SW_BOOT_STATUS); + dev_err(cmh_dev(), "CMH eSW not ready (sw_boot_status=0x%08x, timeout=%ums)\n", + sw_boot, cfg->fw_ready_timeout_ms); + return -ETIMEDOUT; + } + + dev_info(cmh_dev(), "CMH eSW in mission mode (sw_boot_status=0x%08x)\n", + sw_boot); + + return 0; +} + +/* Module Init -- platform driver probe */ + +static int cmh_probe(struct platform_device *pdev) +{ + struct cmh_device *dev; + struct cmh_config *cfg; + unsigned int i; + int ret; + + /* Single-instance guard: reject if already probed */ + if (g_cmh_dev) + return -EBUSY; + + dev_info(&pdev->dev, "loading v%s\n", CMH_VERSION); + + dev = devm_kzalloc(&pdev->dev, sizeof(*dev), GFP_KERNEL); + if (!dev) + return -ENOMEM; + + dev->dev = &pdev->dev; + cfg = &dev->config; + + /* Declare DMA addressing capability */ + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (ret) { + dev_err(&pdev->dev, "dma_set_mask_and_coherent failed (rc=%d)\n", + ret); + goto err_free_dev; + } + + /* Initialize DMA backend (standard API or FPGA pool) */ + ret = cmh_dma_init(pdev); + if (ret) { + dev_err(&pdev->dev, "DMA init failed (rc=%d)\n", ret); + goto
err_free_dev; + } + + /* Step 1: Parse and validate configuration (DT + module params) */ + ret = cmh_config_init(cfg, pdev); + if (ret) + goto err_dma_init; + + dev_info(cmh_dev(), "sic_base=0x%llx size=0x%zx mbx_count=%u irq=%d\n", + (unsigned long long)cfg->sic_base, cfg->sic_size, + cfg->mbx_count, cfg->irq); + + /* Step 2: ioremap the SIC region */ + cfg->sic_mapped = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(cfg->sic_mapped)) { + ret = PTR_ERR(cfg->sic_mapped); + cfg->sic_mapped = NULL; + dev_err(cmh_dev(), "ioremap failed for SIC region (rc=%d)\n", + ret); + goto err_dma_init; + } + + /* Step 3: Verify CMH is alive */ + if (skip_fw_check) { + dev_info(cmh_dev(), "skipping eSW boot check (skip_fw_check=1)\n"); + } else { + ret = cmh_check_sic(cfg); + if (ret) + goto err_dma_init; + } + + /* Step 4: Compute per-instance register bases */ + for (i = 0; i < cfg->mbx_count; i++) { + struct cmh_mbx_config *m = &cfg->mailboxes[i]; + + m->reg_base = cmh_mbx_instance_base(cfg->sic_mapped, + m->instance); + + dev_dbg(cmh_dev(), "mbx[%u] instance=%u reg_base=%p\n", + i, m->instance, m->reg_base); + } + + (void)cmh_debugfs_init(cfg); + + /* Initialise mailbox queue interface */ + ret = cmh_mqi_init(cfg); + if (ret) + goto err_mqi_init; + + /* Initialise transaction manager */ + ret = cmh_tm_init(cfg); + if (ret) + goto err_tm_init; + + /* Initialise response handler */ + ret = cmh_rh_init(cfg); + if (ret) + goto err_rh_init; + + g_cmh_dev = dev; + platform_set_drvdata(pdev, dev); + + dev_info(cmh_dev(), "initialized successfully\n"); + return 0; + +err_rh_init: + cmh_tm_cleanup(); +err_tm_init: + cmh_mqi_cleanup(cfg); +err_mqi_init: + cmh_debugfs_cleanup(); +err_dma_init: + cmh_dma_cleanup(); +err_free_dev: + return ret; +} + +/* Module Exit -- platform driver remove */ + +static void cmh_remove(struct platform_device *pdev) +{ + struct cmh_device *dev = platform_get_drvdata(pdev); + struct cmh_config *cfg; + +
if (!dev) + return; + + cfg = &dev->config; + + cmh_rh_cleanup(cfg); + cmh_tm_cleanup(); + cmh_mqi_cleanup(cfg); + cmh_debugfs_cleanup(); + cmh_dma_cleanup(); + + dev_info(&pdev->dev, "unloaded successfully\n"); + + g_cmh_dev = NULL; +} + +static const struct of_device_id cmh_of_match[] = { + { .compatible = "cri,cmh" }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, cmh_of_match); + +/* + * PM suspend/resume. + * + * Suspend: drain the TM first (while the RH is still active and can + * deliver completions for in-flight transactions), then quiesce the + * RH (cancel watchdog, mask HW interrupts). This ordering ensures + * the drain_timeout_ms wait in cmh_tm_quiesce() can actually succeed + * -- if we suspended RH first, no completions would be delivered and + * the drain would always hit the force-cancel path. + * + * IRQ handlers remain registered (standard PM pattern: the kernel + * disables the IRQ lines during suspend, no need to free/re-request). + * + * Resume: re-check the SIC/SW boot status, re-synchronise the RH + * with hardware (head positions, interrupt masks, watchdog), then + * restart the TM kthread. + */ + +static int cmh_suspend(struct device *dev) +{ + struct cmh_device *cmh = dev_get_drvdata(dev); + + if (!cmh) + return 0; + + dev_info(dev, "suspending\n"); + cmh_tm_quiesce(); + cmh_rh_suspend(&cmh->config); + return 0; +} + +static int cmh_resume(struct device *dev) +{ + struct cmh_device *cmh = dev_get_drvdata(dev); + int ret; + + if (!cmh) + return 0; + + ret = cmh_check_sic(&cmh->config); + if (ret) { + dev_err(dev, "resume: CMH eSW health check failed (%d)\n", + ret); + return ret; + } + + /* + * cmh_rh_resume() is void: it only re-syncs MMIO head pointers, + * clears stale interrupt status bits (W1C), re-enables interrupt + * masks, and re-arms the watchdog timer -- none of which can fail + * after the SIC health check above has confirmed HW accessibility.
+ */ + cmh_rh_resume(&cmh->config); + + ret = cmh_tm_resume(); + if (ret) { + dev_err(dev, "resume: TM restart failed (%d)\n", ret); + return ret; + } + dev_info(dev, "resumed successfully\n"); + return 0; +} + +static DEFINE_SIMPLE_DEV_PM_OPS(cmh_pm_ops, + cmh_suspend, + cmh_resume); + +/* + * Runtime PM is intentionally not implemented. The CMH hardware does + * not expose HLOS-accessible clock gates or power domains -- the eSW + * firmware manages HW power state independently. There is no mechanism + * for the kernel to idle, gate clocks, or power down the accelerator + * block from HLOS. If a future platform variant exposes power control + * to HLOS (e.g. via a SCMI power domain), runtime PM support can be + * added at that time using SET_RUNTIME_PM_OPS and pm_runtime_get/put + * around VCQ submission paths. + * + * System sleep (suspend/resume) is supported via DEFINE_SIMPLE_DEV_PM_OPS + * above: suspend quiesces the TM and masks IRQs; resume re-verifies + * eSW health (SIC status) and restarts the TM thread. + */ + +static struct platform_driver cmh_driver = { + .probe = cmh_probe, + .remove = cmh_remove, + .driver = { + .name = CMH_DRV_NAME, + .of_match_table = cmh_of_match, + .dev_groups = cmh_sysfs_groups, + .pm = pm_sleep_ptr(&cmh_pm_ops), + }, +}; + +static int __init cmh_init(void) +{ + int ret; + + ret = platform_driver_register(&cmh_driver); + if (ret) + return ret; + + /* + * platform_driver_register() does not propagate probe() errors. + * If a DT node matched but probe() failed (e.g. bad module params), + * g_cmh_dev will not have been set. Detect this and unregister. + * + * This is intentional for a non-discoverable accelerator with no + * hotplug or deferred-probe scenarios -- the device is either + * present at boot or not. Leaving the driver registered after a + * probe failure would silently produce a non-functional module.
+ */ + if (!g_cmh_dev) { + platform_driver_unregister(&cmh_driver); + return -ENODEV; + } + + return 0; +} + +static void __exit cmh_exit(void) +{ + platform_driver_unregister(&cmh_driver); +} + +module_init(cmh_init); +module_exit(cmh_exit); diff --git a/drivers/crypto/cmh/cmh_mqi.c b/drivers/crypto/cmh/cmh_mqi.c new file mode 100644 index 000000000000..9a135be58562 --- /dev/null +++ b/drivers/crypto/cmh/cmh_mqi.c @@ -0,0 +1,355 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Mailbox Queue Initializer + * + * Responsibilities: + * - Allocate queue buffers for each configured mailbox + * - Execute the MBX lock/setup/enable register sequence + * - Readback-verify all critical register writes + * - Hold lock for MBX lifetime (CMH eSW requires it for host access) + * - Clean up (flush + unlock + free) on exit or error + * + * Register sequence per instance (per CMH MBX hardware specification): + * 1. Read R_MBX_LOCK -> non-zero = ownership token acquired + * 2. W1C stale R_MBX_INTERRUPT bits (avoids spurious error cascade) + * 3. Set R_MBX_INTERRUPT_MASK = MBX_IRQ_MASK + * 4. Write QUEUE_LO/HI, SLOTS, STRIDE (queue address + geometry) + * 5. Sync TAIL = HEAD (CMH eSW owns HEAD; avoids stale-queue parse) + * 6. Readback verify QUEUE_LO/HI/SLOTS/STRIDE + * 7. Write HOST_INFO (signals CMH eSW "MBX configured") + * 8. Write COMMAND = MBX_COMMAND_RUN + * 9. Lock stays held -- released only in teardown + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "cmh_mqi.h" +#include "cmh_dma.h" +#include "cmh_registers.h" +#include "cmh_config.h" + +/* Flush polling: eSW clears R_MBX_COMMAND to 0 when flush completes */ +#define MBX_FLUSH_POLL_US 50 +#define MBX_FLUSH_TIMEOUT_US 1000000 /* 1 second */ + +/* MBX Lock / Unlock */ + +/* + * Attempt to acquire the MBX hardware lock. + * Returns the lock token (non-zero) on success, 0 on timeout.
+ */ +static u32 cmh_mbx_lock(void __iomem *reg_base, u32 instance) +{ + unsigned long deadline = jiffies + msecs_to_jiffies(MBX_LOCK_TIMEOUT_MS); + u32 lock; + + while (time_before(jiffies, deadline)) { + lock = cmh_reg_read32(reg_base, R_MBX_LOCK); + if (lock) { + dev_dbg(cmh_dev(), "mbx %u lock acquired (token=0x%08x)\n", + instance, lock); + return lock; + } + /* HW lock may be held by CMH eSW -- back off before retry */ + usleep_range(MBX_LOCK_POLL_MIN_US, MBX_LOCK_POLL_MAX_US); + } + + return 0; +} + +/* Release the MBX lock: clear interrupt mask, write token back */ +static void cmh_mbx_unlock(void __iomem *reg_base, u32 lock_val) +{ + cmh_reg_write32(0, reg_base, R_MBX_INTERRUPT_MASK); + cmh_reg_write32(lock_val, reg_base, R_MBX_LOCK); +} + +/* Register Readback Verification */ + +static int cmh_verify_reg(void __iomem *base, u32 offset, u32 expected, + const char *name, u32 instance) +{ + u32 actual = cmh_reg_read32(base, offset); + + if (actual != expected) { + dev_err(cmh_dev(), "mbx %u %s readback mismatch: 0x%08x != 0x%08x\n", + instance, name, actual, expected); + return -EIO; + } + return 0; +} + +/* Clear any stale interrupt bits left from a prior module lifecycle. */ +static void cmh_mbx_clear_stale_irqs(void __iomem *base, u32 instance) +{ + u32 stale = cmh_reg_read32(base, R_MBX_INTERRUPT); + + if (stale) { + cmh_reg_write32(stale, base, R_MBX_INTERRUPT); + dev_dbg(cmh_dev(), "mbx %u cleared stale irq bits=0x%x\n", + instance, stale); + } +} + +/* Read CMH eSW HEAD and set TAIL = HEAD so the queue appears empty.
*/ +static void cmh_mbx_sync_tail_to_head(void __iomem *base, u32 instance) +{ + u32 fw_head = cmh_reg_read32(base, R_MBX_QUEUE_HEAD); + + cmh_reg_write32(fw_head, base, R_MBX_QUEUE_TAIL); + if (fw_head) + dev_dbg(cmh_dev(), "mbx %u synced tail=%u to fw head\n", + instance, fw_head); +} + +/* Per-Mailbox Setup */ + +static int cmh_mbx_setup_one(struct cmh_mbx_config *mbx) +{ + void __iomem *base = mbx->reg_base; + u32 addr_lo = lower_32_bits(mbx->dma_handle); + u32 addr_hi = upper_32_bits(mbx->dma_handle); + u32 lock_val; + int ret; + + /* Step 1: Acquire exclusive access */ + lock_val = cmh_mbx_lock(base, mbx->instance); + if (!lock_val) { + dev_err(cmh_dev(), "mbx %u lock timeout after %u ms\n", + mbx->instance, MBX_LOCK_TIMEOUT_MS); + return -ETIMEDOUT; + } + + /* + * Step 1.5: Clear stale interrupt bits from a prior module lifecycle. + * + * After rmmod, the CMH eSW may leave ERROR_IRQ set in + * R_MBX_INTERRUPT even though STATUS is IDLE. If we enable + * the mask first, the stale bits immediately trigger the + * CMH eSW interrupt chain, which can cascade into ERROR + * status before the first hash operation. W1C-clear them first. + */ + cmh_mbx_clear_stale_irqs(base, mbx->instance); + + /* Step 2: Program interrupt mask (enable DONE/ERROR interrupts) */ + cmh_reg_write32(MBX_IRQ_MASK, base, R_MBX_INTERRUPT_MASK); + + /* Step 3: Configure queue address (64-bit split) */ + cmh_reg_write32(addr_lo, base, R_MBX_QUEUE_LO); + cmh_reg_write32(addr_hi, base, R_MBX_QUEUE_HI); + + /* Step 4: Configure queue geometry */ + cmh_reg_write32(mbx->slots_log2, base, R_MBX_QUEUE_SLOTS); + cmh_reg_write32(mbx->stride_log2, base, R_MBX_QUEUE_STRIDE); + + /* + * Step 5: Synchronise TAIL to CMH eSW's HEAD. + * + * R_MBX_QUEUE_HEAD is read-only from the host side -- only the + * CMH eSW can write it. On a fresh boot HEAD is 0; after an + * rmmod/insmod cycle it retains the value from the previous + * session (e.g. 44).
Writing 0 from the host is silently + * dropped by the MBX HW. + * + * If we set TAIL=0 while HEAD=44 the CMH eSW sees a non-empty + * queue (head != tail with wrap-around) and immediately tries + * to load a VCQ at the old head offset into our freshly-zeroed + * DMA buffer, causing an "Invalid VCQ" EFAULT -> ECHILD cascade. + * + * Fix: read HEAD and set TAIL = HEAD so the queue looks empty. + */ + cmh_mbx_sync_tail_to_head(base, mbx->instance); + + /* + * Step 6: Readback verify critical registers. + * HOST_INFO is deliberately deferred to after verification -- writing + * it tells the CMH eSW "MBX is ready" and the CMH eSW may inspect + * (and clear) the queue registers immediately. + */ + ret = cmh_verify_reg(base, R_MBX_QUEUE_LO, addr_lo, + "QUEUE_LO", mbx->instance); + if (ret) + goto err_unlock; + + ret = cmh_verify_reg(base, R_MBX_QUEUE_HI, addr_hi, + "QUEUE_HI", mbx->instance); + if (ret) + goto err_unlock; + + ret = cmh_verify_reg(base, R_MBX_QUEUE_SLOTS, mbx->slots_log2, + "QUEUE_SLOTS", mbx->instance); + if (ret) + goto err_unlock; + + ret = cmh_verify_reg(base, R_MBX_QUEUE_STRIDE, mbx->stride_log2, + "QUEUE_STRIDE", mbx->instance); + if (ret) + goto err_unlock; + + /* + * Step 7: Host identification -- signals CMH eSW that MBX is configured. + * Must come after readback verification (CMH eSW may inspect the MBX + * immediately) and before COMMAND_RUN. + */ + cmh_reg_write32(MBX_HOST_INFO_LKM, base, R_MBX_HOST_INFO); + + /* Step 8: Enable -- start the mailbox */ + cmh_reg_write32(MBX_COMMAND_RUN, base, R_MBX_COMMAND); + + /* Read status while we still hold the lock */ + dev_dbg(cmh_dev(), "mbx %u setup: dma=0x%08x%08x slots=%u stride=%u status=0x%08x\n", + mbx->instance, addr_hi, addr_lo, + mbx->slots_log2, mbx->stride_log2, + cmh_reg_read32(base, R_MBX_STATUS)); + + /* + * Lock stays held for the lifetime of this MBX session. + * + * mbx->lock_val is the ownership token returned by R_MBX_LOCK at + * acquisition time.
The CMH eSW validates this token on every + * register access and requires it to be written back to release. + * It is NOT a transient mutex -- it persists until teardown. + */ + mbx->lock_val = lock_val; + + return 0; + +err_unlock: + cmh_mbx_unlock(base, lock_val); + return ret; +} + +/* Per-Mailbox Teardown */ + +static void cmh_mbx_teardown_one(struct cmh_mbx_config *mbx) +{ + void __iomem *base = mbx->reg_base; + u32 status; + + if (!base || !mbx->lock_val) + return; + + if (MBX_STATUS_CODE(cmh_reg_read32(base, R_MBX_STATUS)) != + MBX_STATUS_OFFLINE) { + cmh_reg_write32(MBX_COMMAND_FLUSH, base, R_MBX_COMMAND); + + /* + * Wait for the eSW to process the flush before releasing + * the DMA buffer. The eSW clears R_MBX_COMMAND to zero + * upon completion; if it doesn't within 1 s, log a + * warning and proceed (best-effort teardown). + * + * DMA safety: by this point the RH and TM are already + * shut down (remove order: algos -> RH -> TM -> MQI), + * so no new transactions can be submitted and no + * completions are in flight. The queue buffer is only + * read by the eSW during active command processing; + * after flush the eSW will not touch it again. + */ + if (read_poll_timeout(cmh_reg_read32, status, + status == 0, + MBX_FLUSH_POLL_US, + MBX_FLUSH_TIMEOUT_US, + true, base, R_MBX_COMMAND)) + dev_warn(cmh_dev(), + "mbx %u flush timeout during teardown (status=0x%08x)\n", + mbx->instance, + cmh_reg_read32(base, R_MBX_STATUS)); + } + + cmh_mbx_unlock(base, mbx->lock_val); + mbx->lock_val = 0; +} + +/* Public Interface */ + +/** + * cmh_mqi_init() - Initialize all mailbox queues + * @cfg: CMH configuration describing the mailboxes to set up + * + * Allocates DMA queue buffers for each configured mailbox, then executes + * the MBX lock/setup/enable register sequence. On failure, all + * successfully initialized mailboxes are torn down and buffers freed. + * + * Return: 0 on success, negative errno on failure.
+ */ +int cmh_mqi_init(struct cmh_config *cfg) +{ + unsigned int i, j; + int ret; + + /* Allocate queue buffers */ + for (i = 0; i < cfg->mbx_count; i++) { + struct cmh_mbx_config *m = &cfg->mailboxes[i]; + + m->virt_addr = cmh_dma_alloc(m->queue_size, &m->dma_handle, + GFP_KERNEL); + if (!m->virt_addr) { + ret = -ENOMEM; + goto err_free_bufs; + } + + dev_dbg(cmh_dev(), "mqi[%u] alloc %zu bytes @ virt=%pK dma=%pad\n", + i, m->queue_size, m->virt_addr, &m->dma_handle); + } + + /* Lock/setup/enable each mailbox */ + for (i = 0; i < cfg->mbx_count; i++) { + ret = cmh_mbx_setup_one(&cfg->mailboxes[i]); + if (ret) { + dev_err(cmh_dev(), "mqi[%u] setup failed (rc=%d)\n", + i, ret); + goto err_teardown; + } + } + + dev_info(cmh_dev(), "MQI init complete (%u mailboxes)\n", cfg->mbx_count); + return 0; + +err_teardown: + for (j = 0; j < i; j++) + cmh_mbx_teardown_one(&cfg->mailboxes[j]); +err_free_bufs: + for (j = 0; j < cfg->mbx_count; j++) { + if (cfg->mailboxes[j].virt_addr) + cmh_dma_free(cfg->mailboxes[j].queue_size, + cfg->mailboxes[j].virt_addr, + cfg->mailboxes[j].dma_handle); + cfg->mailboxes[j].virt_addr = NULL; + cfg->mailboxes[j].dma_handle = 0; + } + return ret; +} + +/** + * cmh_mqi_cleanup() - Clean up all mailbox queues + * @cfg: CMH configuration describing the mailboxes to tear down + * + * Tears down each mailbox (flush + unlock) and frees the DMA queue + * buffers allocated by cmh_mqi_init().
+ */ +void cmh_mqi_cleanup(struct cmh_config *cfg) +{ + unsigned int i; + + for (i = 0; i < cfg->mbx_count; i++) { + struct cmh_mbx_config *m = &cfg->mailboxes[i]; + + cmh_mbx_teardown_one(m); + + if (m->virt_addr) + cmh_dma_free(m->queue_size, m->virt_addr, + m->dma_handle); + m->virt_addr = NULL; + m->dma_handle = 0; + } + + dev_info(cmh_dev(), "MQI cleanup complete\n"); +} diff --git a/drivers/crypto/cmh/cmh_rh.c b/drivers/crypto/cmh/cmh_rh.c new file mode 100644 index 000000000000..48cb51d24a5e --- /dev/null +++ b/drivers/crypto/cmh/cmh_rh.c @@ -0,0 +1,1145 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Response Handler + * + * IRQ-driven completion processing using request_threaded_irq(): + * + * Hardirq: For each MBX, read R_MBX_INTERRUPT. If any bit is set, + * W1C-clear it and mark the MBX for threaded processing. + * Return IRQ_WAKE_THREAD if any MBX had work. + * + * Thread: For each pending MBX, read R_MBX_QUEUE_HEAD. Walk the + * per-MBX transaction queue (oldest first): for every txn + * whose last_vcq_id < new_head, check status, fire the + * completion callback, and free the transaction object. + * + * The DT "cri,cmh" node declares one PLIC interrupt per mailbox, + * matching the real CMH ch_sys_interrupt_mbx[N-1:0] topology. + * Each MBX gets its own Linux virq; the same hardirq/thread pair + * is registered for all of them. The handler still scans all + * mailboxes on every invocation -- this is intentional, as it + * provides robustness against coalesced or missed edges. + * + * IRQ source: resolved from the "cri,cmh" DT node at init time. + * The module's irq= parameter can override with a single shared IRQ.
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cmh_rh.h" +#include "cmh_txn.h" +#include "cmh_registers.h" +#include "cmh_config.h" +#include "cmh_debugfs.h" +#include "cmh_dma.h" + +/* Per-mailbox IRQ bookkeeping */ +struct cmh_rh_mbx { + u32 last_head; /* last-observed MBX head position */ + atomic_t irq_bits; /* interrupt bits saved by hardirq (atomic_or) */ + bool pending; /* threaded handler should process this MBX */ + bool restart_pending; /* RESTART issued, awaiting eSW ack */ + u32 restart_retries; /* watchdog ticks since RESTART issued */ + u32 flush_count; /* consecutive failed FLUSH escalations */ + bool wedged; /* recovery failed, MBX offline */ + u32 abort_stall_ticks; /* ticks since async timeout ABORT issued */ +}; + +/* Module-level RH state */ +static struct { + struct cmh_config *cfg; + int irqs[CMH_MAX_CONFIGURED_MBX]; /* per-MBX virqs */ + u32 nirqs; /* number of registered IRQs */ + struct cmh_rh_mbx *mbx; /* array[cfg->mbx_count] */ + atomic_t irq_count; /* hardirq invocation counter */ + bool active; +} rh; + +/* + * Serialise the read-last_head / process_mbx / update-last_head + * sequence between the threaded IRQ handler (process context) and + * the watchdog timer (softirq context). Without this, a timer + * softirq can preempt the kthread mid-sequence, causing both paths + * to process the same head advance and prematurely complete a + * subsequent transaction before the CMH eSW has written its DMA + * output -- leading to data corruption and SLAB freelist poisoning. + * + * The kthread acquires with spin_lock_bh (disables softirqs), the + * watchdog acquires with spin_lock (already in softirq context). + */ +static DEFINE_SPINLOCK(rh_process_lock); + +/* + * Watchdog timer -- missed-IRQ recovery. + * + * Fires every watchdog_ms while rh.active.
Reads MBX head registers; + * if any head has advanced without an IRQ, processes completions and + * logs a notice. Standard kernel pattern, analogous to NIC watchdog + * timers. + * + * Safe from timer/softirq context: cmh_reg_read32() is an MMIO read, + * cmh_tm_pop_transaction() uses spin_lock_irqsave(), and TM completion + * callbacks (crypto_request_complete et al.) are documented safe from + * any context including softirq. rh_process_lock serialises the + * head-read / process / head-update sequence against the threaded + * IRQ handler to prevent double-processing of the same completion. + * + * Default 200 ms (5 fires/s) provides ~10 recovery attempts within + * the default vcq_timeout_ms (2 s). Tune via debugfs config/watchdog_ms + * for platforms where interrupt delivery is more reliable (e.g. MSI on + * FPGA/silicon -- 500 ms--1 s may suffice as a safety net). + */ +#define CMH_RH_WATCHDOG_MS_DEFAULT 200 + +/* + * Floor for watchdog_ms to prevent a zero/near-zero value from + * spinning the timer in a tight softirq loop. Enforced at the + * point of use so debugfs writes are never rejected. + */ +#define CMH_RH_WATCHDOG_MS_MIN 10 + +/* + * Maximum watchdog ticks to wait for the eSW to process RESTART + * before escalating to FLUSH. At the default 200 ms interval, + * 5 retries = 1 s -- generous for an operation that should take + * microseconds. If the eSW hasn't responded by then, issue + * MBX_COMMAND_FLUSH to hard-reset the mailbox state. + */ +#define CMH_RH_RESTART_MAX_RETRIES 5 + +/* + * Maximum consecutive FLUSH escalations before marking the MBX as + * wedged. Each FLUSH cycle takes RESTART_MAX_RETRIES watchdog ticks + * (~1 s at default interval). Two failed FLUSHes (~2 s total) + * strongly indicate the eSW is not processing MBX commands at all. + */ +#define CMH_RH_FLUSH_MAX_FAILURES 2 + +/* + * Time budget (ms) after an async timeout ABORT before escalating + * to FLUSH + force-drain.
Converted to watchdog ticks at runtime + * via abort_stall_ms / watchdog_ms, so the actual wall-clock bound + * stays constant regardless of watchdog_ms tuning. + * + * The stall detector fires when: + * - The head-of-queue transaction is in TXN_TIMED_OUT state + * - HEAD hasn't advanced (eSW didn't process the ABORT) + * - abort_stall_ticks exceeds the derived threshold + * + * At that point we issue FLUSH + force-drain, completing all pending + * transactions with -ETIMEDOUT and waking any blocked waiters. + * + * Default 5000 ms bounds worst-case D-state to + * async_timeout (2 s) + abort_stall (5 s) = ~7 s. + */ +#define CMH_RH_ABORT_STALL_MS 5000 + +static unsigned int watchdog_ms = CMH_RH_WATCHDOG_MS_DEFAULT; + +/* + * Re-poke R_MBX_QUEUE_TAIL to generate a fresh interrupt to the eSW. + * Writing the current value back is a queue no-op but guarantees a + * SIC interrupt edge, ensuring the eSW wakes from WFI. + */ +static void cmh_rh_poke_tail(void __iomem *base) +{ + u32 tail = cmh_reg_read32(base, R_MBX_QUEUE_TAIL); + + cmh_reg_write32(tail, base, R_MBX_QUEUE_TAIL); +} + +/* + * Drain all remaining in-flight transactions for a mailbox, completing + * each with the given error code. Called after FLUSH (which discards + * all queued VCQs) or when marking a mailbox as wedged. Updates + * last_head to the current hardware HEAD so subsequent polls don't + * re-process the same (now-dead) VCQ IDs as successful completions. + * + * Caller must hold rh_process_lock.
+ */ +static void cmh_rh_drain_mbx(u32 mbx_idx, int error) +{ + struct transaction_obj *txn; + + while ((txn = cmh_tm_pop_transaction(mbx_idx)) != NULL) { + dev_dbg(cmh_dev(), "rh: mbx[%u] drain vcq=%u..%u err=%d\n", + mbx_idx, txn->first_vcq_id, + txn->last_vcq_id, error); + cmh_txn_finish(txn, error); + cmh_tm_txq_completion_notify(); + } + + rh.mbx[mbx_idx].last_head = + cmh_reg_read32(rh.cfg->mailboxes[mbx_idx].reg_base, + R_MBX_QUEUE_HEAD); +} + +/** + * cmh_rh_force_drain_mbx() - FLUSH + drain a mailbox from external context + * @mbx_idx: Mailbox index to drain + * + * Issues MBX_COMMAND_FLUSH to the eSW, drains all pending transactions + * (completing each with -ECANCELED), and resets all recovery bookkeeping + * including the wedged flag. This is an administrative last-resort + * recovery path exposed via debugfs. + * + * Context: process context. Acquires rh_process_lock internally. + */ +void cmh_rh_force_drain_mbx(u32 mbx_idx) +{ + void __iomem *base; + + if (!rh.cfg || !rh.mbx || mbx_idx >= rh.cfg->mbx_count) + return; + + base = rh.cfg->mailboxes[mbx_idx].reg_base; + + dev_warn(cmh_dev(), "rh: force-drain mbx[%u] (debugfs)\n", mbx_idx); + spin_lock_bh(&rh_process_lock); + cmh_reg_write32(MBX_IRQ_MASK, base, R_MBX_INTERRUPT); + cmh_reg_write32(MBX_COMMAND_FLUSH, base, R_MBX_COMMAND); + cmh_rh_poke_tail(base); + cmh_rh_drain_mbx(mbx_idx, -ECANCELED); + rh.mbx[mbx_idx].abort_stall_ticks = 0; + WRITE_ONCE(rh.mbx[mbx_idx].restart_pending, false); + rh.mbx[mbx_idx].restart_retries = 0; + rh.mbx[mbx_idx].flush_count = 0; + WRITE_ONCE(rh.mbx[mbx_idx].wedged, false); + spin_unlock_bh(&rh_process_lock); +} + +/** + * cmh_rh_mbx_is_wedged() - Check if a mailbox is permanently wedged + * @mbx_idx: Mailbox index to check + * + * Return: true if the mailbox has failed recovery and is offline.
+ */ +bool cmh_rh_mbx_is_wedged(u32 mbx_idx) +{ + if (!rh.mbx || !rh.cfg || mbx_idx >= rh.cfg->mbx_count) + return false; + + return READ_ONCE(rh.mbx[mbx_idx].wedged); +} + +/** + * cmh_rh_abort_mbx() - Issue MBX_COMMAND_ABORT under rh_process_lock + * @mbx_idx: Mailbox index to abort + * + * Serialises the ABORT write with RESTART/FLUSH commands issued by the + * watchdog, preventing command-register clobber races. Safe to call + * from process or softirq context, not hardirq (uses spin_lock_bh). + */ +void cmh_rh_abort_mbx(u32 mbx_idx) +{ + void __iomem *base; + + if (!rh.cfg || !rh.mbx || mbx_idx >= rh.cfg->mbx_count) + return; + + base = rh.cfg->mailboxes[mbx_idx].reg_base; + + spin_lock_bh(&rh_process_lock); + cmh_reg_write32(MBX_COMMAND_ABORT, base, R_MBX_COMMAND); + spin_unlock_bh(&rh_process_lock); +} + +static struct timer_list rh_watchdog; + +/* + * Hardirq handler -- runs with interrupts disabled. + * + * Read and W1C-clear R_MBX_INTERRUPT for each mailbox. + * If any MBX had a pending interrupt, return IRQ_WAKE_THREAD. + * Shared-IRQ safe: returns IRQ_NONE if we didn't handle anything. + */ +static irqreturn_t cmh_rh_hardirq(int irq, void *data) +{ + struct cmh_config *cfg = data; + bool handled = false; + u32 i; + + for (i = 0; i < cfg->mbx_count; i++) { + void __iomem *base = cfg->mailboxes[i].reg_base; + u32 bits; + + bits = cmh_reg_read32(base, R_MBX_INTERRUPT); + if (!bits) + continue; + + /* W1C: write back the set bits to clear them */ + cmh_reg_write32(bits, base, R_MBX_INTERRUPT); + + /* + * Accumulate bits atomically so a second hardirq + * firing while the threaded handler runs does not + * overwrite the first set of bits. + */ + atomic_or((int)bits, &rh.mbx[i].irq_bits); + WRITE_ONCE(rh.mbx[i].pending, true); + handled = true; + } + + /* + * Ordering: the kernel IRQ threading infrastructure + * performs a full barrier between hardirq return and + * the threaded handler invocation. + */ + if (handled) + atomic_inc(&rh.irq_count); + + return handled ? 
IRQ_WAKE_THREAD : IRQ_NONE; +} + +/* + * Process completions for a single mailbox. + * + * Walk the per-MBX transaction queue FIFO. For each transaction + * whose last_vcq_id is strictly less than the new head, fire the + * completion callback and free the object. + * + * "Strictly less than" using signed (s32) arithmetic handles wrap-around: + * the CMH eSW uses monotonically increasing 32-bit VCQ IDs. + */ +static void cmh_rh_process_mbx(u32 mbx_idx, u32 new_head, u32 irq_bits) +{ + struct transaction_obj *txn; + int error = 0; + + /* Determine error state from saved IRQ bits */ + if (irq_bits & MBX_ERROR_IRQ) { + void __iomem *base = rh.cfg->mailboxes[mbx_idx].reg_base; + u32 status = cmh_reg_read32(base, R_MBX_STATUS); + + error = -EIO; + dev_dbg(cmh_dev(), "rh: mbx[%u] error status=0x%08x (code=%u cmd_idx=%u)\n", + mbx_idx, status, + MBX_STATUS_ERROR_CODE(status), + MBX_STATUS_CMD_INDEX(status)); + + /* + * ECHILD (10) in the parent status means a child VCQ + * failed internally. Read R_MBX_CHILD for the actual + * root cause (real errno, child core ID, child cmd idx). + */ + if (MBX_STATUS_ERROR_CODE(status) == ECHILD) { + u32 child = cmh_reg_read32(base, R_MBX_CHILD); + + dev_dbg(cmh_dev(), + "rh: mbx[%u] child error=0x%08x (core=%u code=%u cmd_idx=%u)\n", + mbx_idx, child, + MBX_STATUS_CORE_ID(child), + MBX_STATUS_ERROR_CODE(child), + MBX_STATUS_CMD_INDEX(child)); + } + + /* + * CMH eSW does not advance head on error -- the MBX is + * stuck in ERROR state until the host issues a recovery + * command. However, HEAD may have advanced past one or + * more already-completed transactions before the error + * occurred (their completion IRQ may not have been + * processed yet). Retire those normally first, then + * force-complete the NEXT transaction (the one that + * actually failed) with -EIO.
+ * + * MBX command semantics after ERROR: + * CONTINUE -- re-run the same VCQ at HEAD (retry) + * RESTART -- advance HEAD+1, skip failed, resume + * FLUSH -- HEAD=TAIL, flush all HWCs, discard queue + */ + + /* First: retire transactions completed before the error */ + while ((txn = cmh_tm_peek_transaction(mbx_idx)) != NULL) { + if ((s32)(new_head - txn->last_vcq_id) <= 0) + break; + txn = cmh_tm_pop_transaction(mbx_idx); + if (!txn) + break; + dev_dbg(cmh_dev(), + "rh: mbx[%u] pre-error complete vcq=%u..%u\n", + mbx_idx, txn->first_vcq_id, + txn->last_vcq_id); + cmh_txn_finish(txn, 0); + cmh_tm_txq_completion_notify(); + } + + /* Now pop and fail the transaction that actually errored */ + txn = cmh_tm_pop_transaction(mbx_idx); + if (txn) { + dev_dbg(cmh_dev(), "rh: mbx[%u] error-complete vcq=%u..%u\n", + mbx_idx, txn->first_vcq_id, + txn->last_vcq_id); + cmh_txn_finish(txn, error); + cmh_tm_txq_completion_notify(); + } else { + u32 head_reg, tail_reg; + + head_reg = cmh_reg_read32(base, R_MBX_QUEUE_HEAD); + tail_reg = cmh_reg_read32(base, R_MBX_QUEUE_TAIL); + dev_warn_ratelimited(cmh_dev(), + "rh: mbx[%u] ERROR with empty txn queue (orphaned) status=0x%08x head=%u tail=%u core=%u ecode=%u cmd_idx=%u\n", + mbx_idx, status, + head_reg, tail_reg, + MBX_STATUS_CORE_ID(status), + MBX_STATUS_ERROR_CODE(status), + MBX_STATUS_CMD_INDEX(status)); + } + { + struct cmh_mbx_stats *s = cmh_debugfs_mbx_stats(mbx_idx); + + if (s) + atomic64_inc(&s->vcqs_errors); + } + + /* + * W1C-clear R_MBX_INTERRUPT before issuing RESTART. + * + * The eSW sets MBX_ERROR_IRQ in R_MBX_INTERRUPT when + * it writes ERROR status. On platforms where the + * hardirq handler runs (IRQ wired to GIC), this bit + * is cleared there. On polling-only platforms (no + * IRQ line), it must be cleared explicitly before + * issuing a recovery command to de-assert the + * MBX-to-SIC interrupt line.
+ */ + cmh_reg_write32(MBX_IRQ_MASK, base, R_MBX_INTERRUPT); + cmh_reg_write32(MBX_COMMAND_RESTART, base, R_MBX_COMMAND); + + /* + * Poke R_MBX_QUEUE_TAIL to guarantee the eSW receives + * an interrupt. + * + * Writing R_MBX_COMMAND alone may not produce a new + * SIC interrupt edge if the MBX-to-SIC line is still + * asserted from prior error processing. The eSW RUN + * handler re-writes ERROR_IRQ to R_MBX_INTERRUPT on + * every spurious wakeup while in ERROR state, which + * can keep the SIC line high on level-triggered HW. + * + * R_MBX_QUEUE_TAIL writes always generate a fresh + * interrupt to the eSW (this is the normal VCQ + * submission path). Writing the current TAIL value + * back is a no-op from the queue perspective but + * ensures the eSW wakes from WFI and processes the + * RESTART command. + */ + cmh_rh_poke_tail(base); + WRITE_ONCE(rh.mbx[mbx_idx].restart_pending, true); + rh.mbx[mbx_idx].restart_retries =3D 0; + return; + } + + /* + * Pop completed transactions. A transaction is complete when + * the CMH eSW has advanced head past its last VCQ ID: + * (s32)(new_head - txn->last_vcq_id) > 0 + * Using signed comparison for correct wrap-around handling. + * + * Multi-VCQ note: transactions spanning multiple slots (e.g. + * SLH-DSA with 3+ VCQs) are treated atomically -- either the + * head has passed all of them or none. The CMH eSW processes + * multi-VCQ groups sequentially within a single mailbox and + * only advances HEAD after the entire group completes. Per-slot + * progress validation (checking intermediate HEAD positions + * within a multi-VCQ group) is not implemented because: + * 1. The eSW guarantees atomic group completion semantics + * 2. Partial progress is only observable during processing, + * never at a completion boundary + * 3. 
Adding intermediate checks would require tracking + * per-slot status with no correctness benefit + * + * A defensive WARN_ON_ONCE detects eSW misbehavior: if HEAD + * lands between first_vcq_id and last_vcq_id of a multi-VCQ + * transaction, the eSW violated its atomic group contract. + */ + while ((txn = cmh_tm_peek_transaction(mbx_idx)) != NULL) { + if ((s32)(new_head - txn->last_vcq_id) <= 0) { + /* + * Not yet complete. For multi-VCQ transactions, + * assert HEAD hasn't partially advanced into the + * group -- that would indicate eSW firmware bug. + */ + WARN_ON_ONCE(txn->first_vcq_id != txn->last_vcq_id && + (s32)(new_head - txn->first_vcq_id) > 0); + break; + } + + txn = cmh_tm_pop_transaction(mbx_idx); + if (!txn) + break; + + dev_dbg(cmh_dev(), "rh: mbx[%u] complete vcq=%u..%u err=%d\n", + mbx_idx, txn->first_vcq_id, txn->last_vcq_id, + error); + + { + struct cmh_mbx_stats *s = cmh_debugfs_mbx_stats(mbx_idx); + + if (s) { + u32 n = txn->last_vcq_id - + txn->first_vcq_id + 1; + + atomic64_add(n, &s->vcqs_completed); + } + } + + cmh_txn_finish(txn, error); + cmh_tm_txq_completion_notify(); + } +} + +/* + * Threaded IRQ handler -- runs in process context. + * + * Walk all MBXes that had pending interrupts. After processing the + * pending set, do a final hardware poll of all MBX head registers to + * catch completions whose PLIC interrupt was consumed during an + * earlier register access (e.g. an inline interrupt notification + * during MMIO can cause the PLIC edge to be claimed before the + * hardirq sees it).
+ */ +static irqreturn_t cmh_rh_thread(int irq, void *data) +{ + struct cmh_config *cfg = data; + u32 i; + bool recheck; + + do { + recheck = false; + + for (i = 0; i < cfg->mbx_count; i++) { + u32 new_head, irq_bits; + + if (!READ_ONCE(rh.mbx[i].pending)) + continue; + + irq_bits = (u32)atomic_xchg(&rh.mbx[i].irq_bits, 0); + WRITE_ONCE(rh.mbx[i].pending, false); + + spin_lock_bh(&rh_process_lock); + new_head = cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + + if (new_head == rh.mbx[i].last_head && !irq_bits) { + spin_unlock_bh(&rh_process_lock); + continue; + } + + cmh_rh_process_mbx(i, new_head, irq_bits); + rh.mbx[i].last_head = new_head; + spin_unlock_bh(&rh_process_lock); + } + + /* + * Re-check: if the hardirq fired again while we were + * processing, pending flags will be set again. + */ + for (i = 0; i < cfg->mbx_count; i++) { + if (READ_ONCE(rh.mbx[i].pending)) { + recheck = true; + break; + } + } + } while (recheck); + + /* + * Final hardware poll: read every MBX head register and status + * to catch completions or errors whose interrupt was missed. + */ + for (i = 0; i < cfg->mbx_count; i++) { + u32 new_head; + u32 status; + u32 poll_irq_bits = 0; + + spin_lock_bh(&rh_process_lock); + new_head = cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + status = cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_STATUS); + + if (MBX_STATUS_CODE(status) == MBX_STATUS_ERROR) { + if (READ_ONCE(rh.mbx[i].wedged)) { + spin_unlock_bh(&rh_process_lock); + continue; + } + if (READ_ONCE(rh.mbx[i].restart_pending)) { + /* + * HEAD advanced while restart_pending means + * RESTART worked but next VCQ also failed. + * Clear restart state and process new error.
+ */ + if (new_head != rh.mbx[i].last_head) { + WRITE_ONCE(rh.mbx[i].restart_pending, + false); + rh.mbx[i].restart_retries = 0; + } else { + spin_unlock_bh(&rh_process_lock); + continue; + } + } + poll_irq_bits = MBX_ERROR_IRQ; + } else { + WRITE_ONCE(rh.mbx[i].restart_pending, false); + rh.mbx[i].restart_retries = 0; + rh.mbx[i].flush_count = 0; + } + + if (new_head != rh.mbx[i].last_head || poll_irq_bits) { + cmh_rh_process_mbx(i, new_head, poll_irq_bits); + rh.mbx[i].last_head = new_head; + } + spin_unlock_bh(&rh_process_lock); + } + + return IRQ_HANDLED; +} + +/* + * Watchdog timer callback -- missed-IRQ recovery. + * + * Reads all MBX head registers. If any head advanced without a + * corresponding IRQ, process the completions here. Re-arms itself + * while rh.active is true. + */ +static void cmh_rh_watchdog_fn(struct timer_list *t) +{ + u32 i; + + if (!rh.active || !rh.cfg || !rh.mbx) + return; + + for (i = 0; i < rh.cfg->mbx_count; i++) { + u32 new_head; + u32 status; + u32 irq_bits = 0; + + spin_lock(&rh_process_lock); + new_head = cmh_reg_read32(rh.cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + status = cmh_reg_read32(rh.cfg->mailboxes[i].reg_base, + R_MBX_STATUS); + + if (MBX_STATUS_CODE(status) == MBX_STATUS_ERROR) { + if (READ_ONCE(rh.mbx[i].wedged)) { + spin_unlock(&rh_process_lock); + continue; + } + /* + * Back-to-back failure scenario: the crypto API + * (e.g. testmgr) may submit requests continuously. + * If RESTART succeeds but the next VCQ also fails, + * the entire RESTART->IDLE->RUN->ERROR cycle can + * complete within a single 200ms watchdog period. + * Without the HEAD-advance check below, the watchdog + * would mistake the new error for a failed RESTART, + * increment restart_retries, and eventually escalate + * to FLUSH -- wedging the mailbox unnecessarily.
+ */ + if (READ_ONCE(rh.mbx[i].restart_pending)) { + void __iomem *base = + rh.cfg->mailboxes[i].reg_base; + + /* + * HEAD advanced since RESTART was issued: + * RESTART succeeded, this is a fresh error. + * Clear recovery state and process normally. + */ + if (new_head != rh.mbx[i].last_head) { + dev_dbg(cmh_dev(), + "rh: watchdog: mbx[%u] head advanced %u->%u during restart -- new error\n", + i, rh.mbx[i].last_head, + new_head); + WRITE_ONCE(rh.mbx[i].restart_pending, + false); + rh.mbx[i].restart_retries = 0; + goto new_error; + } + + rh.mbx[i].restart_retries++; + if (rh.mbx[i].restart_retries > + CMH_RH_RESTART_MAX_RETRIES) { + rh.mbx[i].flush_count++; + if (rh.mbx[i].flush_count >= + CMH_RH_FLUSH_MAX_FAILURES) { + u32 hb, ei, cmd; + + cmd = cmh_reg_read32(base, R_MBX_COMMAND); + hb = cmh_reg_read32(rh.cfg->sic_mapped, + R_SIC_SW_HEARTBEAT); + ei = cmh_reg_read32(rh.cfg->sic_mapped, + R_SIC_SW_ERROR_INFO); + dev_crit(cmh_dev(), + "rh: mbx[%u] wedged after %u FLUSHes (cmd=0x%x status=0x%x hb=0x%x err=0x%x)\n", + i, + rh.mbx[i].flush_count, + cmd, status, + hb, ei); + WRITE_ONCE(rh.mbx[i].wedged, + true); + cmh_rh_drain_mbx(i, -EIO); + spin_unlock(&rh_process_lock); + continue; + } + /* + * Backstop: eSW did not respond + * to RESTART within the retry + * budget. Escalate to FLUSH + * which is a harder reset of + * the eSW mailbox state.
+ */ + dev_err(cmh_dev(), + "rh: watchdog: mbx[%u] RESTART unresponsive after %u ticks, escalating to FLUSH (attempt %u/%u)\n", + i, rh.mbx[i].restart_retries, + rh.mbx[i].flush_count, + CMH_RH_FLUSH_MAX_FAILURES); + cmh_reg_write32(MBX_IRQ_MASK, + base, + R_MBX_INTERRUPT); + cmh_reg_write32(MBX_COMMAND_FLUSH, + base, + R_MBX_COMMAND); + cmh_rh_poke_tail(base); + cmh_rh_drain_mbx(i, -EIO); + WRITE_ONCE(rh.mbx[i].restart_pending, + false); + rh.mbx[i].restart_retries = 0; + spin_unlock(&rh_process_lock); + continue; + } + /* + * RESTART was already issued on a prior + * tick but the eSW hasn't cleared the + * ERROR status yet. Do NOT pop another + * transaction -- that would cascade-kill + * unrelated in-flight work. Re-poke TAIL + * in case the eSW missed the interrupt. + */ + cmh_rh_poke_tail(base); + dev_dbg_ratelimited(cmh_dev(), + "rh: watchdog: mbx[%u] restart pending (%u/%u) status=0x%08x, re-poke\n", + i, + rh.mbx[i].restart_retries, + CMH_RH_RESTART_MAX_RETRIES, + status); + spin_unlock(&rh_process_lock); + continue; + } +new_error: + dev_dbg_ratelimited(cmh_dev(), + "rh: watchdog: mbx[%u] error status=0x%08x (missed error IRQ) head=%u tail=%u core=%u ecode=%u cmd_idx=%u\n", + i, status, new_head, + cmh_reg_read32(rh.cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_TAIL), + MBX_STATUS_CORE_ID(status), + MBX_STATUS_ERROR_CODE(status), + MBX_STATUS_CMD_INDEX(status)); + irq_bits = MBX_ERROR_IRQ; + } else { + /* eSW cleared ERROR -- recovery succeeded */ + WRITE_ONCE(rh.mbx[i].restart_pending, false); + rh.mbx[i].restart_retries = 0; + rh.mbx[i].flush_count = 0; + } + + if (new_head != rh.mbx[i].last_head || irq_bits) { + if (new_head != rh.mbx[i].last_head) + dev_dbg_ratelimited(cmh_dev(), + "rh: watchdog: mbx[%u] head %u->%u (missed IRQ recovery)\n", + i, rh.mbx[i].last_head, + new_head); + cmh_rh_process_mbx(i, new_head, irq_bits); + rh.mbx[i].last_head = new_head; + rh.mbx[i].abort_stall_ticks = 0; + } + + /* + 
* Abort-stall detector: if the head-of-queue transaction + * timed out (state == TXN_TIMED_OUT) but the eSW hasn't + * responded (HEAD didn't advance, no ERROR status): + * + * tick 1: issue MBX_COMMAND_ABORT (serialised + * under rh_process_lock -- safe against + * concurrent RESTART/FLUSH) + * ticks 2..N-1: wait for eSW to respond with ERROR + * tick N: escalate to FLUSH + force-drain + * + * If the eSW responds with ERROR between ticks, the ERROR + * status branch above handles RESTART recovery and resets + * abort_stall_ticks via the restart_pending guard. + */ + if (!READ_ONCE(rh.mbx[i].wedged) && + !READ_ONCE(rh.mbx[i].restart_pending)) { + struct transaction_obj *head_txn; + + head_txn = cmh_tm_peek_transaction(i); + if (head_txn && + atomic_read(&head_txn->state) == TXN_TIMED_OUT) { + unsigned int stall_max; + void __iomem *base = + rh.cfg->mailboxes[i].reg_base; + + rh.mbx[i].abort_stall_ticks++; + + if (rh.mbx[i].abort_stall_ticks == 1) { + dev_warn(cmh_dev(), + "rh: watchdog: mbx[%u] head txn timed out, issuing ABORT\n", + i); + cmh_reg_write32(MBX_COMMAND_ABORT, + base, + R_MBX_COMMAND); + } + + stall_max = DIV_ROUND_UP(CMH_RH_ABORT_STALL_MS, + max(watchdog_ms, + CMH_RH_WATCHDOG_MS_MIN)); + if (rh.mbx[i].abort_stall_ticks >= + stall_max) { + dev_err(cmh_dev(), + "rh: watchdog: mbx[%u] abort stall (%u ticks) -- FLUSH + drain\n", + i, rh.mbx[i].abort_stall_ticks); + cmh_reg_write32(MBX_COMMAND_FLUSH, + base, R_MBX_COMMAND); + cmh_rh_drain_mbx(i, -ETIMEDOUT); + rh.mbx[i].abort_stall_ticks = 0; + } + } else { + rh.mbx[i].abort_stall_ticks = 0; + } + } + spin_unlock(&rh_process_lock); + } + + if (rh.active) { + unsigned int wdog = max(watchdog_ms, CMH_RH_WATCHDOG_MS_MIN); + + mod_timer(&rh_watchdog, + jiffies + msecs_to_jiffies(wdog)); + } +} + +/* + * Resolve per-MBX Linux virqs for the CMH interrupt lines. + * + * Strategy: + * 1. If cfg->irq >= 0, use it as a shared IRQ for all MBXes + * 2.
Otherwise, find the "cri,cmh" DT node and map one IRQ per + * active mailbox using cfg->mailboxes[i].instance as the DT + * interrupt index (matching the per-MBX PLIC wiring) + * + * Populates rh.irqs[] and rh.nirqs. Returns 0 on success, or a + * negative errno if no IRQs could be resolved (polling-only mode). + */ +static int cmh_rh_resolve_irqs(struct cmh_config *cfg) +{ + struct device_node *np; + u32 i; + + rh.nirqs = 0; + + /* Single legacy IRQ from DT: shared across all MBXes */ + if (cfg->irq >= 0) { + rh.irqs[0] = cfg->irq; + rh.nirqs = 1; + dev_dbg(cmh_dev(), "rh: using single DT IRQ %d for all %u MBXes\n", + cfg->irq, cfg->mbx_count); + return 0; + } + + np = cfg->of_node; + if (!np) { + dev_warn(cmh_dev(), "rh: no DT node -- IRQ disabled\n"); + return -ENODEV; + } + + for (i = 0; i < cfg->mbx_count; i++) { + int dt_idx = cfg->mailboxes[i].instance; + int virq = of_irq_get(np, dt_idx); + + if (virq <= 0) { + dev_warn(cmh_dev(), "rh: failed to map IRQ for MBX%u (DT index %d, rc=%d)\n", + i, dt_idx, virq); + return -ENODEV; + } + rh.irqs[i] = virq; + dev_dbg(cmh_dev(), "rh: MBX%u -> IRQ %d (DT index %d)\n", + i, virq, dt_idx); + } + + rh.nirqs = cfg->mbx_count; + return 0; +} + +/** + * cmh_rh_init() - Initialize the response handler + * @cfg: Device configuration (mailbox count, MMIO bases, IRQ info) + * + * Resolve per-mailbox IRQs from the device tree (or module parameter + * override), register threaded IRQ handlers (hardirq + kthread), and + * arm the missed-IRQ software watchdog timer. If no IRQs can be + * resolved, falls back to watchdog-only polling mode. + * + * Return: 0 on success, negative errno on failure.
+ */ +int cmh_rh_init(struct cmh_config *cfg) +{ + int ret; + u32 i; + + rh.cfg = cfg; + rh.nirqs = 0; + rh.active = false; + atomic_set(&rh.irq_count, 0); + + /* Allocate per-MBX tracking */ + rh.mbx = kcalloc(cfg->mbx_count, sizeof(*rh.mbx), GFP_KERNEL); + if (!rh.mbx) + return -ENOMEM; + + /* Resolve per-MBX IRQs */ + if (cmh_rh_resolve_irqs(cfg) < 0) { + /* + * No IRQs available. The watchdog timer provides + * a polling fallback: it reads MBX head registers + * periodically and processes completions. This is + * slower than IRQ-driven completion but functional. + * + * Completion latency in polling-only mode is bounded + * by the watchdog interval (default 200 ms, tunable + * via debugfs config/watchdog_ms). + */ + dev_warn(cmh_dev(), + "rh: no IRQs -- using watchdog polling (interval %u ms)\n", + watchdog_ms); + + /* Seed last_head from HW before first watchdog tick */ + for (i = 0; i < cfg->mbx_count; i++) + rh.mbx[i].last_head = + cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + + rh.active = true; + timer_setup(&rh_watchdog, cmh_rh_watchdog_fn, 0); + mod_timer(&rh_watchdog, jiffies + + msecs_to_jiffies(max(watchdog_ms, + CMH_RH_WATCHDOG_MS_MIN))); + return 0; + } + + /* Initialize per-MBX state: read current head positions */ + for (i = 0; i < cfg->mbx_count; i++) + rh.mbx[i].last_head = cmh_reg_read32(rh.cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + + /* + * Register threaded IRQ handlers. + * + * DT per-MBX path: one distinct virq per MBX, nirqs == mbx_count. + * DT single-IRQ path: one shared IRQ, nirqs == 1. The handler + * scans all mailboxes unconditionally, so a single registration + * suffices. + * + * Use IRQF_SHARED only for the single-IRQ path where one line + * is shared across all MBXes. Dedicated per-MBX virqs need no + * sharing flag. + */ + { + unsigned long irqflags = (rh.nirqs == 1 && cfg->mbx_count > 1) + ? 
IRQF_SHARED : 0; + + for (i = 0; i < rh.nirqs; i++) { + ret = request_threaded_irq(rh.irqs[i], + cmh_rh_hardirq, + cmh_rh_thread, + irqflags, + "cmh", cfg); + if (ret) { + dev_err(cmh_dev(), "rh: request_threaded_irq(%d) for MBX%u failed (rc=%d)\n", + rh.irqs[i], i, ret); + /* Unwind previously registered IRQs */ + while (i--) + free_irq(rh.irqs[i], cfg); + rh.nirqs = 0; + kfree(rh.mbx); + rh.mbx = NULL; + return ret; + } + } + } + + rh.active = true; + + /* Enable MBX completion interrupts (DONE + ERROR) */ + for (i = 0; i < cfg->mbx_count; i++) { + u32 stale; + + /* + * W1C any interrupt bits that accumulated between + * MQI setup and now (e.g. CMH eSW processing stale + * commands) before enabling the mask. + */ + stale = cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT); + if (stale) + cmh_reg_write32(stale, cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT); + + cmh_reg_write32(MBX_IRQ_MASK, + cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT_MASK); + } + + dev_info(cmh_dev(), "rh: initialized (%u IRQs, %u mailboxes, watchdog %u ms)\n", + rh.nirqs, cfg->mbx_count, watchdog_ms); + + /* Arm missed-IRQ watchdog timer */ + timer_setup(&rh_watchdog, cmh_rh_watchdog_fn, 0); + mod_timer(&rh_watchdog, jiffies + + msecs_to_jiffies(max(watchdog_ms, + CMH_RH_WATCHDOG_MS_MIN))); + + return 0; +} + +/** + * cmh_rh_suspend() - Suspend the response handler + * @cfg: Device configuration + * + * Stop the watchdog timer and mask mailbox interrupts at the hardware + * level. The IRQ handlers remain registered so that resume can + * re-enable them without re-requesting.
+ */ +void cmh_rh_suspend(struct cmh_config *cfg) +{ + u32 i; + + if (!rh.active) + return; + + /* Stop the watchdog before masking HW interrupts */ + timer_delete_sync(&rh_watchdog); + + /* Mask MBX interrupts at the hardware level */ + for (i =3D 0; i < cfg->mbx_count; i++) + cmh_reg_write32(0, cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT_MASK); + + /* + * Ensure no threaded IRQ handler is still in-flight. + * After masking, a handler may already have been scheduled. + * synchronize_irq() waits for it to complete before we + * proceed with suspend (which tears down TM state). + */ + for (i =3D 0; i < rh.nirqs; i++) + synchronize_irq(rh.irqs[i]); + + rh.active =3D false; + dev_dbg(cmh_dev(), "rh: suspended\n"); +} + +/** + * cmh_rh_resume() - Resume the response handler after suspend + * @cfg: Device configuration + * + * Re-synchronize per-mailbox head tracking with hardware, clear stale + * interrupt bits accumulated during the power transition, re-enable + * mailbox completion interrupts, and re-arm the watchdog timer. 
+ */ +void cmh_rh_resume(struct cmh_config *cfg) +{ + u32 i; + + if (!rh.mbx || !cfg) + return; + + /* Re-sync per-MBX head tracking with hardware */ + for (i =3D 0; i < cfg->mbx_count; i++) { + u32 stale; + + rh.mbx[i].last_head =3D + cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_QUEUE_HEAD); + + /* W1C any stale interrupt bits from the power transition *= / + stale =3D cmh_reg_read32(cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT); + if (stale) + cmh_reg_write32(stale, cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT); + + /* Re-enable MBX completion interrupts */ + cmh_reg_write32(MBX_IRQ_MASK, cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT_MASK); + } + + rh.active =3D true; + + /* Re-arm the watchdog */ + mod_timer(&rh_watchdog, jiffies + + msecs_to_jiffies(max(watchdog_ms, + CMH_RH_WATCHDOG_MS_MIN))); + dev_dbg(cmh_dev(), "rh: resumed\n"); +} + +/** + * cmh_rh_cleanup() - Clean up the response handler + * @cfg: Device configuration + * + * Stop the watchdog timer, mask mailbox interrupts at the hardware + * level, release all registered IRQ handlers, and free per-mailbox + * tracking state. Safe to call even if init was never completed. 
+ */ +void cmh_rh_cleanup(struct cmh_config *cfg) +{ + if (rh.active) { + u32 i; + + /* Cancel watchdog before disabling interrupts */ + timer_delete_sync(&rh_watchdog); + + /* Disable MBX interrupts before releasing handlers */ + for (i =3D 0; i < cfg->mbx_count; i++) + cmh_reg_write32(0, + cfg->mailboxes[i].reg_base, + R_MBX_INTERRUPT_MASK); + + /* Release all per-MBX IRQs */ + for (i =3D 0; i < rh.nirqs; i++) + free_irq(rh.irqs[i], cfg); + dev_dbg(cmh_dev(), "rh: %u IRQs released\n", rh.nirqs); + rh.nirqs =3D 0; + rh.active =3D false; + } + + dev_dbg(cmh_dev(), "rh: %u IRQs handled\n", + atomic_read(&rh.irq_count)); + + kfree(rh.mbx); + rh.mbx =3D NULL; + + dev_info(cmh_dev(), "rh: cleaned up\n"); +} + +/* -- debugfs timeout accessor ------------------------------------------ = */ + +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +/** + * cmh_rh_timeout_watchdog_ptr() - Return pointer to watchdog_ms for debug= fs + * + * Exposes the Response Handler watchdog timeout for runtime tuning + * via debugfs config/ directory. + * + * Return: pointer to the static watchdog_ms variable. + */ +unsigned int *cmh_rh_timeout_watchdog_ptr(void) { return &watchdog_ms; } +#endif diff --git a/drivers/crypto/cmh/cmh_sysfs.c b/drivers/crypto/cmh/cmh_sysfs.= c new file mode 100644 index 000000000000..ab482a222167 --- /dev/null +++ b/drivers/crypto/cmh/cmh_sysfs.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- sysfs Device Attributes + * + * Exposes hardware identity and status as read-only sysfs attributes + * under /sys/devices/platform/cmh/. Wired via .dev_groups in the + * platform_driver struct -- the driver core creates and removes these + * automatically around .probe() / .remove(). + * + * Because .dev_groups is used (not manual sysfs_create_group), the + * driver core guarantees that attributes are created after .probe() + * sets drvdata and removed before .remove() clears it. 
Therefore + * platform_get_drvdata() cannot return NULL in any show callback and + * no NULL check is needed. Same pattern as caam/ctrl.c and + * ccree/cc_sysfs.c. + */ + +#include +#include +#include + +#include "cmh.h" +#include "cmh_registers.h" +#include "cmh_sysfs.h" + +static ssize_t fw_version_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cmh_device *cmh =3D platform_get_drvdata(to_platform_device(= dev)); + struct cmh_config *cfg =3D &cmh->config; + + if (!cfg->sic_mapped) + return -ENODEV; + + return sysfs_emit(buf, "0x%08x\n", + cmh_reg_read32(cfg->sic_mapped, R_SIC_SW_VERSION)= ); +} +static DEVICE_ATTR_RO(fw_version); + +static ssize_t hw_version_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cmh_device *cmh =3D platform_get_drvdata(to_platform_device(= dev)); + struct cmh_config *cfg =3D &cmh->config; + + if (!cfg->sic_mapped) + return -ENODEV; + + return sysfs_emit(buf, "0x%08x\n", + cmh_reg_read32(cfg->sic_mapped, R_SIC_HW_VERSION0= )); +} +static DEVICE_ATTR_RO(hw_version); + +static ssize_t boot_status_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cmh_device *cmh =3D platform_get_drvdata(to_platform_device(= dev)); + struct cmh_config *cfg =3D &cmh->config; + + if (!cfg->sic_mapped) + return -ENODEV; + + return sysfs_emit(buf, "0x%08x\n", + cmh_reg_read32(cfg->sic_mapped, R_SIC_BOOT_STATUS= )); +} +static DEVICE_ATTR_RO(boot_status); + +static ssize_t mbx_available_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cmh_device *cmh =3D platform_get_drvdata(to_platform_device(= dev)); + struct cmh_config *cfg =3D &cmh->config; + + if (!cfg->sic_mapped) + return -ENODEV; + + return sysfs_emit(buf, "0x%08x\n", + cmh_reg_read32(cfg->sic_mapped, R_SIC_MBX_AVAILAB= ILITY)); +} +static DEVICE_ATTR_RO(mbx_available); + +static ssize_t mbx_count_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct 
cmh_device *cmh =3D platform_get_drvdata(to_platform_device(= dev)); + + return sysfs_emit(buf, "%u\n", cmh->config.mbx_count); +} +static DEVICE_ATTR_RO(mbx_count); + +static struct attribute *cmh_sysfs_attrs[] =3D { + &dev_attr_fw_version.attr, + &dev_attr_hw_version.attr, + &dev_attr_boot_status.attr, + &dev_attr_mbx_available.attr, + &dev_attr_mbx_count.attr, + NULL, +}; + +static const struct attribute_group cmh_sysfs_group =3D { + .attrs =3D cmh_sysfs_attrs, +}; + +const struct attribute_group *cmh_sysfs_groups[] =3D { + &cmh_sysfs_group, + NULL, +}; diff --git a/drivers/crypto/cmh/cmh_txn.c b/drivers/crypto/cmh/cmh_txn.c new file mode 100644 index 000000000000..3c696a8baac5 --- /dev/null +++ b/drivers/crypto/cmh/cmh_txn.c @@ -0,0 +1,1978 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Transaction Manager + * + * Dedicated kthread that dequeues command messages, builds VCQs in + * DMA queue slots, and rings the MBX doorbell. + * + * Command flow: + * 1. Caller posts command_msg via cmh_tm_post_command() + * 2. TM thread wakes, dequeues msg from CMQ + * 3. Selects mailbox (core-to-MBX affinity, or caller-pinned) + * 4. Copies pre-built VCQ entries into DMA slot(s) starting at tail + * 5. Creates transaction_obj, appends to per-MBX txn queue + * 6. Writes tail + num_vcqs -> R_MBX_QUEUE_TAIL (doorbell) + * + * The Response Handler (cmh_rh.c) walks per-MBX txn queues + * when an IRQ fires and the head advances, firing completion callbacks. + * + * Transaction state machine + * ------------------------- + * Each async transaction moves through the following states. DMA + * buffers remain mapped and owned by the HW until the COMPLETE state + * is reached -- only then are they safe to unmap/free. 
+ * + * QUEUED --[TM posts to HW]--> INFLIGHT + * (cmq) | | \ + * | | \--[timer fires]--> + * | | TIMED_OUT + * | | | + * | [HW completes / [HW completes / + * | RH pops txn] RH pops txn] + * | | | + * | v v + * | COMPLETE COMPLETE + * | (err=3DHW rc) (err=3D-ETIMEDOUT) + * | + * +--[pre-submit fail]--> freed (callback never fires) + * + * Note: QUEUED is the command_msg phase (sitting in the CMQ list, + * not yet a transaction_obj). The transaction_obj states tracked + * by atomic_cmpxchg are INFLIGHT, TIMED_OUT, and COMPLETE only. + * + * Completion callback context guarantee: + * The crypto_request_complete() callback is invoked from one of: + * - The RH threaded IRQ handler (process context, BH disabled) + * - The watchdog timer (softirq / timer context) + * - The TM kthread during queue drain/cleanup (process context) + * + * It is NEVER invoked from hardirq context. + * + * The watchdog path runs from timer softirq because it must recover + * missed IRQs without sleeping. This is crypto-API-compliant: + * crypto_request_complete() is documented safe from any context + * (including softirq). Callers must NOT assume process context in + * their completion callbacks -- all operations therein must be + * softirq-safe (no mutex, no GFP_KERNEL, no sleeping locks). + * + * For backlog promotion (-EINPROGRESS callbacks), the callback runs + * under the CMQ spinlock with IRQs disabled -- callers must handle + * this per the crypto API backlog contract. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cmh_txn.h" +#include "cmh_rh.h" +#include "cmh_registers.h" +#include "cmh_config.h" +#include "cmh_vcq.h" +#include "cmh_debugfs.h" +#include "cmh_dma.h" + +/* Module State */ + +static struct { + struct cmh_config *cfg; + struct task_struct *thread; + bool running; + + /* Command Message Queue (CMQ) */ + struct list_head cmq; + spinlock_t cmq_lock; /* protects cmq + backlog lists= */ + wait_queue_head_t cmq_waitq; + + /* Backlog queue for CRYPTO_TFM_REQ_MAY_BACKLOG requests */ + struct list_head backlog; + u32 backlog_depth; + + /* Per-mailbox transaction queues */ + struct cmh_mbx_txq *txqs; /* array[cfg->mbx_count] */ + + /* Round-robin mailbox selector */ + u32 next_mbx; +} tm; + +static unsigned int cmq_max_depth =3D 256; +module_param(cmq_max_depth, uint, 0444); +MODULE_PARM_DESC(cmq_max_depth, + "Max pending commands in the Command Message Queue (defaul= t: 256)"); + +static unsigned int backlog_max_depth =3D 1024; +module_param(backlog_max_depth, uint, 0444); +MODULE_PARM_DESC(backlog_max_depth, + "Max pending commands in the backlog queue (0 =3D disable = backlog, default: 1024)"); + +static unsigned int async_timeout_ms =3D 2000; + +#define CMH_TM_BACKOFF_MIN_US 100 /* queue-full backoff range (us) */ +#define CMH_TM_BACKOFF_MAX_US 500 +static unsigned int cmq_depth; /* current CMQ depth, protected by tm= .cmq_lock */ + +/* + * Monotonically increasing counter bumped by cmh_tm_txq_completion_notify= (). + * Used as a generation check in the queue-full backoff predicate so that + * wait_event_interruptible_timeout() returns immediately when a TXQ + * completion frees a slot, rather than sleeping for the full timeout. 
+ */ +static atomic_t txq_completion_gen; + +/* -- Debugfs stat helpers (avoid anonymous compound blocks) -------------= */ + +static void cmh_stat_inc_mbx_queue_full(u32 mbx_idx) +{ + struct cmh_mbx_stats *s =3D cmh_debugfs_mbx_stats(mbx_idx); + + if (s) + atomic64_inc(&s->queue_full_count); +} + +static void cmh_stat_record_vcq_submit(u32 mbx_idx, u32 num_vcqs, u32 dept= h) +{ + struct cmh_mbx_stats *s =3D cmh_debugfs_mbx_stats(mbx_idx); + + if (s) { + atomic64_add(num_vcqs, &s->vcqs_submitted); + cmh_stat_update_max(&s->max_queue_depth, (s64)depth); + } +} + +static void cmh_stat_inc_tm_backoff(void) +{ + struct cmh_tm_stats *s =3D cmh_debugfs_tm_stats(); + + if (s) + atomic64_inc(&s->backoff_count); +} + +static void cmh_stat_inc_cmq_eagain(void) +{ + struct cmh_tm_stats *s =3D cmh_debugfs_tm_stats(); + + if (s) + atomic64_inc(&s->cmq_eagain_count); +} + +static void cmh_stat_record_cmq_post(u32 depth) +{ + struct cmh_tm_stats *s =3D cmh_debugfs_tm_stats(); + + if (s) { + atomic64_inc(&s->cmq_posts); + cmh_stat_update_max(&s->cmq_depth_max, (s64)depth); + } +} + +static void cmh_stat_inc_async_timeout(void) +{ + struct cmh_tm_stats *s =3D cmh_debugfs_tm_stats(); + + if (s) + atomic64_inc(&s->async_timeout_count); +} + +/* + * Drop one reference on a command_msg; free when the last ref is dropped. + * Used by cmh_tm_submit_sync() to share msg ownership between the + * waiter (caller) and the TM subsystem (thread or cleanup drain). + */ +static void command_msg_put(struct command_msg *msg) +{ + if (refcount_dec_and_test(&msg->refs)) { + kfree(msg->vcq_data); + kfree(msg); + } +} + +/* + * Drop one reference on a transaction_obj; free when the last ref drops. + * Two references are held when the per-request timeout timer is armed: + * one for the TXQ owner (RH/cleanup), one for the timer callback. + * When no timer is armed, only the owner ref exists. 
+ */ +static void txn_put(struct transaction_obj *txn) +{ + if (refcount_dec_and_test(&txn->refs)) + kfree(txn); +} + +/* + * Per-request async timeout callback (runs in softirq / timer context). + * + * This function ONLY marks the transaction state as TIMED_OUT via + * atomic cmpxchg and drops the timer reference. It does NOT fire + * the completion callback, does NOT touch DMA buffers, and does NOT + * write any MBX registers. + * + * Rationale: the HW may still be writing to DMA buffers at this + * point. Unmapping or freeing them here would be a use-after-free. + * The actual -ETIMEDOUT completion fires later, from process + * context, when the RH threaded IRQ pops the transaction after the + * HW finishes (or after MBX abort/drain on rmmod/suspend). + * + * MBX_COMMAND_ABORT is NOT issued here. It is issued by the RH + * watchdog abort-stall detector under rh_process_lock, which + * serialises it against RESTART/FLUSH recovery commands. Writing + * ABORT from timer softirq without the lock caused a race where + * concurrent timeouts clobbered an in-progress RESTART, wedging + * the mailbox. + * + * Context: softirq (timer). Must not sleep. + */ +static void txn_timeout_fn(struct timer_list *t) +{ + struct transaction_obj *txn =3D timer_container_of(txn, t, timeout_= timer); + int old; + + old =3D atomic_cmpxchg(&txn->state, TXN_INFLIGHT, TXN_TIMED_OUT); + if (old =3D=3D TXN_INFLIGHT) { + dev_err_ratelimited(cmh_dev(), + "tm: async timeout vcq=3D%u..%u mbx=3D%= u cmd_id=3D0x%08x\n", + txn->first_vcq_id, + txn->last_vcq_id, txn->mailbox_idx, + txn->command_id); + cmh_stat_inc_async_timeout(); + } + + txn_put(txn); /* drop timer ref */ +} + +/** + * cmh_txn_finish() - Complete a popped transaction with FSM + timer clean= up + * @txn: Transaction object to complete + * @error: Error code from HW (0 on success) + * + * Three cases: + * 1. Normal: state INFLIGHT -> COMPLETE. Fire callback with HW error. + * 2. 
Timed out: state already TXN_TIMED_OUT (timer marked it). + * Fire callback with -ETIMEDOUT. DMA is now safe because the + * HW has finished and HEAD has advanced past this VCQ. + * 3. Force-cancel (drain/quiesce): handled by caller, not here. + */ +void cmh_txn_finish(struct transaction_obj *txn, int error) +{ + int old; + + old =3D atomic_cmpxchg(&txn->state, TXN_INFLIGHT, TXN_COMPLETE); + + /* Dequeue the timer if still pending; drop timer ref if we did */ + if (timer_delete(&txn->timeout_timer)) + txn_put(txn); + + if (old =3D=3D TXN_INFLIGHT) { + /* HW completion (may carry error) */ + if (txn->complete) + txn->complete(txn->completion_data, error); + } else if (old =3D=3D TXN_TIMED_OUT) { + /* Timer won earlier; now HW is done -- deliver -ETIMEDOUT = */ + if (txn->complete) + txn->complete(txn->completion_data, -ETIMEDOUT); + } + + txn_put(txn); /* drop owner ref */ +} + +/* Mailbox Slot Addressing */ + +/* + * Return a kernel-virtual pointer to the VCQ slot for the given vcqid. + * Mirrors CMH eSW's mbx_queue_addr() but uses the kernel virt_addr. + */ +static void *mbx_slot_ptr(struct cmh_mbx_config *mbx, u32 vcqid) +{ + u32 slot_mask =3D (1U << mbx->slots_log2) - 1U; + u32 slot_offset =3D (vcqid & slot_mask) << mbx->stride_log2; + + return (u8 *)mbx->virt_addr + slot_offset; +} + +/* + * Return the number of free slots in a mailbox queue. + */ +static u32 mbx_free_slots(struct cmh_mbx_config *mbx) +{ + u32 head =3D cmh_reg_read32(mbx->reg_base, R_MBX_QUEUE_HEAD); + u32 tail =3D cmh_reg_read32(mbx->reg_base, R_MBX_QUEUE_TAIL); + u32 size =3D 1U << mbx->slots_log2; + + return size - (u32)(tail - head); +} + +/** + * cmh_tm_max_cmds_per_vcq() - Return max commands per VCQ slot + * + * Scans all mailbox configurations and returns the minimum number of + * VCQ command entries that fit in a single slot, clamped to the + * MIN_VCQ_CMDS..MAX_VCQ_CMDS range. + * + * Return: Maximum usable VCQ command count per slot. 
+ */ +u32 cmh_tm_max_cmds_per_vcq(void) +{ + u32 i, min_cmds =3D MAX_VCQ_CMDS; + + for (i =3D 0; i < tm.cfg->mbx_count; i++) { + u32 stride =3D 1U << tm.cfg->mailboxes[i].stride_log2; + u32 cmds =3D stride / (u32)sizeof(struct vcq_cmd); + + if (cmds < min_cmds) + min_cmds =3D cmds; + } + + if (min_cmds < MIN_VCQ_CMDS) + min_cmds =3D MIN_VCQ_CMDS; + + return min_cmds; +} + +/** + * cmh_tm_mbx_count() - Return the number of configured mailboxes + * + * Return: Number of mailboxes in the current configuration. + */ +u32 cmh_tm_mbx_count(void) +{ + return tm.cfg->mbx_count; +} + +/* Core-to-MBX Affinity -- Config-Driven Multi-Instance Support */ + +/* + * Per-core-type configuration table. Each entry holds one or more + * (core_id, mbx_idx) instances. Defaults: single instance per core + * type with the standard CORE_ID_* and MBX auto-assigned on first use + * (mbx_idx =3D -1). Module params can override for explicit assignment + * and multi-instance support. + * + * Round-robin across instances for each new crypto operation. + */ + +struct core_instance_info { + u32 core_id; /* VCQ dispatch core_id */ + /* + * Assigned MBX index, or -1 (sentinel) for auto-assign on first + * use. Uses atomic_t for a lockless once-only latch: the first + * caller does atomic_cmpxchg(&mbx_idx, -1, new_mbx); all later + * callers see the winning value via atomic_read(). 
+ */ + atomic_t mbx_idx; +}; + +struct core_type_info { + u32 num_instances; + struct core_instance_info instances[CMH_MAX_CORE_INSTANCES]; + atomic_t next_instance; /* round-robin counter */ +}; + +static struct core_type_info core_types[CMH_NUM_CORE_TYPES] =3D { + [CMH_CORE_HC] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_HC, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_AES] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_AES, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_SM4] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_SM4, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_SM3] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_SM3, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_CCP] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_CCP, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_PKE] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_PKE, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_QSE] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_QSE, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, + [CMH_CORE_HCQ] =3D { .num_instances =3D 1, + .instances =3D { { .core_id =3D CORE_ID_HCQ, + .mbx_idx =3D ATOMIC_INIT(-1) } } }, +}; + +/* Round-robin counter for auto-assigning MBXes to core instances */ +static atomic_t affinity_next_mbx =3D ATOMIC_INIT(0); + +/** + * cmh_tm_affinity_reset() - Reset core-to-MBX affinity state + * + * Clears all auto-assigned MBX bindings and resets round-robin + * counters for both the global MBX allocator and per-core-type + * instance selectors. 
+ */ +void cmh_tm_affinity_reset(void) +{ + u32 i, j; + + atomic_set(&affinity_next_mbx, 0); + + /* Reset multi-instance table */ + for (i =3D 0; i < CMH_NUM_CORE_TYPES; i++) { + struct core_type_info *ct =3D &core_types[i]; + + atomic_set(&ct->next_instance, 0); + for (j =3D 0; j < ct->num_instances; j++) + atomic_set(&ct->instances[j].mbx_idx, -1); + } +} + +/** + * cmh_core_default_id() - Return default core_id for a core type + * @type: Core type selector + * + * Returns the first-instance core_id for @type without advancing the + * round-robin counter. Used by callers pinned to a fixed MBX (e.g. + * mgmt ioctls on MGMT_MBX) that only need the VCQ core_id field. + * + * Return: VCQ core_id value for the default instance of @type. + */ +u32 cmh_core_default_id(enum cmh_core_type type) +{ + if (WARN_ON_ONCE(type >=3D CMH_NUM_CORE_TYPES)) + return 0; + + return core_types[type].instances[0].core_id; +} + +/** + * cmh_core_select_instance() - Select a core instance via round-robin + * @type: Core type selector + * + * Round-robin across configured instances, each permanently pinned to + * its MBX (auto-assigned on first use if mbx_idx was -1). + * + * Uses atomic_inc_return (pre-increment), so the very first call for a + * given type returns instance[1 % N]. The rotation merely starts at + * index 1 instead of 0; every instance is still visited once per + * cycle, so the load stays balanced. + * + * The (u32) cast before the modulo ensures correct behaviour across + * the INT_MAX -> INT_MIN wraparound of atomic_t: (u32)INT_MIN =3D + * 0x80000000, and 0x80000000 % N still yields a valid index. + * + * Return: A core_dispatch with (core_id, mbx_idx) for the selected + * instance. 
+ */ +struct core_dispatch cmh_core_select_instance(enum cmh_core_type type) +{ + struct core_type_info *ct; + struct core_instance_info *inst; + struct core_dispatch d; + u32 idx, count; + s32 mbx, new_mbx, old; + + if (WARN_ON_ONCE(type >=3D CMH_NUM_CORE_TYPES)) + return (struct core_dispatch){ .core_id =3D 0, .mbx_idx =3D= -1 }; + + ct =3D &core_types[type]; + idx =3D (u32)atomic_inc_return(&ct->next_instance) % ct->num_instan= ces; + inst =3D &ct->instances[idx]; + + d.core_id =3D inst->core_id; + + mbx =3D atomic_read(&inst->mbx_idx); + if (mbx >=3D 0) { + d.mbx_idx =3D mbx; + return d; + } + + /* Auto-assign on first use */ + count =3D tm.cfg->mbx_count; + new_mbx =3D (s32)((u32)atomic_inc_return(&affinity_next_mbx) % coun= t); + old =3D atomic_cmpxchg(&inst->mbx_idx, -1, new_mbx); + + if (old >=3D 0) { + d.mbx_idx =3D old; + } else { + d.mbx_idx =3D new_mbx; + dev_info(cmh_dev(), + "tm: core 0x%02x -> mbx %d (auto)\n", + inst->core_id, new_mbx); + } + + return d; +} + +/** + * cmh_core_num_instances() - Return instance count for a core type + * @type: Core type selector + * + * Return: Number of configured instances for @type. + */ +u32 cmh_core_num_instances(enum cmh_core_type type) +{ + if (WARN_ON_ONCE(type >=3D CMH_NUM_CORE_TYPES)) + return 1; + + return core_types[type].num_instances; +} + +/** + * cmh_core_get_instance() - Get dispatch info for a specific instance + * @type: Core type selector + * @idx: Instance index within @type + * + * Returns (core_id, mbx_idx) for a specific instance by index, + * without advancing the round-robin counter. Triggers MBX auto-assign + * on first use if the instance has no MBX yet. + * + * Return: A core_dispatch with (core_id, mbx_idx) for instance @idx. 
+ */ +struct core_dispatch cmh_core_get_instance(enum cmh_core_type type, u32 id= x) +{ + struct core_type_info *ct; + struct core_instance_info *inst; + struct core_dispatch d; + u32 count; + s32 mbx, new_mbx, old; + + if (WARN_ON_ONCE(type >=3D CMH_NUM_CORE_TYPES)) + return (struct core_dispatch){ .core_id =3D 0, .mbx_idx =3D= -1 }; + + ct =3D &core_types[type]; + if (WARN_ON_ONCE(idx >=3D ct->num_instances)) + return (struct core_dispatch){ .core_id =3D 0, .mbx_idx =3D= -1 }; + + inst =3D &ct->instances[idx]; + d.core_id =3D inst->core_id; + + mbx =3D atomic_read(&inst->mbx_idx); + if (mbx >=3D 0) { + d.mbx_idx =3D mbx; + return d; + } + + /* Auto-assign on first use */ + count =3D tm.cfg->mbx_count; + new_mbx =3D (s32)((u32)atomic_inc_return(&affinity_next_mbx) % coun= t); + old =3D atomic_cmpxchg(&inst->mbx_idx, -1, new_mbx); + + if (old >=3D 0) { + d.mbx_idx =3D old; + } else { + d.mbx_idx =3D new_mbx; + dev_info(cmh_dev(), + "tm: core 0x%02x -> mbx %d (auto)\n", + inst->core_id, new_mbx); + } + + return d; +} + +/** + * cmh_tm_txq_completion_notify() - Wake TM thread after RH completion + * + * Wakes the TM thread after the Response Handler completes a + * transaction. This unblocks the TM if it is waiting for a free MBX + * slot. The generation counter bump ensures the wait_event predicate + * evaluates to true on the next check. + */ +void cmh_tm_txq_completion_notify(void) +{ + atomic_inc(&txq_completion_gen); + wake_up_interruptible(&tm.cmq_waitq); +} + +/* Mailbox Selection */ + +/* + * Select a mailbox with at least @slots_needed free slots (round-robin). + * Returns mailbox index, or -EAGAIN if no mailbox qualifies. + * + * Note: the free-slot check here is advisory -- actual slot availability + * is enforced by the ring arithmetic under dispatch_lock in submit_vcq(). + * A TOCTOU gap exists between this check and the subsequent slot write, + * but it is safe: the worst case is a spurious -EAGAIN / backoff, never + * a ring overcommit. 
+ */ +static int select_mailbox(u32 slots_needed) +{ + u32 count =3D tm.cfg->mbx_count; + u32 start =3D tm.next_mbx; + u32 i; + + for (i =3D 0; i < count; i++) { + u32 idx =3D (start + i) % count; + + if (cmh_rh_mbx_is_wedged(idx)) + continue; + + if (mbx_free_slots(&tm.cfg->mailboxes[idx]) >=3D slots_need= ed) { + tm.next_mbx =3D (idx + 1) % count; + return (int)idx; + } + cmh_stat_inc_mbx_queue_full(idx); + } + + return -EAGAIN; +} + +/* + * Resolve the target mailbox for a command message. + * + * A message pinned to a valid MBX uses only that MBX and never falls + * back: a wedged or full pinned MBX yields -EAGAIN. Unpinned messages + * use round-robin selection. Returns mailbox index, or -EAGAIN. + */ +static int resolve_mbx(struct command_msg *msg) +{ + u32 slots =3D msg->num_vcqs > 0 ? msg->num_vcqs : 1; + + if (msg->target_mbx >=3D 0 && + (u32)msg->target_mbx < tm.cfg->mbx_count) { + if (cmh_rh_mbx_is_wedged((u32)msg->target_mbx)) + return -EAGAIN; + if (mbx_free_slots(&tm.cfg->mailboxes[msg->target_mbx]) >= =3D + slots) + return msg->target_mbx; + return -EAGAIN; /* pinned MBX full, retry */ + } + + return select_mailbox(slots); +} + +/* VCQ Submission */ + +/* + * Write VCQ(s) into consecutive DMA slots and ring the doorbell. + * + * A command_msg may carry one or more VCQs (num_vcqs field). For a + * multi-VCQ message the flat vcq_data array contains N VCQs laid out + * contiguously, each starting with its own header whose cmds field + * gives that VCQ's entry count. All VCQs are written to consecutive + * MBX slots and tracked by a single transaction_obj. + * + * Returns 0 on success, negative errno on failure. + */ +static int submit_vcq(struct command_msg *msg, u32 mbx_idx) +{ + struct cmh_mbx_config *mbx =3D &tm.cfg->mailboxes[mbx_idx]; + struct cmh_mbx_txq *txq =3D &tm.txqs[mbx_idx]; + struct transaction_obj *txn; + const struct vcq_cmd *cmds =3D msg->vcq_data; + u32 num_vcqs =3D msg->num_vcqs > 0 ? 
msg->num_vcqs : 1; + u32 tail, stride_bytes, offset =3D 0; + unsigned long flags; + u32 v; + + mutex_lock(&txq->dispatch_lock); + + /* Read current tail (first VCQ ID) */ + tail =3D cmh_reg_read32(mbx->reg_base, R_MBX_QUEUE_TAIL); + stride_bytes =3D 1U << mbx->stride_log2; + + /* Allocate transaction tracking object */ + txn =3D kzalloc_obj(*txn, GFP_KERNEL); + if (!txn) { + mutex_unlock(&txq->dispatch_lock); + return -ENOMEM; + } + + /* Write each VCQ into a consecutive DMA slot */ + for (v =3D 0; v < num_vcqs; v++) { + u32 vcq_cmds, copy_size; + void *slot; + + /* + * For single-VCQ messages (backward compat) use the + * msg-level vcq_count. For multi-VCQ, parse the per-VCQ + * header to find each VCQ's command count. + */ + if (num_vcqs =3D=3D 1) { + vcq_cmds =3D msg->vcq_count; + } else { + const struct vcq_hdr *hdr =3D + (const struct vcq_hdr *)&cmds[offset].hwc; + vcq_cmds =3D hdr->cmds; + } + + copy_size =3D vcq_cmds * sizeof(struct vcq_cmd); + if (copy_size > stride_bytes) { + dev_err(cmh_dev(), "tm: VCQ %u too large (%u bytes = > stride %u)\n", + v, copy_size, stride_bytes); + mutex_unlock(&txq->dispatch_lock); + kfree(txn); + return -EMSGSIZE; + } + + if (vcq_cmds < MIN_VCQ_CMDS || vcq_cmds > MAX_VCQ_CMDS) { + dev_err(cmh_dev(), "tm: invalid vcq_count %u (range= %u..%u)\n", + vcq_cmds, MIN_VCQ_CMDS, MAX_VCQ_CMDS); + mutex_unlock(&txq->dispatch_lock); + kfree(txn); + return -EINVAL; + } + + /* Copy pre-built VCQ into DMA slot */ + slot =3D mbx_slot_ptr(mbx, tail + v); + cmh_dma_write(slot, &cmds[offset], copy_size); + + /* Zero remaining slot bytes to avoid stale data */ + if (copy_size < stride_bytes) + cmh_dma_zero((u8 *)slot + copy_size, + stride_bytes - copy_size); + + offset +=3D vcq_cmds; + } + + /* Ensure VCQ data is visible in memory before advancing tail */ + wmb(); + /* FPGA: confirm DRAM accepted writes before SIC doorbell (cross-sl= ave) */ + cmh_dma_fence(mbx_slot_ptr(mbx, tail + num_vcqs - 1)); + + /* Fill in transaction spanning all VCQs */ 
+ txn->first_vcq_id =3D tail; + txn->last_vcq_id =3D tail + num_vcqs - 1; + txn->mailbox_idx =3D mbx_idx; + txn->command_id =3D msg->command_id; + txn->error_code =3D 0; + txn->complete =3D msg->complete; + txn->completion_data =3D msg->completion_data; + atomic_set(&txn->state, TXN_INFLIGHT); + timer_setup(&txn->timeout_timer, txn_timeout_fn, 0); + INIT_LIST_HEAD(&txn->list); + + /* + * Set refcount: 2 if a per-txn timer will be armed (one ref for + * the TXQ owner that pops it, one for the timer callback), or 1 + * if no timer (sync paths, or async_timeout_ms =3D=3D 0). + */ + if (msg->timeout_jiffies) + refcount_set(&txn->refs, 2); + else + refcount_set(&txn->refs, 1); + + /* Enqueue transaction under spinlock */ + spin_lock_irqsave(&txq->lock, flags); + list_add_tail(&txn->list, &txq->head); + txq->depth++; + spin_unlock_irqrestore(&txq->lock, flags); + + /* Ring doorbell: advance tail by number of VCQs submitted */ + cmh_reg_write32(tail + num_vcqs, mbx->reg_base, R_MBX_QUEUE_TAIL); + + /* Arm per-request timeout after doorbell (async only) */ + if (msg->timeout_jiffies) + mod_timer(&txn->timeout_timer, + jiffies + msg->timeout_jiffies); + + mutex_unlock(&txq->dispatch_lock); + + cmh_stat_record_vcq_submit(mbx_idx, num_vcqs, txq->depth); + + dev_dbg(cmh_dev(), "tm: submitted %u vcq(s) id=3D%u..%u to mbx[%u] = tail_now=3D%u\n", + num_vcqs, tail, tail + num_vcqs - 1, mbx_idx, + tail + num_vcqs); + + return 0; +} + +/* TM Thread */ + +static int cmh_tm_thread(void *data) +{ + struct command_msg *msg; + unsigned long flags; + int mbx_idx, ret; + + dev_info(cmh_dev(), "tm: thread started\n"); + + while (!kthread_should_stop()) { + /* Wait for work or stop signal */ + wait_event_interruptible(tm.cmq_waitq, + !list_empty(&tm.cmq) || kthread_sh= ould_stop()); + + if (kthread_should_stop()) + break; + + /* Dequeue one command message */ + spin_lock_irqsave(&tm.cmq_lock, flags); + if (list_empty(&tm.cmq)) { + spin_unlock_irqrestore(&tm.cmq_lock, flags); + continue; + } 
+ msg =3D list_first_entry(&tm.cmq, struct command_msg, list)= ; + list_del_init(&msg->list); + cmq_depth--; + + /* + * Promote one backlogged request into the CMQ now that + * there is room. Notify the crypto consumer with + * -EINPROGRESS so it knows the request has left backlog. + */ + if (!list_empty(&tm.backlog)) { + struct command_msg *bl; + + bl =3D list_first_entry(&tm.backlog, + struct command_msg, list); + list_move_tail(&bl->list, &tm.cmq); + tm.backlog_depth--; + cmq_depth++; + cmh_stat_record_cmq_post(cmq_depth); + /* + * Signal -EINPROGRESS while still under cmq_lock + * so the consumer sees it before the final + * completion. The callback must be IRQ-safe + * (required by the async contract anyway). + */ + if (bl->complete) + bl->complete(bl->completion_data, + -EINPROGRESS); + } + + spin_unlock_irqrestore(&tm.cmq_lock, flags); + + /* Select a mailbox: pinned or round-robin */ + mbx_idx =3D resolve_mbx(msg); + + if (mbx_idx < 0) { + /* + * Queue full -- re-enqueue at front and wait. + * + * Sleep on cmq_waitq with a short timeout. The RH + * calls cmh_tm_txq_completion_notify() after each + * completed transaction, which bumps the generatio= n + * counter and wakes us immediately. The timeout i= s + * a safety net for missed wakeups. 
+ */ + int gen = atomic_read(&txq_completion_gen); + unsigned long tmo; + + spin_lock_irqsave(&tm.cmq_lock, flags); + list_add(&msg->list, &tm.cmq); + cmq_depth++; + spin_unlock_irqrestore(&tm.cmq_lock, flags); + + tmo = usecs_to_jiffies(CMH_TM_BACKOFF_MAX_US); + wait_event_interruptible_timeout(tm.cmq_waitq, + kthread_should_stop() || + atomic_read(&txq_completion_gen) != gen, + tmo ?: 1); + cmh_stat_inc_tm_backoff(); + continue; + } + + /* Submit VCQ to selected mailbox */ + WRITE_ONCE(msg->actual_mbx, mbx_idx); + ret = submit_vcq(msg, mbx_idx); + if (ret && msg->complete) + msg->complete(msg->completion_data, ret); + command_msg_put(msg); + } + + dev_info(cmh_dev(), "tm: thread stopped\n"); + return 0; +} + +/* Public Interface */ + +/** + * cmh_tm_init() - Initialize the Transaction Manager subsystem + * @cfg: Hardware configuration describing mailboxes and core types + * + * Allocates per-mailbox transaction queues, applies core-type + * configuration, and starts the TM kthread. + * + * Return: 0 on success, negative errno on failure. 
+ */ +int cmh_tm_init(struct cmh_config *cfg) +{ + u32 i, j; + + if (cmq_max_depth == 0) { + dev_warn(cmh_dev(), + "tm: cmq_max_depth=0 invalid, clamping to 1\n"); + cmq_max_depth = 1; + } + + tm.cfg = cfg; + tm.next_mbx = 0; + cmq_depth = 0; + + cmh_tm_affinity_reset(); + + /* Apply per-core-type config from DT child nodes */ + for (i = 0; i < CMH_NUM_CORE_TYPES; i++) { + struct cmh_core_type_cfg *src = &cfg->core_types[i]; + struct core_type_info *ct = &core_types[i]; + + ct->num_instances = src->num_instances; + for (j = 0; j < src->num_instances; j++) { + ct->instances[j].core_id = src->core_ids[j]; + if (src->mbx[j] >= 0) + atomic_set(&ct->instances[j].mbx_idx, + src->mbx[j]); + } + } + + /* Initialize CMQ and backlog */ + INIT_LIST_HEAD(&tm.cmq); + INIT_LIST_HEAD(&tm.backlog); + tm.backlog_depth = 0; + spin_lock_init(&tm.cmq_lock); + init_waitqueue_head(&tm.cmq_waitq); + + /* Allocate per-mailbox transaction queues */ + tm.txqs = kcalloc(cfg->mbx_count, sizeof(*tm.txqs), GFP_KERNEL); + if (!tm.txqs) + return -ENOMEM; + + for (i = 0; i < cfg->mbx_count; i++) { + INIT_LIST_HEAD(&tm.txqs[i].head); + spin_lock_init(&tm.txqs[i].lock); + mutex_init(&tm.txqs[i].dispatch_lock); + tm.txqs[i].depth = 0; + } + + /* Start TM thread */ + tm.thread = kthread_run(cmh_tm_thread, NULL, "cmh_tm"); + if (IS_ERR(tm.thread)) { + int ret = PTR_ERR(tm.thread); + + dev_err(cmh_dev(), "tm: failed to start thread (rc=%d)\n", ret); + tm.thread = NULL; + kfree(tm.txqs); + tm.txqs = NULL; + return ret; + } + + WRITE_ONCE(tm.running, true); + dev_info(cmh_dev(), + "tm: initialized (%u mailboxes, cmq_depth=%u backlog=%u)\n", + cfg->mbx_count, cmq_max_depth, backlog_max_depth); + + return 0; +} + +/* + * cmh_tm_stop_and_drain_cmq() - Stop TM thread and drain CMQ/backlog + * + * Shared preamble for cmh_tm_cleanup() and cmh_tm_quiesce(): stops the + * kthread, marks the TM as not running, then splices the CMQ and backlog + * to local lists 
and cancels every pending command_msg outside the lock. + */ +static void cmh_tm_stop_and_drain_cmq(void) +{ + struct command_msg *msg, *tmp_msg; + unsigned long flags; + LIST_HEAD(cmq_drain); + LIST_HEAD(backlog_drain); + + if (tm.thread) { + kthread_stop(tm.thread); + tm.thread = NULL; + } + WRITE_ONCE(tm.running, false); + + spin_lock_irqsave(&tm.cmq_lock, flags); + list_splice_init(&tm.cmq, &cmq_drain); + cmq_depth = 0; + list_splice_init(&tm.backlog, &backlog_drain); + tm.backlog_depth = 0; + spin_unlock_irqrestore(&tm.cmq_lock, flags); + + list_for_each_entry_safe(msg, tmp_msg, &cmq_drain, list) { + list_del(&msg->list); + if (msg->complete) + msg->complete(msg->completion_data, -ECANCELED); + command_msg_put(msg); + } + list_for_each_entry_safe(msg, tmp_msg, &backlog_drain, list) { + list_del(&msg->list); + if (msg->complete) + msg->complete(msg->completion_data, -ECANCELED); + command_msg_put(msg); + } +} + +/** + * cmh_tm_cleanup() - Tear down the Transaction Manager subsystem + * + * Stops the TM kthread, drains the CMQ, backlog, and all per-mailbox + * transaction queues, notifying waiters with -ECANCELED or -ETIMEDOUT. + * Frees all TM-owned resources. 
+ */ +void cmh_tm_cleanup(void) +{ + struct transaction_obj *txn, *tmp_txn; + unsigned long flags; + u32 i; + + cmh_tm_stop_and_drain_cmq(); + + /* Drain per-mailbox transaction queues */ + if (tm.txqs) { + for (i = 0; i < tm.cfg->mbx_count; i++) { + LIST_HEAD(drain); + int old; + + spin_lock_irqsave(&tm.txqs[i].lock, flags); + list_splice_init(&tm.txqs[i].head, &drain); + tm.txqs[i].depth = 0; + spin_unlock_irqrestore(&tm.txqs[i].lock, flags); + + list_for_each_entry_safe(txn, tmp_txn, &drain, list) { + list_del(&txn->list); + + if (timer_delete_sync(&txn->timeout_timer)) + txn_put(txn); + + old = atomic_cmpxchg(&txn->state, + TXN_INFLIGHT, + TXN_COMPLETE); + if (txn->complete) { + if (old == TXN_INFLIGHT) + txn->complete(txn->completion_data, + -ECANCELED); + else if (old == TXN_TIMED_OUT) + txn->complete(txn->completion_data, + -ETIMEDOUT); + } + + txn_put(txn); + } + } + kfree(tm.txqs); + tm.txqs = NULL; + } + + dev_info(cmh_dev(), "tm: cleaned up\n"); +} + +/* + * Default drain timeout for suspend/quiesce (milliseconds). + * Covers all symmetric + PKE operations. PQC callers (SLH-DSA sign + * at up to 120 s) should complete before system suspend is requested. + */ +static unsigned int drain_timeout_ms = 10000; + +/** + * cmh_tm_quiesce() - Quiesce the TM for suspend or shutdown + * + * Stops the TM kthread, drains the CMQ and backlog, then waits up to + * drain_timeout_ms for in-flight transactions to complete via the + * Response Handler. Any remaining transactions after the deadline + * are force-cancelled. 
+ */ +void cmh_tm_quiesce(void) +{ + struct transaction_obj *txn, *tmp_txn; + unsigned long deadline; + unsigned long flags; + u32 i; + bool drained = true; + + cmh_tm_stop_and_drain_cmq(); + + /* Wait for in-flight TXQ transactions to complete via RH */ + if (!tm.txqs) + goto out; + + deadline = jiffies + msecs_to_jiffies(drain_timeout_ms); + do { + drained = true; + for (i = 0; i < tm.cfg->mbx_count; i++) { + if (READ_ONCE(tm.txqs[i].depth)) { + drained = false; + break; + } + } + if (drained) + break; + usleep_range(1000, 2000); + } while (time_before(jiffies, deadline)); + + if (!drained) { + dev_warn(cmh_dev(), + "tm: quiesce drain timeout (%u ms), cancelling remaining transactions\n", + drain_timeout_ms); + for (i = 0; i < tm.cfg->mbx_count; i++) { + LIST_HEAD(drain); + int old; + + spin_lock_irqsave(&tm.txqs[i].lock, flags); + list_splice_init(&tm.txqs[i].head, &drain); + tm.txqs[i].depth = 0; + spin_unlock_irqrestore(&tm.txqs[i].lock, flags); + + list_for_each_entry_safe(txn, tmp_txn, &drain, list) { + list_del(&txn->list); + + if (timer_delete_sync(&txn->timeout_timer)) + txn_put(txn); + + old = atomic_cmpxchg(&txn->state, + TXN_INFLIGHT, + TXN_COMPLETE); + if (txn->complete) { + if (old == TXN_INFLIGHT) + txn->complete(txn->completion_data, + -ECANCELED); + else if (old == TXN_TIMED_OUT) + txn->complete(txn->completion_data, + -ETIMEDOUT); + } + + txn_put(txn); + } + } + } + +out: + dev_info(cmh_dev(), "tm: quiesced%s\n", + drained ? "" : " (forced)"); +} + +/** + * cmh_tm_resume() - Resume the TM after suspend + * + * Restarts the TM kthread after a prior cmh_tm_quiesce(). + * + * Return: 0 on success, negative errno if kthread creation fails. 
+ */ +int cmh_tm_resume(void) +{ + if (tm.thread || !tm.cfg) + return 0; + + tm.thread = kthread_run(cmh_tm_thread, NULL, "cmh_tm"); + if (IS_ERR(tm.thread)) { + int ret = PTR_ERR(tm.thread); + + dev_err(cmh_dev(), "tm: resume kthread_run failed (%d)\n", + ret); + tm.thread = NULL; + return ret; + } + WRITE_ONCE(tm.running, true); + dev_info(cmh_dev(), "tm: resumed\n"); + return 0; +} + +/** + * cmh_tm_try_cancel_command() - Cancel a queued command message + * @msg: Command message to cancel + * + * Attempts to remove @msg from the CMQ before the TM thread dequeues + * it. Must be called while @msg is still valid (before the caller's + * stack frame that owns it is freed). + * + * Return: true if @msg was removed, false if already consumed by TM. + */ +bool cmh_tm_try_cancel_command(struct command_msg *msg) +{ + unsigned long flags; + bool cancelled = false; + + spin_lock_irqsave(&tm.cmq_lock, flags); + if (!list_empty(&msg->list)) { + list_del_init(&msg->list); + cmq_depth--; + cancelled = true; + } + spin_unlock_irqrestore(&tm.cmq_lock, flags); + + return cancelled; +} + +/** + * cmh_tm_post_command() - Post a command message to the CMQ + * @msg: Pre-built command message to enqueue + * + * Enqueues @msg on the Command Message Queue and wakes the TM thread. + * If the CMQ is full, the message may be placed on the backlog queue + * (returning -EBUSY) if @msg->backlog_ok is set, or rejected with + * -EAGAIN. + * + * Return: 0 on success, -EBUSY if backlogged, -EAGAIN if full, + * -ENODEV if TM is not running. 
+ */ +int cmh_tm_post_command(struct command_msg *msg) +{ + unsigned long flags; + + if (!READ_ONCE(tm.running)) + return -ENODEV; + + spin_lock_irqsave(&tm.cmq_lock, flags); + if (cmq_depth >= cmq_max_depth) { + if (msg->backlog_ok && + tm.backlog_depth < backlog_max_depth) { + list_add_tail(&msg->list, &tm.backlog); + tm.backlog_depth++; + spin_unlock_irqrestore(&tm.cmq_lock, flags); + return -EBUSY; + } + spin_unlock_irqrestore(&tm.cmq_lock, flags); + cmh_stat_inc_cmq_eagain(); + return -EAGAIN; + } + INIT_LIST_HEAD(&msg->list); + list_add_tail(&msg->list, &tm.cmq); + cmq_depth++; + cmh_stat_record_cmq_post(cmq_depth); + spin_unlock_irqrestore(&tm.cmq_lock, flags); + + wake_up_interruptible(&tm.cmq_waitq); + return 0; +} + +/* Synchronous Submit (refcounted completion + timeout) */ + +/* + * Heap-allocated sync context with refcounting. + * + * The completion callback may fire after the waiter has timed out and + * returned (e.g. during cmh_tm_cleanup on rmmod). If the struct lived + * on the waiter's stack, the callback would touch freed memory -- + * triggering a "BUG: spinlock bad magic" on the completion's spinlock. + * + * Two references are held: one by the waiter, one by the callback. + * Whichever runs last frees the struct. + */ +struct cmh_sync_ctx { + struct completion done; + int error; + refcount_t refs; /* 2: waiter + callback */ + + /* Optional orphan cleanup -- called when the last ref drops after + * the waiter abandoned an in-flight VCQ (noabort path). Lets the + * caller defer DMA-buffer cleanup until the eSW finishes writing. 
+ */ + void (*orphan_cb)(void *data); + void *orphan_data; +}; + +static void cmh_sync_ctx_put(struct cmh_sync_ctx *ctx) +{ + if (refcount_dec_and_test(&ctx->refs)) { + if (ctx->orphan_cb) + ctx->orphan_cb(ctx->orphan_data); + kfree(ctx); + } +} + +static void cmh_sync_complete(void *data, int error) +{ + struct cmh_sync_ctx *ctx = data; + + ctx->error = error; + complete(&ctx->done); + cmh_sync_ctx_put(ctx); +} + +/* + * Default VCQ completion timeout (milliseconds), tunable via debugfs + * config/vcq_timeout_ms. Only affects the default timeout used by cmh_tm_submit_sync() + * and cmh_tm_submit_sync_mbx(); callers that pass an explicit timeout_hz + * (e.g. RSA keygen) are not affected. + */ +static unsigned int vcq_timeout_ms = 2000; + +/* + * Extended timeout for slow crypto operations: RSA keygen, PQC + * keygen/sign/verify. Tunable via debugfs config/slow_op_timeout_ms. + */ +static unsigned int slow_op_timeout_ms = 300000; + +/** + * cmh_tm_submit_sync_tmo() - Synchronous VCQ submit with timeout + * @vcq_cmds: Array of pre-built VCQ command entries + * @vcq_count: Total number of entries in @vcq_cmds + * @num_vcqs: Number of VCQs packed in @vcq_cmds + * @target_mbx: Pinned mailbox index, or -1 for round-robin + * @timeout_hz: Completion timeout in jiffies + * + * Posts a VCQ command to the TM, waits for completion up to + * @timeout_hz. On timeout, issues MBX_COMMAND_ABORT if the VCQ is + * already in-flight. Must be called from process context. + * + * Return: 0 on success, -ETIMEDOUT, or negative errno. + */ +int cmh_tm_submit_sync_tmo(struct vcq_cmd *vcq_cmds, u32 vcq_count, + u32 num_vcqs, s32 target_mbx, + unsigned long timeout_hz) +{ + struct cmh_sync_ctx *sync; + struct command_msg *msg; + unsigned long left; + int ret; + + /* + * This path sleeps (GFP_KERNEL allocations + wait_for_completion) + * and is not safe from atomic / non-sleepable contexts. 
All + * current callers run in process context (crypto API userspace or + * ioctl), so this is never violated today. Catch it loudly if + * a future caller gets this wrong. + */ + WARN_ON_ONCE(!in_task()); + + sync = kzalloc_obj(*sync, GFP_KERNEL); + if (!sync) + return -ENOMEM; + + msg = kzalloc_obj(*msg, GFP_KERNEL); + if (!msg) { + kfree(sync); + return -ENOMEM; + } + + init_completion(&sync->done); + sync->error = 0; + refcount_set(&sync->refs, 2); /* waiter + callback */ + + /* + * Heap-copy the caller's VCQ array so the msg owns its data. + * This decouples VCQ lifetime from the caller's stack frame, + * which matters when the TM thread backs off (resolve_mbx + * returns -1) and re-enqueues the msg after the caller's + * wait_for_completion_timeout expires. + */ + msg->vcq_data = kmemdup(vcq_cmds, vcq_count * sizeof(*vcq_cmds), + GFP_KERNEL); + if (!msg->vcq_data) { + kfree(msg); + kfree(sync); + return -ENOMEM; + } + + INIT_LIST_HEAD(&msg->list); + if (WARN_ON_ONCE(vcq_count < MIN_VCQ_CMDS)) { + ret = -EINVAL; + goto err_free; + } + msg->command_id = vcq_cmds[1].id; /* first real command's ID */ + msg->vcq_count = vcq_count; + msg->num_vcqs = num_vcqs; + msg->target_mbx = target_mbx; + msg->actual_mbx = -1; + msg->complete = cmh_sync_complete; + msg->completion_data = sync; + refcount_set(&msg->refs, 2); /* waiter + TM subsystem */ + + ret = cmh_tm_post_command(msg); + if (ret) { +err_free: + kfree(msg->vcq_data); + kfree(msg); + kfree(sync); /* callback will never fire */ + return ret; + } + + dev_dbg(cmh_dev(), "tm: submit_sync posted cmd 0x%08x, waiting...\n", + msg->command_id); + + left = wait_for_completion_timeout(&sync->done, timeout_hz); + if (!left) { + dev_err(cmh_dev(), + "tm: submit_sync timeout (%lums) cmd=0x%08x\n", + timeout_hz * 1000 / HZ, msg->command_id); + if (cmh_tm_try_cancel_command(msg)) { + /* + * Msg was still queued -- TM never saw it. 
+ * Drop the callback ref (no txn will fire it) + * and free msg directly (sole owner). + */ + cmh_sync_ctx_put(sync); /* no txn -> drop cb ref */ + cmh_sync_ctx_put(sync); /* drop waiter ref */ + command_msg_put(msg); /* matches refcount_set(2) */ + command_msg_put(msg); + } else { + /* + * TM has dequeued msg and the VCQ is in-flight. + * Issue MBX_COMMAND_ABORT to force-stop the VCQ; + * the RH will fire MBX_ERROR_IRQ, complete the + * transaction with -EIO, and issue RESTART. + * + * cmh_rh_abort_mbx() serialises the write under + * rh_process_lock, preventing clobber of a + * concurrent RESTART/FLUSH from the watchdog. + */ + s32 abrt_mbx = READ_ONCE(msg->actual_mbx); + + if (abrt_mbx >= 0 && + (u32)abrt_mbx < tm.cfg->mbx_count) { + dev_warn(cmh_dev(), + "tm: aborting mbx[%d] cmd=0x%08x\n", + abrt_mbx, msg->command_id); + cmh_rh_abort_mbx((u32)abrt_mbx); + } + + /* + * Wait for the RH completion (ABORT triggers + * MBX_ERROR_IRQ within microseconds). Fixed + * 5 s ceiling -- not configurable because if + * ABORT doesn't complete in this window the + * HW is wedged and more waiting won't help. + */ + left = wait_for_completion_timeout(&sync->done, + 5 * HZ); + if (!left) { + /* + * ABORT did not complete within 5 s -- HW + * is wedged. The eSW may still be writing + * to DMA buffers owned by the caller, so we + * cannot let the caller free them. Transfer + * ownership to the sync_ctx orphan mechanism; + * the RH callback (if it ever fires) will + * free via orphan_cb. If it never fires, the + * buffers leak -- acceptable for a wedged HW + * path that should never occur in practice. 
+ */ + dev_err(cmh_dev(), + "tm: abort timeout (5s) cmd=0x%08x - DMA buffers orphaned\n", + msg->command_id); + } + cmh_sync_ctx_put(sync); /* drop waiter ref */ + command_msg_put(msg); /* drop waiter ref on msg */ + } + return -ETIMEDOUT; + } + + ret = sync->error; + cmh_sync_ctx_put(sync); /* drop waiter ref */ + command_msg_put(msg); /* drop waiter ref on msg */ + return ret; +} + +/** + * cmh_tm_submit_sync_mbx() - Synchronous VCQ submit on a target MBX + * @vcq_cmds: Array of pre-built VCQ command entries + * @vcq_count: Total number of entries in @vcq_cmds + * @num_vcqs: Number of VCQs packed in @vcq_cmds + * @target_mbx: Pinned mailbox index, or -1 for round-robin + * + * Convenience wrapper around cmh_tm_submit_sync_tmo() using the + * default vcq_timeout_ms module parameter. + * + * Return: 0 on success, negative errno on failure. + */ +int cmh_tm_submit_sync_mbx(struct vcq_cmd *vcq_cmds, u32 vcq_count, + u32 num_vcqs, s32 target_mbx) +{ + return cmh_tm_submit_sync_tmo(vcq_cmds, vcq_count, num_vcqs, + target_mbx, + msecs_to_jiffies(vcq_timeout_ms)); +} + +/** + * cmh_tm_async_timeout_jiffies() - Default async per-request timeout + * + * Return: Timeout in jiffies from the async_timeout_ms module param, + * or 0 if async timeouts are disabled. + */ +unsigned long cmh_tm_async_timeout_jiffies(void) +{ + return async_timeout_ms ? msecs_to_jiffies(async_timeout_ms) : 0; +} + +/** + * cmh_tm_slow_op_timeout_jiffies() - Timeout for slow crypto ops + * + * Returns the extended timeout used for RSA keygen, PQC keygen/sign, + * and similar long-running operations. + * + * Return: Timeout in jiffies from the slow_op_timeout_ms module param. 
+ */ +unsigned long cmh_tm_slow_op_timeout_jiffies(void) +{ + return msecs_to_jiffies(slow_op_timeout_ms); +} + +/** + * cmh_tm_submit_async() - Asynchronous VCQ submission + * @vcq_cmds: Array of pre-built VCQ command entries + * @vcq_count: Total number of entries in @vcq_cmds + * @num_vcqs: Number of VCQs packed in @vcq_cmds + * @target_mbx: Pinned mailbox index, or -1 for round-robin + * @callback: Completion callback (see context note below) + * @callback_data: Opaque data passed to @callback + * @backlog_ok: Allow backlogging if CMQ is full + * @timeout_jiffies: Per-request timeout (0 = no timeout) + * + * Builds a command_msg, heap-copies the VCQ data, and posts it to the + * CMQ via cmh_tm_post_command(). + * + * Callback context guarantee: + * The @callback may be invoked from one of: + * - RH threaded IRQ handler (process context, BH disabled) + * - RH watchdog timer (softirq / timer context) + * - TM kthread if submit_vcq() fails post-dequeue + * - cmh_tm_cleanup()/cmh_tm_quiesce() during drain (process context) + * It is NEVER invoked from hardirq context. + * + * Because the watchdog path runs from timer softirq, callbacks + * MUST be safe in atomic/softirq context: no mutex, no GFP_KERNEL, + * no sleeping locks. crypto_request_complete() is safe (documented + * callable from any context). kfree_sensitive() and + * scatterwalk_map_and_copy() are also safe (non-sleeping). + * Callers must not assume thread affinity (callback may run on any CPU). + * + * Unlike the _sync variants, this function: + * - Does NOT allocate a cmh_sync_ctx or wait for completion + * - Uses GFP_ATOMIC for internal allocations because the crypto API + * may call ->encrypt/->decrypt/->hash_final from softirq context + * (e.g. network stack via IPsec/TLS); GFP_KERNEL would deadlock. + * + * The command_msg is single-owner (refcount 1) -- the TM subsystem + * owns it after post and frees it after dispatching to the HW. 
+ * + * DMA buffer ownership: the caller transfers ownership to the callback + * on return of 0 or -EBUSY. On any other return, the caller must + * clean up DMA buffers itself -- the callback will never fire. + * + * Return: 0 on successful post, -EBUSY if backlogged, -ENOMEM, + * -EINVAL, -EAGAIN, or -ENODEV on failure. + */ +int cmh_tm_submit_async(struct vcq_cmd *vcq_cmds, u32 vcq_count, + u32 num_vcqs, s32 target_mbx, + cmh_completion_fn callback, void *callback_data, + bool backlog_ok, unsigned long timeout_jiffies) +{ + struct command_msg *msg; + int ret; + + msg = kzalloc_obj(*msg, GFP_ATOMIC); + if (!msg) + return -ENOMEM; + + msg->vcq_data = kmemdup(vcq_cmds, + array_size(vcq_count, sizeof(*vcq_cmds)), + GFP_ATOMIC); + if (!msg->vcq_data) { + kfree(msg); + return -ENOMEM; + } + + INIT_LIST_HEAD(&msg->list); + if (WARN_ON_ONCE(vcq_count < MIN_VCQ_CMDS)) { + kfree(msg->vcq_data); + kfree(msg); + return -EINVAL; + } + msg->command_id = vcq_cmds[1].id; + msg->vcq_count = vcq_count; + msg->num_vcqs = num_vcqs; + msg->target_mbx = target_mbx; + msg->actual_mbx = -1; + msg->complete = callback; + msg->completion_data = callback_data; + msg->backlog_ok = backlog_ok; + msg->timeout_jiffies = timeout_jiffies; + refcount_set(&msg->refs, 1); /* sole owner: TM subsystem */ + + ret = cmh_tm_post_command(msg); + if (ret && ret != -EBUSY) { + kfree(msg->vcq_data); + kfree(msg); + } + return ret; +} + +/** + * cmh_tm_submit_sync_noabort() - Sync submit without MBX abort on timeout + * @vcq_cmds: Array of pre-built VCQ command entries + * @vcq_count: Total number of entries in @vcq_cmds + * @num_vcqs: Number of VCQs packed in @vcq_cmds + * @timeout_hz: Completion timeout in jiffies + * @orphan_cb: Optional cleanup callback for abandoned DMA buffers + * @orphan_data: Opaque data passed to @orphan_cb + * + * On timeout, if the command was still queued it is cancelled and + * -EAGAIN is returned (caller may free all resources). 
If the VCQ is + * already in-flight, the waiter drops its refs and returns -EINPROGRESS + * -- the RH callback will fire when the eSW finishes the VCQ and free + * the sync_ctx / msg via the refcount mechanism. + * + * @orphan_cb is invoked when the last ref on the sync_ctx drops after + * the waiter abandoned an in-flight VCQ, allowing the caller to defer + * DMA-buffer cleanup until the eSW finishes writing. + * + * This prevents a short-timeout command (e.g. DRBG GENERATE from the + * hwrng kthread) from aborting the entire MBX and killing unrelated + * long-running operations (e.g. SLH-DSA sign at 120 s). + * + * Return: 0 on success, -EAGAIN if cancelled from queue, + * -EINPROGRESS if left in-flight, or negative errno. + */ +int cmh_tm_submit_sync_noabort(struct vcq_cmd *vcq_cmds, u32 vcq_count, + u32 num_vcqs, unsigned long timeout_hz, + void (*orphan_cb)(void *), void *orphan_data) +{ + struct cmh_sync_ctx *sync; + struct command_msg *msg; + unsigned long left; + int ret; + + WARN_ON_ONCE(!in_task()); + + sync = kzalloc_obj(*sync, GFP_KERNEL); + if (!sync) + return -ENOMEM; + + msg = kzalloc_obj(*msg, GFP_KERNEL); + if (!msg) { + kfree(sync); + return -ENOMEM; + } + + init_completion(&sync->done); + sync->error = 0; + refcount_set(&sync->refs, 2); + + INIT_LIST_HEAD(&msg->list); + if (WARN_ON_ONCE(vcq_count < MIN_VCQ_CMDS)) { + kfree(msg); + kfree(sync); + return -EINVAL; + } + msg->command_id = vcq_cmds[1].id; + msg->vcq_data = kmemdup(vcq_cmds, vcq_count * sizeof(*vcq_cmds), + GFP_KERNEL); + if (!msg->vcq_data) { + kfree(msg); + kfree(sync); + return -ENOMEM; + } + msg->vcq_count = vcq_count; + msg->num_vcqs = num_vcqs; + msg->target_mbx = -1; + msg->actual_mbx = -1; + msg->complete = cmh_sync_complete; + msg->completion_data = sync; + refcount_set(&msg->refs, 2); + + ret = cmh_tm_post_command(msg); + if (ret) { + kfree(msg->vcq_data); + kfree(msg); + kfree(sync); + return ret; + } + + left = 
wait_for_completion_timeout(&sync->done, timeout_hz); + if (!left) { + if (cmh_tm_try_cancel_command(msg)) { + /* Still queued -- TM never saw it, clean up fully */ + cmh_sync_ctx_put(sync); /* drop cb ref */ + cmh_sync_ctx_put(sync); /* drop waiter ref */ + command_msg_put(msg); /* matches refcount_set(2) */ + command_msg_put(msg); + return -EAGAIN; + } + + /* + * In-flight: skip ABORT. Transfer orphan cleanup + * ownership to sync_ctx -- the RH callback will + * eventually complete this VCQ, and when the last + * ref drops, orphan_cb frees any DMA buffers the + * eSW was still writing to. + */ + dev_dbg_ratelimited(cmh_dev(), + "tm: noabort timeout (%lums) cmd=0x%08x, leaving in-flight\n", + timeout_hz * 1000 / HZ, + msg->command_id); + sync->orphan_cb = orphan_cb; + sync->orphan_data = orphan_data; + cmh_sync_ctx_put(sync); + command_msg_put(msg); + return -EINPROGRESS; + } + + ret = sync->error; + cmh_sync_ctx_put(sync); + command_msg_put(msg); + return ret; +} + +/** + * cmh_tm_submit_sync() - Synchronous VCQ submit with default timeout + * @vcq_cmds: Array of pre-built VCQ command entries + * @vcq_count: Total number of entries in @vcq_cmds + * @num_vcqs: Number of VCQs packed in @vcq_cmds + * + * Convenience wrapper: submits via round-robin MBX selection with the + * default vcq_timeout_ms. + * + * Return: 0 on success, negative errno on failure. + */ +int cmh_tm_submit_sync(struct vcq_cmd *vcq_cmds, u32 vcq_count, + u32 num_vcqs) +{ + return cmh_tm_submit_sync_mbx(vcq_cmds, vcq_count, num_vcqs, -1); +} + +#define MBX_FLUSH_TIMEOUT_MS 1000 +#define MBX_FLUSH_POLL_MIN_US 10 +#define MBX_FLUSH_POLL_MAX_US 50 + +/** + * cmh_tm_flush_mbx() - Issue MBX_COMMAND_FLUSH and wait for completion + * @mbx_idx: Mailbox index to flush + * + * Resets the eSW child mailbox state: clears the VCQ command queue, + * resets head/tail, and -- critically -- resets the child temp stack + * via mbx_hdr_init() (sets hdr->temp back to &cmds[MAX_VCQ_CMDS]). 
+ * + * Why this is needed: + * KIC derivation commands that output to SYS_REF_TEMP allocate on the + * per-MBX child temp LIFO stack (mbx_alloc_temp, each costing + * ROUND_UP(len,4)+56 bytes). These allocations persist across VCQ + * completions because mbx_vcq_done() does NOT reset the temp stack. + * Without an explicit flush, sequential KIC-TEMP ioctls exhaust the + * ~960-byte temp area and subsequent derives fail with ENOMEM. + * + * What is NOT affected: + * KIC HW keys, datastore objects, DRBG state -- these survive the + * flush. Only the queue pointers and temp stack are reset. + * + * Concurrency: + * Acquires the per-MBX dispatch_lock mutex to serialise with VCQ + * dispatch in submit_vcq(). This prevents the flush from resetting + * head/tail while the TM kthread is writing a VCQ to a DMA slot on + * the same MBX. The eSW clears R_MBX_COMMAND to zero once the flush + * completes. + * + * Return: 0 on success, -EINVAL, -ENODEV, -EBUSY, or -ETIMEDOUT. + */ +int cmh_tm_flush_mbx(s32 mbx_idx) +{ + struct cmh_mbx_config *mbx; + struct cmh_mbx_txq *txq; + void __iomem *base; + u32 reg; + int ret; + + if (!tm.cfg || mbx_idx < 0 || (u32)mbx_idx >= tm.cfg->mbx_count) + return -EINVAL; + + mbx = &tm.cfg->mailboxes[mbx_idx]; + base = mbx->reg_base; + if (!base) + return -ENODEV; + + txq = &tm.txqs[mbx_idx]; + mutex_lock(&txq->dispatch_lock); + + /* Ensure no command is already pending */ + if (cmh_reg_read32(base, R_MBX_COMMAND) != 0) { + mutex_unlock(&txq->dispatch_lock); + return -EBUSY; + } + + cmh_reg_write32(MBX_COMMAND_FLUSH, base, R_MBX_COMMAND); + + /* Poll until eSW clears the command register */ + ret = read_poll_timeout(cmh_reg_read32, reg, reg == 0, + MBX_FLUSH_POLL_MIN_US, + MBX_FLUSH_TIMEOUT_MS * 1000, + true, base, R_MBX_COMMAND); + if (ret) + dev_err(cmh_dev(), "mbx %u flush timeout (cmd=0x%08x)\n", + mbx->instance, + cmh_reg_read32(base, R_MBX_COMMAND)); + + mutex_unlock(&txq->dispatch_lock); + return ret; +} + +/** + * 
cmh_vcq_pack_and_submit() - Pack payload into VCQs and submit sync + * @payload: Array of VCQ command entries (without headers) + * @count: Number of entries in @payload + * @packed: Caller-provided output buffer for packed VCQ data + * @max_packed: Size of @packed buffer in vcq_cmd entries + * @target_mbx: Pinned mailbox index, or -1 for round-robin + * + * Splits @payload into VCQ-sized chunks, prepends headers, and submits + * synchronously. + * + * Return: 0 on success, -EMSGSIZE if @packed is too small, or + * negative errno from submit. + */ +int cmh_vcq_pack_and_submit(const struct vcq_cmd *payload, u32 count, + struct vcq_cmd *packed, u32 max_packed, + s32 target_mbx) +{ + u32 max_per_vcq = cmh_tm_max_cmds_per_vcq(); + u32 max_payload_per = max_per_vcq - 1; + u32 num_vcqs = 0, total = 0, i = 0; + + while (i < count) { + u32 chunk = min_t(u32, count - i, max_payload_per); + u32 vcq_cmds = chunk + 1; + + if (total + vcq_cmds > max_packed) + return -EMSGSIZE; + + vcq_set_header(&packed[total], vcq_cmds); + memcpy(&packed[total + 1], &payload[i], + chunk * sizeof(struct vcq_cmd)); + + total += vcq_cmds; + i += chunk; + num_vcqs++; + } + + return cmh_tm_submit_sync_mbx(packed, total, num_vcqs, target_mbx); +} + +/** + * cmh_vcq_pack_and_submit_async() - Pack payload and submit async + * @payload: Array of VCQ command entries (without headers) + * @count: Number of entries in @payload + * @packed: Caller-provided output buffer for packed VCQ data + * @max_packed: Size of @packed buffer in vcq_cmd entries + * @target_mbx: Pinned mailbox index, or -1 for round-robin + * @callback: Completion callback + * @callback_data: Opaque data passed to @callback + * @backlog_ok: Allow backlogging if CMQ is full + * @timeout_jiffies: Per-request timeout (0 = no timeout) + * + * Asynchronous variant of cmh_vcq_pack_and_submit(). Splits @payload + * into VCQ-sized chunks, prepends headers, and submits via + * cmh_tm_submit_async(). 
+ * + * Return: 0 on success, -EBUSY if backlogged, -EMSGSIZE if @packed + * is too small, or negative errno from submit. + */ +int cmh_vcq_pack_and_submit_async(const struct vcq_cmd *payload, u32 count, + struct vcq_cmd *packed, u32 max_packed, + s32 target_mbx, + cmh_completion_fn callback, + void *callback_data, + bool backlog_ok, + unsigned long timeout_jiffies) +{ + u32 max_per_vcq = cmh_tm_max_cmds_per_vcq(); + u32 max_payload_per = max_per_vcq - 1; + u32 num_vcqs = 0, total = 0, i = 0; + + while (i < count) { + u32 chunk = min_t(u32, count - i, max_payload_per); + u32 vcq_cmds = chunk + 1; + + if (total + vcq_cmds > max_packed) + return -EMSGSIZE; + + vcq_set_header(&packed[total], vcq_cmds); + memcpy(&packed[total + 1], &payload[i], + chunk * sizeof(struct vcq_cmd)); + + total += vcq_cmds; + i += chunk; + num_vcqs++; + } + + return cmh_tm_submit_async(packed, total, num_vcqs, target_mbx, + callback, callback_data, backlog_ok, + timeout_jiffies); +} + +/** + * cmh_tm_peek_transaction() - Peek at the head of a mailbox TXQ + * @mbx_idx: Mailbox index to inspect + * + * Returns a pointer to the oldest in-flight transaction without + * removing it from the queue. The caller must not free the returned + * object. + * + * Return: Pointer to the head transaction, or NULL if empty. 
+ */ +struct transaction_obj *cmh_tm_peek_transaction(u32 mbx_idx) +{ + struct cmh_mbx_txq *txq; + struct transaction_obj *txn = NULL; + unsigned long flags; + + if (!tm.txqs || mbx_idx >= tm.cfg->mbx_count) + return NULL; + + txq = &tm.txqs[mbx_idx]; + + spin_lock_irqsave(&txq->lock, flags); + if (!list_empty(&txq->head)) + txn = list_first_entry(&txq->head, struct transaction_obj, + list); + spin_unlock_irqrestore(&txq->lock, flags); + + return txn; +} + +/** + * cmh_tm_pop_transaction() - Remove and return the head of a MBX TXQ + * @mbx_idx: Mailbox index to pop from + * + * Dequeues the oldest in-flight transaction from the per-mailbox + * transaction queue. The caller takes ownership and must eventually + * call cmh_txn_finish() or txn_put(). + * + * Return: Pointer to the dequeued transaction, or NULL if empty. + */ +struct transaction_obj *cmh_tm_pop_transaction(u32 mbx_idx) +{ + struct cmh_mbx_txq *txq; + struct transaction_obj *txn; + unsigned long flags; + + if (!tm.txqs || mbx_idx >= tm.cfg->mbx_count) + return NULL; + + txq = &tm.txqs[mbx_idx]; + + spin_lock_irqsave(&txq->lock, flags); + if (list_empty(&txq->head)) { + spin_unlock_irqrestore(&txq->lock, flags); + return NULL; + } + txn = list_first_entry(&txq->head, struct transaction_obj, list); + list_del_init(&txn->list); + txq->depth--; + spin_unlock_irqrestore(&txq->lock, flags); + + return txn; +} + +/* -- debugfs timeout accessors ----------------------------------------- */ + +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +/** + * cmh_tm_timeout_async_ptr() - Return pointer to async_timeout_ms for debugfs + * + * Return: pointer to the static async_timeout_ms variable. + */ +unsigned int *cmh_tm_timeout_async_ptr(void) { return &async_timeout_ms; } + +/** + * cmh_tm_timeout_vcq_ptr() - Return pointer to vcq_timeout_ms for debugfs + * + * Return: pointer to the static vcq_timeout_ms variable. 
+ */
+unsigned int *cmh_tm_timeout_vcq_ptr(void) { return &vcq_timeout_ms; }
+
+/**
+ * cmh_tm_timeout_slow_op_ptr() - Return pointer to slow_op_timeout_ms for debugfs
+ *
+ * Return: pointer to the static slow_op_timeout_ms variable.
+ */
+unsigned int *cmh_tm_timeout_slow_op_ptr(void) { return &slow_op_timeout_ms; }
+
+/**
+ * cmh_tm_timeout_drain_ptr() - Return pointer to drain_timeout_ms for debugfs
+ *
+ * Return: pointer to the static drain_timeout_ms variable.
+ */
+unsigned int *cmh_tm_timeout_drain_ptr(void) { return &drain_timeout_ms; }
+#endif
diff --git a/drivers/crypto/cmh/include/cmh.h b/drivers/crypto/cmh/include/cmh.h
new file mode 100644
index 000000000000..18150ba39129
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- Top-level Device Structure
+ */
+
+#ifndef CMH_H
+#define CMH_H
+
+#include
+
+#include "cmh_config.h"
+
+#define CMH_DRV_NAME "cmh"
+#define CMH_VERSION "1.0.0"
+
+/**
+ * struct cmh_device - Top-level driver state for a CMH hardware instance
+ * @config: Hardware configuration (core mappings, MBX layout, feature flags)
+ * @dev: Platform or parent device used for DMA and logging
+ */
+struct cmh_device {
+	struct cmh_config config;
+	struct device *dev;
+};
+
+#endif /* CMH_H */
diff --git a/drivers/crypto/cmh/include/cmh_aes_abi.h b/drivers/crypto/cmh/include/cmh_aes_abi.h
new file mode 100644
index 000000000000..78405bdf70ff
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_aes_abi.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- AES Core ABI Definitions
+ *
+ * Kernel-side definitions for the CMH AES ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ */
+
+#ifndef CMH_AES_ABI_H
+#define CMH_AES_ABI_H
+
+#include
+
+/* AES Block Size */
+
+#define CMH_AES_BLOCK_SIZE 16U
+#define CMH_AES_IV_SIZE 16U
+
+/* AES Modes (per CMH AES ABI) */
+
+#define AES_MODE_ECB 1U
+#define AES_MODE_CBC 2U
+#define AES_MODE_CTR 3U
+#define AES_MODE_CFB 4U
+#define AES_MODE_GCM 5U
+#define AES_MODE_CMAC 6U
+#define AES_MODE_CCM 7U
+#define AES_MODE_XTS 8U
+
+/* AES Operations (per CMH AES ABI) */
+
+#define AES_OP_DECRYPT 1U
+#define AES_OP_ENCRYPT 2U
+
+/* AES Command IDs */
+
+#define AES_CMD_INIT 0x01U
+#define AES_CMD_AAD_UPDATE 0x02U
+#define AES_CMD_AAD_FINAL 0x03U
+#define AES_CMD_UPDATE 0x04U
+#define AES_CMD_FINAL 0x05U
+#define AES_CMD_SCATTERGATHER 0x06U
+#define AES_CMD_CCM_INIT 0x0AU
+#define AES_CMD_AAD_FINAL_AUTH 0x0EU
+
+/* AES Command Structures */
+
+struct aes_cmd_init {
+	u64 key; /* datastore reference for the key */
+	u64 iv; /* DMA address of the IV (or nonce in CCM) */
+	u32 keylen; /* key length in bytes */
+	u32 ivlen; /* IV length in bytes (0..16) */
+	u32 mode; /* AES mode (AES_MODE_*) */
+	u32 op; /* AES operation (AES_OP_*) */
+	u32 aadlen; /* AAD length or 0 */
+	u32 iolen; /* plaintext/ciphertext length */
+	u32 taglen; /* tag length or 0 */
+};
+
+struct aes_cmd_aad_final {
+	u64 data; /* DMA address of AAD data */
+	u32 datalen; /* AAD data length */
+};
+
+struct aes_cmd_aad_final_auth {
+	u64 data; /* DMA address of final AAD data */
+	u32 datalen; /* final AAD data length */
+	u64 tag; /* DMA address of tag */
+	u32 taglen; /* tag length */
+};
+
+struct aes_cmd_update {
+	u64 input; /* DMA address of input data */
+	u64 output; /* DMA address of output data */
+	u32 iolen; /* input/output data length */
+};
+
+struct aes_cmd_final {
+	u64 input; /* DMA address of last input data */
+	u64 output; /* DMA address of last output data */
+	u64 tag; /* DMA address of tag (AEAD only) */
+	u32 iolen; /* last input/output data length */
+	u32 taglen; /* tag length (AEAD only) */
+};
+
+/* AES Command
Union */
+
+union aes_cmd {
+	struct aes_cmd_init cmd_init;
+	struct aes_cmd_update cmd_update;
+	struct aes_cmd_final cmd_final;
+	struct aes_cmd_aad_final cmd_aad_final;
+	struct aes_cmd_aad_final_auth cmd_aad_final_auth;
+};
+
+#endif /* CMH_AES_ABI_H */
diff --git a/drivers/crypto/cmh/include/cmh_ccp_abi.h b/drivers/crypto/cmh/include/cmh_ccp_abi.h
new file mode 100644
index 000000000000..4e3eb9feaec9
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_ccp_abi.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- CCP Core ABI Definitions
+ *
+ * Kernel-side definitions for the CMH CCP ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ *
+ * The CCP core provides three modes:
+ * - ChaCha20 stream cipher (skcipher)
+ * - Poly1305 one-time authenticator (shash)
+ * - ChaCha20-Poly1305 AEAD (RFC 7539)
+ */
+
+#ifndef CMH_CCP_ABI_H
+#define CMH_CCP_ABI_H
+
+#include
+
+/* CCP Block Sizes */
+
+#define CCP_CHACHA_BLOCK_SIZE 64U /* ChaCha20 block = 512 bits */
+#define CCP_POLY_BLOCK_SIZE 16U /* Poly1305 block = 128 bits */
+#define CCP_CTRNONCE_SIZE 16U /* 4-byte LE counter + 12-byte nonce */
+#define CCP_POLY_KEY_SIZE 16U /* r_key and s_key each 16 bytes */
+#define CCP_POLY_TAG_SIZE 16U /* Poly1305 tag = 128 bits */
+#define CCP_CHACHA_CTR_LEN 4U /* 32-bit counter */
+
+/* CCP Operations (per CMH CCP ABI) */
+
+#define CCP_OP_DECRYPT 1U
+#define CCP_OP_ENCRYPT 2U
+
+/* CCP Command IDs */
+
+#define CCP_CMD_CHACHA20_INIT 0x01U
+#define CCP_CMD_POLY1305_INIT 0x02U
+#define CCP_CMD_AEAD_INIT 0x03U
+#define CCP_CMD_AAD_UPDATE 0x04U
+#define CCP_CMD_AAD_FINAL 0x05U
+#define CCP_CMD_UPDATE 0x06U
+#define CCP_CMD_FINAL 0x07U
+#define CCP_CMD_SCATTERGATHER 0x08U
+/* CCP_CMD_FLUSH = VCQ_CMD_FLUSH (0xFF) -- defined in cmh_vcq.h */
+
+/* CCP Command Structures */
+
+struct ccp_cmd_chacha {
+	u64 key; /* datastore reference for the key */
+	u64 ctrnonce; /* DMA
address of the 16-byte counter+nonce */
+	u32 keylen; /* key length: 16 or 32 bytes */
+	u32 ctrnoncelen; /* always 16 */
+	u32 ctrlen; /* counter length: 4 bytes */
+	u32 op; /* CCP_OP_ENCRYPT or CCP_OP_DECRYPT */
+};
+
+struct ccp_cmd_poly {
+	u64 rkey; /* datastore reference for the r key */
+	u64 skey; /* datastore reference for the s key */
+	u32 rkeylen; /* always 16 */
+	u32 skeylen; /* always 16 */
+};
+
+struct ccp_cmd_aead {
+	u64 key; /* datastore reference for the key */
+	u64 ctrnonce; /* DMA address of the 16-byte counter+nonce */
+	u32 keylen; /* key length: 32 bytes */
+	u32 ctrnoncelen; /* always 16 */
+	u32 op; /* CCP_OP_ENCRYPT or CCP_OP_DECRYPT */
+};
+
+struct ccp_cmd_aad_update {
+	u64 aad; /* DMA address of AAD data */
+	u32 aadlen; /* AAD length (must be multiple of 16) */
+};
+
+struct ccp_cmd_aad_final {
+	u64 aad; /* DMA address of last AAD data */
+	u32 aadlen; /* last AAD length (any size) */
+};
+
+struct ccp_cmd_update {
+	u64 input; /* DMA address of input data */
+	u64 output; /* DMA address of output data */
+	u32 iolen; /* input/output length */
+};
+
+struct ccp_cmd_final {
+	u64 input; /* DMA address of last input data */
+	u64 output; /* DMA address of last output data */
+	u64 tag; /* DMA address of the 16-byte tag */
+	u32 iolen; /* last input/output data length */
+	u32 taglen; /* tag length (always 16) */
+};
+
+/* CCP Command Union */
+
+union ccp_cmd {
+	struct ccp_cmd_chacha cmd_chacha;
+	struct ccp_cmd_poly cmd_poly;
+	struct ccp_cmd_aead cmd_aead;
+	struct ccp_cmd_aad_update cmd_aad_update;
+	struct ccp_cmd_aad_final cmd_aad_final;
+	struct ccp_cmd_update cmd_update;
+	struct ccp_cmd_final cmd_final;
+};
+
+#endif /* CMH_CCP_ABI_H */
diff --git a/drivers/crypto/cmh/include/cmh_config.h b/drivers/crypto/cmh/include/cmh_config.h
new file mode 100644
index 000000000000..6a9e629ed353
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_config.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright
(c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- Configuration Structures and Defaults
+ */
+
+#ifndef CMH_CONFIG_H
+#define CMH_CONFIG_H
+
+#include
+#include
+
+#include "cmh_registers.h"
+#include "cmh_vcq.h"
+
+/* Limits */
+
+/*
+ * Max mailboxes the driver manages simultaneously. The hardware address
+ * space supports CMH_MAX_MBX_INSTANCES (64) instance indices, but this
+ * compile-time constant caps how many the driver allocates DMA queues,
+ * IRQ slots, and per-transform cache entries for. To manage more
+ * mailboxes (up to the HW max), increase this value and rebuild the LKM
+ * -- it cannot be changed via module parameters at runtime.
+ */
+#define CMH_MAX_CONFIGURED_MBX 16
+#define CMH_MAX_CORE_INSTANCES 8
+
+/* MBX setup parameter ranges (per CMH hardware specification) */
+#define CMH_MBX_SLOTS_LOG2_MIN 1
+#define CMH_MBX_SLOTS_LOG2_MAX 15
+#define CMH_MBX_STRIDE_LOG2_MIN 7
+#define CMH_MBX_STRIDE_LOG2_MAX 10
+
+/* Default Configuration Values */
+
+#define CMH_DEFAULT_MBX_COUNT 2
+#define CMH_DEFAULT_SLOTS_LOG2 5 /* 2^5 = 32 slots */
+#define CMH_DEFAULT_STRIDE_LOG2 7 /* 2^7 = 128 bytes per slot */
+#define CMH_DEFAULT_IRQ (-1) /* polling mode */
+#define CMH_DEFAULT_FW_READY_TIMEOUT_MS 5000 /* 5s for mission mode */
+
+/* Per-Core-Type Instance Configuration */
+
+struct cmh_core_type_cfg {
+	u32 num_instances;
+	u32 core_ids[CMH_MAX_CORE_INSTANCES];
+	s32 mbx[CMH_MAX_CORE_INSTANCES]; /* -1 = auto-assign */
+};
+
+/* Per-Mailbox Configuration */
+
+struct cmh_mbx_config {
+	u32 instance; /* 0-based MBX instance index (0..63) */
+	u32 slots_log2; /* log2(slot count), range 1..15 */
+	u32 stride_log2; /* log2(bytes per slot), range 7..10 */
+	u32 lock_val; /* MBX lock token (non-zero while held) */
+	dma_addr_t dma_handle; /* DMA bus address from dma_alloc_coherent */
+	void *virt_addr; /* kernel virtual address of MBXQ buffer */
+	size_t queue_size; /* total queue buffer size in bytes */
+	void __iomem *reg_base; /*
ioremap'd register base for this instance */
+};
+
+/* Global Device Configuration */
+
+struct cmh_config {
+	phys_addr_t sic_base;
+	size_t sic_size;
+	void __iomem *sic_mapped; /* ioremap'd SIC region */
+	struct device_node *of_node; /* DT node (may be NULL) */
+	u32 mbx_count;
+	struct cmh_mbx_config mailboxes[CMH_MAX_CONFIGURED_MBX];
+	int irq; /* -1 = poll, else IRQ line */
+	unsigned int fw_ready_timeout_ms; /* FW mission-mode timeout */
+	struct cmh_core_type_cfg core_types[CMH_NUM_CORE_TYPES];
+};
+
+/* Module Parameter Interface */
+
+struct platform_device;
+
+/**
+ * cmh_config_init() - Populate config from module params and device-tree
+ * @cfg: Configuration structure to fill
+ * @pdev: Platform device (for DT properties and IRQ lookup)
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int cmh_config_init(struct cmh_config *cfg, struct platform_device *pdev);
+
+#endif /* CMH_CONFIG_H */
diff --git a/drivers/crypto/cmh/include/cmh_debugfs.h b/drivers/crypto/cmh/include/cmh_debugfs.h
new file mode 100644
index 000000000000..abaa837470c5
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_debugfs.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- debugfs Per-MBX and TM Counters
+ *
+ * Exposes diagnostic counters under /sys/kernel/debug/cmh/:
+ *
+ *   mbxN/vcqs_submitted      Total VCQs sent to MBX N
+ *   mbxN/vcqs_completed      Total completions received
+ *   mbxN/vcqs_errors         Total error completions
+ *   mbxN/queue_full_count    Times select_mailbox() skipped this MBX
+ *   mbxN/max_queue_depth     High-water mark of in-flight transactions
+ *
+ *   tm/cmq_posts             Total cmh_tm_post_command() calls
+ *   tm/cmq_depth_max         High-water mark of CMQ length
+ *   tm/cmq_eagain_count      Times CMQ was full (-EAGAIN)
+ *   tm/backoff_count         Times TM backed off (all MBX queues full)
+ *   tm/async_timeout_count   Async requests that timed out
+ *
+ * Counters are atomic64_t -- safe to read from any context.
+ * When CONFIG_CRYPTO_DEV_CMH_DEBUG is off, all functions become no-ops and the
+ * compiler eliminates the counter code entirely.
+ */
+
+#ifndef CMH_DEBUGFS_H
+#define CMH_DEBUGFS_H
+
+#include
+#include
+
+/* Per-Mailbox Statistics */
+
+struct cmh_mbx_stats {
+	atomic64_t vcqs_submitted;
+	atomic64_t vcqs_completed;
+	atomic64_t vcqs_errors;
+	atomic64_t queue_full_count;
+	atomic64_t max_queue_depth;
+};
+
+/* TM-Level Statistics */
+
+struct cmh_tm_stats {
+	atomic64_t cmq_posts;
+	atomic64_t cmq_depth_max;
+	atomic64_t cmq_eagain_count;
+	atomic64_t backoff_count;
+	atomic64_t async_timeout_count;
+};
+
+/**
+ * cmh_stat_update_max() - Atomically update a high-water mark counter
+ * @counter: atomic64_t counter to update
+ * @val: New candidate value
+ *
+ * Updates @counter to @val if @val exceeds the current maximum.
+ * Lock-free via atomic cmpxchg loop.
+ */
+static inline void cmh_stat_update_max(atomic64_t *counter, s64 val)
+{
+	s64 cur;
+
+	do {
+		cur = atomic64_read(counter);
+		if (val <= cur)
+			return;
+	} while (atomic64_cmpxchg(counter, cur, val) != cur);
+}
+
+/* Interface (stub when CONFIG_CRYPTO_DEV_CMH_DEBUG is off) */
+
+struct cmh_config;
+
+#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG
+
+int cmh_debugfs_init(struct cmh_config *cfg);
+void cmh_debugfs_cleanup(void);
+
+struct cmh_mbx_stats *cmh_debugfs_mbx_stats(u32 mbx_idx);
+struct cmh_tm_stats *cmh_debugfs_tm_stats(void);
+
+#else /* !CONFIG_CRYPTO_DEV_CMH_DEBUG */
+
+static inline int cmh_debugfs_init(struct cmh_config *c) { return 0; }
+static inline void cmh_debugfs_cleanup(void) {}
+static inline struct cmh_mbx_stats *cmh_debugfs_mbx_stats(u32 i) { return NULL; }
+static inline struct cmh_tm_stats *cmh_debugfs_tm_stats(void) { return NULL; }
+
+#endif /* CONFIG_CRYPTO_DEV_CMH_DEBUG */
+#endif /* CMH_DEBUGFS_H */
diff --git a/drivers/crypto/cmh/include/cmh_dma.h b/drivers/crypto/cmh/include/cmh_dma.h
new file mode 100644
index 000000000000..7dd0d8311785
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_dma.h
@@ -0,0 +1,219 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- DMA Interface
+ *
+ * Platform-independent DMA operations for the CMH crypto accelerator.
+ * All functions are implemented in cmh_dma.c (standard kernel DMA API).
+ *
+ * Alternate backends may be linked in place of cmh_dma.c for
+ * non-standard platforms. Such backends must implement the same
+ * symbol set and may use different allocation and mapping semantics
+ * (e.g. pool-based alloc/free instead of address translation).
+ */
+
+#ifndef CMH_DMA_H
+#define CMH_DMA_H
+
+#include
+#include
+
+#include "cmh_vcq.h"
+
+struct platform_device;
+
+/**
+ * cmh_dma_init() - Initialize the DMA backend
+ * @pdev: Platform device (provides struct device for DMA ops)
+ *
+ * Called early in .probe().
The standard backend stores the device
+ * pointer; alternate backends may set up additional resources.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int cmh_dma_init(struct platform_device *pdev);
+
+/**
+ * cmh_dma_cleanup() - Tear down the DMA backend
+ *
+ * Called in .remove() and error paths. Releases any resources
+ * allocated by cmh_dma_init().
+ */
+void cmh_dma_cleanup(void);
+
+/**
+ * cmh_dev() - Global device accessor
+ *
+ * Returns the struct device * associated with the platform_driver instance.
+ * Valid only between cmh_dma_init() and cmh_dma_cleanup().
+ *
+ * Return: Platform device pointer, or NULL outside lifecycle.
+ */
+struct device *cmh_dev(void);
+
+/* Streaming DMA map / unmap (short-lived per-request buffers) */
+
+dma_addr_t cmh_dma_map_single(void *buf, size_t size,
+			      enum dma_data_direction dir);
+void cmh_dma_unmap_single(dma_addr_t addr, size_t size,
+			  enum dma_data_direction dir);
+
+/*
+ * Sync a DMA_FROM_DEVICE buffer so the CPU sees device-written data.
+ *
+ * Required before reading *buf when SWIOTLB bounce buffering is active
+ * (e.g. arm64 without IOMMU): the device writes to the bounce buffer,
+ * not the original allocation, so the CPU must sync before access.
+ * On architectures without bounce buffers (e.g. rv64) this is a no-op.
+ *
+ * Call between cmh_tm_submit_sync() and the first CPU read of the buffer,
+ * while the mapping is still live (before cmh_dma_unmap_single).
+ */
+void cmh_dma_sync_for_cpu(dma_addr_t addr, size_t size,
+			  enum dma_data_direction dir);
+
+/*
+ * Sync a DMA_TO_DEVICE buffer so the device sees CPU-written data.
+ *
+ * Required after CPU writes to a mapped streaming buffer (e.g. SG
+ * descriptor arrays that need items_dma for .lli pointer calculation
+ * before content is written). Must be called before the device reads.
+ */
+void cmh_dma_sync_for_device(dma_addr_t addr, size_t size,
+			     enum dma_data_direction dir);
+
+int cmh_dma_map_error(dma_addr_t addr);
+
+/* Coherent DMA alloc / free (long-lived MBX queue buffers) */
+
+void *cmh_dma_alloc(size_t size, dma_addr_t *handle, gfp_t gfp);
+void cmh_dma_free(size_t size, void *virt, dma_addr_t handle);
+
+/**
+ * cmh_dma_write() - Copy data into a DMA-allocated buffer
+ * @dst: Destination pointer (from cmh_dma_alloc)
+ * @src: Source kernel buffer
+ * @len: Number of bytes to copy
+ *
+ * Copies @len bytes from @src to @dst. @dst must have been obtained
+ * from cmh_dma_alloc(). Abstracted to allow platforms with non-standard
+ * DMA buffer access semantics.
+ */
+void cmh_dma_write(void *dst, const void *src, size_t len);
+
+/**
+ * cmh_dma_fence() - Fence preceding writes to DMA-allocated memory
+ * @ptr: Any pointer into the region that was written
+ *
+ * Ensures all preceding CPU writes to DMA memory are committed to the
+ * target memory controller before subsequent MMIO register writes.
+ *
+ * Required on FPGA platforms where DMA memory and device control
+ * registers reside on different AXI slaves -- a CPU-side wmb() only
+ * orders store dispatch, not arrival at the target. A read from the
+ * DMA memory slave forces the memory controller to serialize behind
+ * all preceding writes from this CPU before responding, guaranteeing
+ * the data is committed before the doorbell register write is issued.
+ *
+ * On standard DMA API platforms (cache-coherent), this is a no-op.
+ */
+void cmh_dma_fence(void *ptr);
+
+/**
+ * cmh_dma_zero() - Zero a DMA-allocated buffer
+ * @dst: Destination pointer (from cmh_dma_alloc)
+ * @len: Number of bytes to zero
+ */
+void cmh_dma_zero(void *dst, size_t len);
+
+/*
+ * CMH eSW scatter-gather chain -- built with proper DMA mappings.
+ *
+ * The CMH eSW DMAC walks a linked list of dma_scattergather_item
+ * descriptors.
Each .src is the DMA address of an input buffer;
+ * each .lli is the DMA address of the next descriptor (0 = end).
+ *
+ * The descriptor array uses streaming DMA (kmalloc + dma_map_single)
+ * so that cmh_dma_free_sg() is safe from any context -- including
+ * BH-disabled completion callbacks where dma_free_coherent's
+ * vunmap() path would crash on non-coherent architectures.
+ */
+
+/* Input descriptor for cmh_dma_build_sg() -- one per data buffer */
+struct cmh_dma_buf {
+	void *data;
+	u32 len;
+};
+
+/* Opaque handle returned by cmh_dma_build_sg(); pass to cmh_dma_free_sg() */
+struct cmh_sg_map {
+	struct dma_scattergather_item *items; /* CPU virtual address */
+	dma_addr_t items_dma; /* DMA address (pass to GATHER cmd) */
+	size_t items_size; /* allocation size */
+	u32 count;
+	struct {
+		dma_addr_t dma;
+		u32 len;
+	} bufs[]; /* per-entry source DMA handles */
+};
+
+/**
+ * cmh_dma_build_sg() - Build a DMA-mapped CMH eSW SG chain
+ * @bufs: Array of kernel buffer descriptors (data pointer + length)
+ * @count: Number of entries in @bufs (must be > 0; returns NULL for 0)
+ * @gfp: Allocation flags (GFP_KERNEL or GFP_ATOMIC)
+ *
+ * Allocates a dma_scattergather_item chain using streaming DMA
+ * (kmalloc + dma_map_single), DMA-maps each source buffer, and
+ * links the descriptors.
+ * The returned cmh_sg_map->items_dma is the address to pass to
+ * vcq_add_hc_gather() (or any core's scatter-gather command).
+ *
+ * Caller contract:
+ * - Each bufs[i].data must point to DMA-mappable memory (kmalloc,
+ *   page-allocated, or vmalloc with DMA support). Stack buffers
+ *   are NOT safe.
+ * - Each bufs[i].len must be > 0.
+ * - The returned cmh_sg_map must remain alive (not freed) until
+ *   the hardware completes the scatter-gather operation. Only then
+ *   may cmh_dma_free_sg() be called.
+ * - There is no hardware-imposed limit on @count, but callers are
+ *   responsible for bounding it to avoid excessive DMA mappings.
+ *   In practice, hash uses <= 2 entries (partial + new data).
+ *
+ * Return: Opaque cmh_sg_map handle, or NULL on allocation/mapping failure.
+ */
+struct cmh_sg_map *cmh_dma_build_sg(const struct cmh_dma_buf *bufs, u32 count,
+				    gfp_t gfp);
+
+/**
+ * cmh_dma_free_sg() - Unmap all buffers and free the SG chain
+ * @sgm: Handle from cmh_dma_build_sg(), or NULL (no-op)
+ */
+void cmh_dma_free_sg(struct cmh_sg_map *sgm);
+
+/*
+ * Orphan-DMA context -- generic helper for the noabort submit path.
+ *
+ * When cmh_tm_submit_sync_noabort() times out with a VCQ still
+ * in-flight, the eSW will continue writing to DMA buffers after the
+ * caller returns. Callers wrap their DMA state in this struct and
+ * pass cmh_dma_orphan_free as the orphan_cb -- the RH callback frees
+ * the mapping + buffer when the VCQ eventually completes.
+ *
+ * Drain guarantee: cmh_tm_cleanup() calls timer_delete_sync() on each
+ * TXN timeout timer and splices all TXQ entries before invoking their
+ * completion callbacks. This ensures no orphan callback can race with
+ * or run after TM cleanup completes -- by that point every in-flight
+ * transaction has been force-completed and its orphan_cb invoked.
+ */
+struct cmh_dma_orphan {
+	void *buf;
+	dma_addr_t addr;
+	size_t len;
+	enum dma_data_direction dir;
+};
+
+void cmh_dma_orphan_free(void *data);
+
+#endif /* CMH_DMA_H */
diff --git a/drivers/crypto/cmh/include/cmh_drbg_abi.h b/drivers/crypto/cmh/include/cmh_drbg_abi.h
new file mode 100644
index 000000000000..d4cebfe83d4b
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_drbg_abi.h
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- DRBG Core ABI Definitions
+ *
+ * Kernel-side definitions for the CMH DRBG ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ */
+
+#ifndef CMH_DRBG_ABI_H
+#define CMH_DRBG_ABI_H
+
+#include
+
+/* DRBG Commands */
+
+#define DRBG_CMD_CONFIG 0x01U
+#define DRBG_CMD_GENERATE 0x02U
+#define DRBG_CMD_DATASTORE 0x03U
+#define DRBG_CMD_RESET 0x04U
+
+/* DRBG Entropy Ratio (per CMH DRBG ABI) */
+
+#define DRBG_ENTROPY_RATIO_ONE 0U
+#define DRBG_ENTROPY_RATIO_ONE_HALF 1U
+#define DRBG_ENTROPY_RATIO_ONE_THIRD 2U
+#define DRBG_ENTROPY_RATIO_ONE_FOURTH 3U
+
+/* DRBG Security Strength (per CMH DRBG ABI) */
+
+#define DRBG_SECURITY_STRENGTH_128 0x00U
+#define DRBG_SECURITY_STRENGTH_256 0x10U
+
+/* DRBG Personalization Data Length */
+
+#define DRBG_PADATA_LEN 16U
+
+/* DRBG Command Structures */
+
+struct drbg_cmd_config {
+	u32 entropy_ratio; /* drbg_entropy_ratio value */
+	u32 security_strength; /* drbg_security_strength value */
+	u8 padata[DRBG_PADATA_LEN];
+};
+
+struct drbg_cmd_generate {
+	u64 dst; /* DMA physical address for output */
+	u32 len; /* requested output length in bytes */
+	u8 padata[DRBG_PADATA_LEN];
+};
+
+struct drbg_cmd_datastore {
+	u64 ref; /* datastore reference */
+	u32 len; /* data length in bytes */
+	u32 type; /* datastore type */
+	u8 padata[DRBG_PADATA_LEN];
+};
+
+/* DRBG Command Union */
+
+union drbg_cmd {
+	struct drbg_cmd_config cmd_config;
+	struct drbg_cmd_generate cmd_generate;
+	struct drbg_cmd_datastore cmd_datastore;
+};
+
+#endif /* CMH_DRBG_ABI_H */
diff --git a/drivers/crypto/cmh/include/cmh_eac_abi.h b/drivers/crypto/cmh/include/cmh_eac_abi.h
new file mode 100644
index 000000000000..f0ebd3de1fb4
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_eac_abi.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- EAC (Error and Alarm Controller) ABI Definitions
+ *
+ * Kernel-side definitions for the CMH EAC ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ */
+
+#ifndef CMH_EAC_ABI_H
+#define CMH_EAC_ABI_H
+
+#include
+
+/* EAC Commands */
+
+#define EAC_CMD_READ 0x01U
+
+/* EAC Read Response -- eSW writes this to the DMA destination buffer */
+
+struct eac_read_rsp {
+	u64 mailbox_notification; /* bitmask: MBX that raised safety notif */
+	u32 hw_error; /* bitmask: HWC that raised error */
+	u32 hw_nmi; /* bitmask: HWC that raised NMI */
+	u32 hw_panic; /* bitmask: HWC that raised HW panic */
+	u32 safety_fatal; /* bitmask: HWC that raised fatal safety */
+	u32 safety_notification; /* bitmask: HWC that raised safety notif */
+	u32 sw_info0; /* eSW tracing information */
+	u32 sw_info1; /* eSW tracing information */
+	u32 sram_bank_errors[4]; /* correctable ECC error counts per bank */
+};
+
+/* EAC Command Structures */
+
+struct eac_cmd_read {
+	u64 dst; /* DMA destination for eac_read_rsp */
+	u32 len; /* must be >= sizeof(struct eac_read_rsp) */
+};
+
+union eac_cmd {
+	struct eac_cmd_read cmd_read;
+};
+
+#endif /* CMH_EAC_ABI_H */
diff --git a/drivers/crypto/cmh/include/cmh_hc_abi.h b/drivers/crypto/cmh/include/cmh_hc_abi.h
new file mode 100644
index 000000000000..4e8c5ea3c69c
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_hc_abi.h
@@ -0,0 +1,162 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- Hash Core (HC) ABI Definitions
+ *
+ * Kernel-side definitions for the CMH HC (Hash Core) ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ */
+
+#ifndef CMH_HC_ABI_H
+#define CMH_HC_ABI_H
+
+#include
+#include
+
+/* HC Commands */
+
+#define HC_CMD_INIT 0x01U
+#define HC_CMD_HMAC 0x02U
+#define HC_CMD_UPDATE 0x03U
+#define HC_CMD_FINAL 0x04U
+#define HC_CMD_UPDATE2D 0x05U
+#define HC_CMD_SQUEEZE 0x07U
+#define HC_CMD_GATHER 0x08U
+#define HC_CMD_CSHAKE 0x09U
+#define HC_CMD_KMAC 0x0AU
+#define HC_CMD_SAVE 0x0BU
+#define HC_CMD_RESTORE 0x0CU
+
+/* HC Algorithms (per CMH HC ABI) */
+
+#define HC_ALGO_SHA2_224 1U
+#define HC_ALGO_SHA2_256 2U
+#define HC_ALGO_SHA2_384 3U
+#define HC_ALGO_SHA2_512 4U
+#define HC_ALGO_SHA3_224 5U
+#define HC_ALGO_SHA3_256 6U
+#define HC_ALGO_SHA3_384 7U
+#define HC_ALGO_SHA3_512 8U
+#define HC_ALGO_SHAKE128 9U
+#define HC_ALGO_SHAKE256 10U
+
+/* HC Algo Flags */
+
+#define HC_ALGO_FLAG_SCA_KEY BIT(18) /* SCA key in 2 shares */
+#define HC_ALGO_FLAG_SCA_OUT BIT(19) /* SCA output in 2 shares */
+
+#define HC_ALGO_SET(flags, algo) (((flags) & 0xFF0000UL) | ((algo) & 0xFFUL))
+#define HC_ALGO_GET(algo) ((algo) & 0xFFU)
+
+/* Hash Digest Sizes */
+
+#define CMH_SHA224_DIGEST_SIZE 28U
+#define CMH_SHA256_DIGEST_SIZE 32U
+#define CMH_SHA384_DIGEST_SIZE 48U
+#define CMH_SHA512_DIGEST_SIZE 64U
+
+/* SHA-3 digest sizes are the same as SHA-2 for matching output widths */
+#define CMH_SHA3_224_DIGEST_SIZE 28U
+#define CMH_SHA3_256_DIGEST_SIZE 32U
+#define CMH_SHA3_384_DIGEST_SIZE 48U
+#define CMH_SHA3_512_DIGEST_SIZE 64U
+
+/* SHAKE default output lengths (fixed-output ahash registration) */
+#define CMH_SHAKE128_DIGEST_SIZE 32U /* 128-bit security -> 32 bytes */
+#define CMH_SHAKE256_DIGEST_SIZE 64U /* 256-bit security -> 64 bytes */
+
+/* HC Context (for SAVE/RESTORE) */
+
+#define HC_CONTEXT_WORDS 149U
+#define HC_CONTEXT_SIZE (HC_CONTEXT_WORDS * 4 + 4) /* ctx[149] + crc */
+
+/* cSHAKE function name max length */
+
+#define HC_CSHAKE_MAX_NAMELEN 36U
+
+/*
+ * Maximum customization string (S) length for cSHAKE / KMAC.
+ *
+ * S is packed as inline VCQ data after the CSHAKE/KMAC command slot.
+ * The worst-case VCQ layout (KMAC with raw key + GATHER) uses 5 fixed
+ * slots out of CMH_KMAC_MAX_PAYLOAD (9), leaving 4 inline slots.
+ * Each VCQ slot is 64 bytes, so the safe limit is 4 * 64 = 256 bytes.
+ */
+#define HC_CSHAKE_MAX_CUSTOMLEN 256U
+
+/* HC Command Structures */
+
+struct hc_cmd_init {
+	u32 algo; /* hc_algo value, optionally ORed with HC_ALGO_FLAG_* */
+};
+
+struct hc_cmd_hmac {
+	u64 key; /* datastore reference for HMAC key */
+	u32 keylen; /* key length in bytes */
+	u32 algo; /* hc_algo value */
+};
+
+struct hc_cmd_update {
+	u64 input; /* DMA physical address of input data */
+	u32 inlen; /* input data length in bytes */
+};
+
+struct hc_cmd_final {
+	u64 digest; /* DMA physical address for output digest */
+	u32 outlen; /* digest length in bytes */
+};
+
+struct hc_cmd_update2d {
+	u64 input; /* DMA source address for input data */
+	u64 output; /* DMA destination address for pass-through data */
+	u32 iolen; /* input/pass-through data length in bytes */
+};
+
+struct hc_cmd_gather {
+	u64 lista; /* DMA address of dma_scattergather_item chain */
+	u32 sgcmd; /* HC sub-command: HC_CMD_UPDATE or HC_CMD_UPDATE2D */
+};
+
+struct hc_cmd_cshake {
+	u64 custom; /* DMA address for the customization string */
+	u32 customlen; /* length of the customization string */
+	u32 algo; /* HC_ALGO_SHAKE128 or HC_ALGO_SHAKE256 */
+	u32 namelen; /* length of the function name string */
+	u8 name[HC_CSHAKE_MAX_NAMELEN]; /* function name string (inline) */
+};
+
+struct hc_cmd_kmac {
+	u64 key; /* datastore reference for KMAC key */
+	u64 custom; /* DMA address for the customization string */
+	u32 keylen; /* key length in bytes */
+	u32 customlen; /* length of the customization string */
+	u32 algo; /* HC_ALGO_SHAKE128 or HC_ALGO_SHAKE256 */
+	u32 outlen; /* requested output digest length */
+};
+
+struct hc_cmd_save {
+	u64 output; /* DMA physical address for saved context
*/
+	u32 outlen; /* must be HC_CONTEXT_SIZE */
+};
+
+struct hc_cmd_restore {
+	u64 input; /* DMA physical address of saved context */
+	u32 inlen; /* must be HC_CONTEXT_SIZE */
+};
+
+/* HC Command Union */
+
+union hc_cmd {
+	struct hc_cmd_init cmd_init;
+	struct hc_cmd_hmac cmd_hmac;
+	struct hc_cmd_cshake cmd_cshake;
+	struct hc_cmd_kmac cmd_kmac;
+	struct hc_cmd_update cmd_update;
+	struct hc_cmd_final cmd_final;
+	struct hc_cmd_update2d cmd_update2d;
+	struct hc_cmd_gather cmd_gather;
+	struct hc_cmd_save cmd_save;
+	struct hc_cmd_restore cmd_restore;
+};
+
+#endif /* CMH_HC_ABI_H */
diff --git a/drivers/crypto/cmh/include/cmh_hcq_abi.h b/drivers/crypto/cmh/include/cmh_hcq_abi.h
new file mode 100644
index 000000000000..b9fc2a80a408
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_hcq_abi.h
@@ -0,0 +1,221 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- HCQ Core ABI Definitions
+ *
+ * Kernel-side definitions for the CMH HCQ ABI.
+ * All constants and layouts derived from the CMH eSW ABI.
+ */
+
+#ifndef CMH_HCQ_ABI_H
+#define CMH_HCQ_ABI_H
+
+#include
+#include
+
+/* VCQ layout: header + [SYS cmds] + HCQ_CMD + [sys_read] + flush */
+#define HCQ_VCQ_CMDS_MIN 3 /* header + cmd + flush */
+#define HCQ_VCQ_CMDS_MAX 6 /* keygen: hdr+new+write+cmd+read+flush */
+
+/* HCQ Command IDs */
+#define HCQ_CMD_XMSS_VERIFY 0x03U
+#define HCQ_CMD_LMS_VERIFY 0x04U
+#define HCQ_CMD_SLHDSA_VERIFY_INTERNAL 0x05U
+#define HCQ_CMD_SLHDSA_VERIFY 0x06U
+#define HCQ_CMD_SLHDSA_VERIFY_PREHASH 0x07U
+#define HCQ_CMD_SLHDSA_VERIFY_PREHASH_DIGEST 0x08U
+#define HCQ_CMD_SLHDSA_KEYGEN 0x09U
+#define HCQ_CMD_SLHDSA_SIGN_INTERNAL 0x10U
+#define HCQ_CMD_SLHDSA_SIGN 0x11U
+#define HCQ_CMD_SLHDSA_SIGN_PREHASH 0x12U
+#define HCQ_CMD_SLHDSA_SIGN_PREHASH_DIGEST 0x13U
+#define HCQ_CMD_SLHDSA_PUBGEN 0x14U
+
+/* SLH-DSA Parameter Set IDs */
+#define HCQ_SLHDSA_SHAKE_128S 1U
+#define HCQ_SLHDSA_SHAKE_128F 2U
+#define HCQ_SLHDSA_SHAKE_192S 3U
+#define HCQ_SLHDSA_SHAKE_192F 4U
+#define HCQ_SLHDSA_SHAKE_256S 5U
+#define HCQ_SLHDSA_SHAKE_256F 6U
+#define HCQ_SLHDSA_SHA2_128S 7U
+#define HCQ_SLHDSA_SHA2_128F 8U
+#define HCQ_SLHDSA_SHA2_192S 9U
+#define HCQ_SLHDSA_SHA2_192F 10U
+#define HCQ_SLHDSA_SHA2_256S 11U
+#define HCQ_SLHDSA_SHA2_256F 12U
+#define HCQ_SLHDSA_PARAM_MAX 12U
+
+/* SLH-DSA Prehash Algorithm IDs */
+#define HCQ_SLHDSA_PREHASH_SHA256 1U
+#define HCQ_SLHDSA_PREHASH_SHA512 2U
+#define HCQ_SLHDSA_PREHASH_SHAKE128 3U
+#define HCQ_SLHDSA_PREHASH_SHAKE256 4U
+
+/* SLH-DSA size limits */
+#define SLHDSA_MAX_PK_SIZE 64U /* 2*n, n=32 */
+#define SLHDSA_MAX_SK_SIZE 128U /* 4*n, n=32 */
+#define SLHDSA_MAX_SEED_SIZE 96U /* 3*n, n=32 */
+#define SLHDSA_MAX_SIG_SIZE 49856U /* SHAKE-256f / SHA2-256f */
+#define SLHDSA_MAX_MSG_LEN 128U
+#define SLHDSA_MAX_CTX_LEN 255U
+
+/* LMS/HSS size limits -- derived from eSW HCQ ABI constraints */
+#define LMS_MAX_PK_LEN 60U /* eSW public-key buffer */
+#define LMS_MAX_MSG_LEN 256U /* SHS_LMS_MESSAGE_LEN_MAX */
+#define LMS_MAX_SIG_LEN 13364U /*
eSW signature buffer */ + +/* XMSS/XMSS-MT size limits -- derived from eSW HCQ ABI constraints */ +#define XMSS_MAX_PK_LEN 136U /* eSW public-key buffer */ +#define XMSS_MAX_MSG_LEN 64U /* SHS_XMSS_MESSAGE_LEN_MAX */ +#define XMSS_MAX_SIG_LEN 27688U /* eSW signature buffer */ + +/* SLH-DSA n-value for each parameter set (index = param_set - 1) */ +extern const u32 slhdsa_n[]; + +/* SLH-DSA signature sizes (index = param_set - 1) */ +extern const u32 slhdsa_sig_size[]; + +/* Derive PK/SK/seed sizes from n */ +static inline u32 slhdsa_pk_size(u32 param_set) +{ + if (param_set < 1U || param_set > HCQ_SLHDSA_PARAM_MAX) + return 0; + return 2U * slhdsa_n[param_set - 1U]; +} + +static inline u32 slhdsa_sk_size(u32 param_set) +{ + if (param_set < 1U || param_set > HCQ_SLHDSA_PARAM_MAX) + return 0; + return 4U * slhdsa_n[param_set - 1U]; +} + +static inline u32 slhdsa_seed_size(u32 param_set) +{ + if (param_set < 1U || param_set > HCQ_SLHDSA_PARAM_MAX) + return 0; + return 3U * slhdsa_n[param_set - 1U]; +} + +static inline u32 slhdsa_get_sig_size(u32 param_set) +{ + if (param_set < 1U || param_set > HCQ_SLHDSA_PARAM_MAX) + return 0; + return slhdsa_sig_size[param_set - 1U]; +} + +/* HCQ Command Structures -- match CMH eSW ABI exactly */ + +struct hcq_cmd_xmss_verify { + u32 xmss_mt; /* 0 = XMSS, 1 = XMSS-MT */ + u32 pk_len; + u32 sig_len; + u32 dig_len; + u64 pk; + u64 sig; + u64 dig; +}; + +struct hcq_cmd_lms_verify { + u32 lms_hss; /* 0 = LMS, 1 = LMS-HSS */ + u32 pk_len; + u32 sig_len; + u32 dig_len; + u64 pk; + u64 sig; + u64 dig; +}; + +struct hcq_cmd_slhdsa_verify_internal { + u32 parameter_set; + u32 message_len; + u64 message; + u64 pk; + u64 sig; +}; + +struct hcq_cmd_slhdsa_verify { + u32 parameter_set; + u32 message_len; + u64 message; + u64 context; + u64 pk; + u64 sig; + u32 context_len; +}; + +struct hcq_cmd_slhdsa_verify_prehash { + u32 parameter_set; + u32 prehash_algo; + u32 message_len; + u32 context_len; + u64 message; + u64 context; + u64 
pk; + u64 sig; +}; + +struct hcq_cmd_slhdsa_keygen { + u32 parameter_set; + u32 seed_len; + u32 pk_len; + u32 sk_len; + u64 seed; /* DS reference */ + u64 pk; /* extmem addr */ + u64 sk; /* DS reference */ +}; + +struct hcq_cmd_slhdsa_sign_internal { + u32 parameter_set; + u32 message_len; + u64 add_random; /* extmem addr, 0 = none */ + u64 message; + u64 sk; /* DS reference */ + u64 sig; /* extmem addr */ +}; + +struct hcq_cmd_slhdsa_sign { + u32 parameter_set; + u32 message_len; + u64 add_random; + u64 message; + u64 context; + u64 sk; /* DS reference */ + u64 sig; /* extmem addr */ + u32 context_len; +}; + +struct hcq_cmd_slhdsa_sign_prehash { + u32 parameter_set; + u32 prehash_algo; + u32 message_len; + u32 context_len; + u64 add_random; + u64 message; + u64 context; + u64 sk; /* DS reference */ + u64 sig; /* extmem addr */ +}; + +struct hcq_cmd_slhdsa_pubgen { + u32 parameter_set; + u32 sk_len; + u64 sk; /* DS reference */ + u64 pk; /* extmem addr */ +}; + +union hcq_cmd { + struct hcq_cmd_xmss_verify cmd_xmss_verify; + struct hcq_cmd_lms_verify cmd_lms_verify; + struct hcq_cmd_slhdsa_verify_internal cmd_slhdsa_verify_internal; + struct hcq_cmd_slhdsa_verify cmd_slhdsa_verify; + struct hcq_cmd_slhdsa_verify_prehash cmd_slhdsa_verify_prehash; + struct hcq_cmd_slhdsa_keygen cmd_slhdsa_keygen; + struct hcq_cmd_slhdsa_sign_internal cmd_slhdsa_sign_internal; + struct hcq_cmd_slhdsa_sign cmd_slhdsa_sign; + struct hcq_cmd_slhdsa_sign_prehash cmd_slhdsa_sign_prehash; + struct hcq_cmd_slhdsa_pubgen cmd_slhdsa_pubgen; +}; + +#endif /* CMH_HCQ_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_kic_abi.h b/drivers/crypto/cmh/include/cmh_kic_abi.h new file mode 100644 index 000000000000..7f4fe3b9fd89 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_kic_abi.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). 
+ * CMH LKM -- KIC Core ABI Definitions + * + * Kernel-side definitions for the CMH KIC ABI (KIC commands only). + * Derived from the CMH eSW ABI. + */ + +#ifndef CMH_KIC_ABI_H +#define CMH_KIC_ABI_H + +#include + +/* KIC Commands */ + +#define KIC_CMD_HKDF1 0x06U +#define KIC_CMD_HKDF2 0x07U +#define KIC_CMD_AES_CMAC_KDF 0x08U +#define KIC_CMD_DKEK_DERIVE 0x09U + +/* Maximum key size for KIC operations (bytes) */ +#define KIC_KEY_SIZE 32U + +/* + * KIC Command Structures + * + * Field names (llen, len) mirror the CMH eSW ABI register layout. + * llen = label length, len = output key length. + */ + +struct kic_cmd_hkdf1 { + u64 dst; /* DS ref for derived key (SYS_REF_LAST) */ + u64 base; /* base key reference (e.g., KIC_KEY1) */ + u64 label; /* label pointer (0 for inline-next-slot) */ + u32 llen; /* label length */ + u32 len; /* output key length */ + u32 type; /* SYS_TYPE_SET(flags, core_id) */ +}; + +struct kic_cmd_hkdf2 { + u64 dst; /* DS ref for derived key */ + u64 base; /* base key reference */ + u64 salt; /* salt key reference (SYS_REF_NONE = no salt) */ + u64 label; /* label pointer */ + u32 llen; /* label length */ + u32 len; /* output key length */ + u32 type; /* SYS_TYPE_SET(flags, core_id) */ +}; + +struct kic_cmd_aes_cmac_kdf { + u64 base_key; /* KIC/DS reference for base key */ + u64 out_key; /* DS reference for derived key */ + u64 label; /* label DMA address */ + u32 key_len; /* base & output key length (must be 32) */ + u32 label_len; /* label length */ + u32 type; /* SYS_TYPE_SET(flags, core_id) for output */ +}; + +struct kic_cmd_dkek_derive { + u64 base_key; /* KIC base key reference */ + u64 out_key; /* DS reference for the derived KEK */ + u32 host_id; /* host ID (0 = caller's own) */ + u32 metadata_len; /* metadata length */ + u64 metadata; /* metadata DMA address */ +}; + +/* KIC Command Union */ + +union kic_cmd { + struct kic_cmd_hkdf1 cmd_hkdf1; + struct kic_cmd_hkdf2 cmd_hkdf2; + struct kic_cmd_aes_cmac_kdf cmd_aes_cmac_kdf; + 
struct kic_cmd_dkek_derive cmd_dkek_derive; +}; + +#endif /* CMH_KIC_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_mqi.h b/drivers/crypto/cmh/include/cmh_mqi.h new file mode 100644 index 000000000000..93b847859953 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_mqi.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Mailbox Queue Initializer + * + * Allocates DMA-capable queue buffers and programs MBX registers + * via the MBX lock/setup/enable/unlock register sequence. + */ + +#ifndef CMH_MQI_H +#define CMH_MQI_H + +#include "cmh_config.h" + +#define MBX_LOCK_TIMEOUT_MS 1000 +#define MBX_LOCK_POLL_MIN_US 10 +#define MBX_LOCK_POLL_MAX_US 50 +#define MBX_HOST_INFO_LKM 0x4C4B4D00U /* "LKM\0" as host identifier */ + +/** + * cmh_mqi_init() - Allocate MBX queue buffers and program registers + * @cfg: Global device configuration + * + * Performs the lock/setup/enable/unlock sequence for each configured MBX. + * + * Return: 0 on success, negative errno on failure. + */ +int cmh_mqi_init(struct cmh_config *cfg); + +/** + * cmh_mqi_cleanup() - Free MBX queue buffers and release locks + * @cfg: Global device configuration + */ +void cmh_mqi_cleanup(struct cmh_config *cfg); + +#endif /* CMH_MQI_H */ diff --git a/drivers/crypto/cmh/include/cmh_pke_abi.h b/drivers/crypto/cmh/include/cmh_pke_abi.h new file mode 100644 index 000000000000..e0e7b946b4e3 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_pke_abi.h @@ -0,0 +1,272 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- PKE Core ABI Definitions + * + * Kernel-side definitions for the CMH PKE ABI. + * All constants and layouts derived from the CMH eSW ABI. 
+ */ + +#ifndef CMH_PKE_ABI_H +#define CMH_PKE_ABI_H + +#include + +/* PKE Command IDs */ + +#define PKE_CMD_ECDSA_VERIFY 0x03U +#define PKE_CMD_ECDSA_SIGN 0x04U +#define PKE_CMD_ECDSA_PUBGEN 0x05U +#define PKE_CMD_ECDSA_KEYGEN 0x06U +#define PKE_CMD_EDDSA_VERIFY 0x07U +#define PKE_CMD_EDDSA_SIGN 0x08U +#define PKE_CMD_EDDSA_PUBGEN 0x09U +#define PKE_CMD_ECDH_KEYGEN 0x0AU +#define PKE_CMD_ECDH 0x0BU +#define PKE_CMD_RSA_ENC 0x0CU +#define PKE_CMD_RSA_DEC 0x0DU +#define PKE_CMD_RSA_KEYGEN 0x0EU +#define PKE_CMD_RSA_CRT_DEC 0x0FU +#define PKE_CMD_SM2_ECDH_KEYGEN 0x16U +#define PKE_CMD_SM2_ECDH 0x17U +#define PKE_CMD_SM2_DEC_POINT 0x18U +#define PKE_CMD_SM2_ENC_POINT 0x19U +#define PKE_CMD_SM2_ID_DIGEST 0x1AU +#define PKE_CMD_SM2_ECDH_HASH 0x1BU +#define PKE_CMD_SM2_DEC_HASH 0x1CU +#define PKE_CMD_SM2_ENC_HASH 0x1DU +#define PKE_CMD_EDDSA_PRIV_KEYGEN_SCA 0x21U +#define PKE_CMD_FLUSH 0xFFU + +/* EC Curve IDs (per CMH PKE ABI) */ + +#define PKE_CURVE_P192 0x01U +#define PKE_CURVE_P224 0x02U +#define PKE_CURVE_P256 0x03U +#define PKE_CURVE_P384 0x04U +#define PKE_CURVE_P521 0x05U +#define PKE_CURVE_SECP256K1 0x07U +#define PKE_CURVE_BP192R1 0x11U +#define PKE_CURVE_BP224R1 0x12U +#define PKE_CURVE_BP256R1 0x13U +#define PKE_CURVE_BP320R1 0x14U +#define PKE_CURVE_BP384R1 0x15U +#define PKE_CURVE_BP512R1 0x16U +#define PKE_CURVE_ANSSI_FRP256V1 0x17U +#define PKE_CURVE_SM2 0x18U +#define PKE_CURVE_25519 0x21U +#define PKE_CURVE_448 0x22U + +/* PKE Command Structures -- match CMH eSW ABI exactly */ + +struct pke_cmd_ecdsa_verify { + u32 curve; + u32 digest_len; + u64 public_key; + u64 digest; + u64 signature; + u64 rprime; +}; + +struct pke_cmd_ecdsa_sign { + u32 curve; + u32 secret_key_len; + u64 digest; + u64 signature; + u64 secret_key; /* DS reference */ + u32 digest_len; +}; + +struct pke_cmd_ecdsa_pubgen { + u32 curve; + u32 secret_key_len; + u64 public_key; + u64 secret_key; /* DS reference */ +}; + +struct pke_cmd_ecdsa_keygen { + u32 curve; + u32 secret_key_len; + 
u64 secret_key; /* DS reference */ + u32 secret_key_type; +}; + +struct pke_cmd_eddsa_verify { + u32 curve; + u32 digest_len; + u64 public_key_y; + u64 digest; + u64 signature; + u64 rprime; +}; + +struct pke_cmd_eddsa_sign { + u32 curve; + u32 secret_key_len; + u64 digest; + u64 signature; + u64 secret_key; /* DS reference */ + u32 digest_len; +}; + +struct pke_cmd_eddsa_pubgen { + u32 curve; + u32 secret_key_len; + u64 public_key_y; + u64 secret_key; /* DS reference */ +}; + +struct pke_cmd_ecdh_keygen { + u32 curve; + u32 secret_key_len; + u64 public_key_x; + u64 secret_key; /* DS reference */ +}; + +struct pke_cmd_ecdh { + u32 curve; + u32 secret_key_len; + u32 shared_secret_len; + u32 shared_secret_type; + u64 peer_key_x; + u64 secret_key; /* DS reference */ + u64 shared_secret; /* DS reference for result */ +}; + +struct pke_cmd_rsa_enc { + u32 bits; + u32 e_len; + u64 e; + u64 n; + u64 m; + u64 c; +}; + +struct pke_cmd_rsa_dec { + u32 bits; + u32 e_len; + u64 e; + u64 n; + u64 c; + u64 m; + u64 d; /* DS reference */ +}; + +struct pke_cmd_rsa_crt_dec { + u32 bits; + u32 e_len; + u64 e; + u64 n; + u64 c; + u64 m; + u64 crt; /* DS reference */ +}; + +struct pke_cmd_rsa_keygen { + u32 bits; + u32 d_type; + u64 e; + u64 n; + u64 d; /* DS reference */ + u64 crt; /* DS reference */ + u32 crt_type; +}; + +struct pke_cmd_eddsa_keygen_sca { + u32 curve; + u64 secret_key; /* DS reference: input normal SK */ + u64 sca_secret_key; /* DS reference: output blinded SK */ +}; + +/* SM2 Command Structures */ + +struct pke_cmd_sm2_ecdh_keygen { + u64 nonce; /* DMA addr (32B input or output) */ + u64 session_key; /* DMA addr output (64B) */ + u32 nonce_len; /* 0 = HW generates, 32 = caller provides */ +}; + +struct pke_cmd_sm2_ecdh { + u32 nonce_len; /* 0 or 32 */ + u32 private_key_len; /* must be 32 */ + u64 nonce; /* DMA addr (32B) */ + u64 peer_public_key; /* DMA addr (64B) */ + u64 peer_session_key; /* DMA addr (64B) */ + u64 private_key; /* DS reference */ + u64 
shared_point; /* DS reference (output, 64B) */ + u32 shared_point_type; /* SYS_TYPE_SET(flags, CORE_ID_PKE) */ +}; + +struct pke_cmd_sm2_dec_point { + u32 ciphertext_len; /* total CT length (97..128) */ + u32 private_key_len; /* must be 32 */ + u64 ciphertext; /* DMA addr (64B: C1 point) */ + u64 dec_point; /* DMA addr output (64B) */ + u64 private_key; /* DS reference */ +}; + +struct pke_cmd_sm2_enc_point { + u64 nonce; /* DMA addr (32B, optional) */ + u64 public_key; /* DMA addr (64B) */ + u64 ciphertext; /* DMA addr output (64B: C1) */ + u64 enc_point; /* DMA addr output (64B) */ + u32 nonce_len; /* 0 or 32 */ +}; + +struct pke_cmd_sm2_id_digest { + u64 id; /* DMA addr (identity, <=32B) */ + u64 public_key; /* DMA addr (64B) */ + u64 digest; /* DMA addr output (32B) */ + u32 id_len; /* identity length in bytes */ +}; + +struct pke_cmd_sm2_ecdh_hash { + u64 peer_id_digest; /* DMA addr (32B) */ + u64 id_digest; /* DMA addr (32B) */ + u64 shared_point; /* DS reference (64B input) */ + u64 shared_key; /* DS reference (16B output) */ + u32 shared_key_type; /* SYS_TYPE_SET(flags, CORE_ID_PKE) */ +}; + +struct pke_cmd_sm2_dec_hash { + u64 ciphertext; /* DMA addr (full ciphertext) */ + u64 dec_point; /* DMA addr (64B) */ + u64 plaintext; /* DMA addr output (ct_len - 96 bytes) */ + u32 ciphertext_len; /* 97..128 */ +}; + +struct pke_cmd_sm2_enc_hash { + u64 message; /* DMA addr (plaintext) */ + u64 enc_point; /* DMA addr (64B) */ + u64 ciphertext; /* DMA addr output (96 + msg_len) */ + u32 message_len; /* 1..32 */ +}; + +/* PKE Command Union */ + +union pke_cmd { + struct pke_cmd_ecdsa_verify cmd_ecdsa_verify; + struct pke_cmd_ecdsa_sign cmd_ecdsa_sign; + struct pke_cmd_ecdsa_pubgen cmd_ecdsa_pubgen; + struct pke_cmd_ecdsa_keygen cmd_ecdsa_keygen; + struct pke_cmd_eddsa_verify cmd_eddsa_verify; + struct pke_cmd_eddsa_sign cmd_eddsa_sign; + struct pke_cmd_eddsa_pubgen cmd_eddsa_pubgen; + struct pke_cmd_ecdh_keygen cmd_ecdh_keygen; + struct pke_cmd_ecdh cmd_ecdh; + 
struct pke_cmd_rsa_enc cmd_rsa_enc; + struct pke_cmd_rsa_dec cmd_rsa_dec; + struct pke_cmd_rsa_crt_dec cmd_rsa_crt_dec; + struct pke_cmd_rsa_keygen cmd_rsa_keygen; + struct pke_cmd_eddsa_keygen_sca cmd_eddsa_keygen_sca; + struct pke_cmd_sm2_ecdh_keygen cmd_sm2_ecdh_keygen; + struct pke_cmd_sm2_ecdh cmd_sm2_ecdh; + struct pke_cmd_sm2_dec_point cmd_sm2_dec_point; + struct pke_cmd_sm2_enc_point cmd_sm2_enc_point; + struct pke_cmd_sm2_id_digest cmd_sm2_id_digest; + struct pke_cmd_sm2_ecdh_hash cmd_sm2_ecdh_hash; + struct pke_cmd_sm2_dec_hash cmd_sm2_dec_hash; + struct pke_cmd_sm2_enc_hash cmd_sm2_enc_hash; +}; + +#endif /* CMH_PKE_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_qse_abi.h b/drivers/crypto/cmh/include/cmh_qse_abi.h new file mode 100644 index 000000000000..9834620e21d7 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_qse_abi.h @@ -0,0 +1,181 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- QSE Core ABI Definitions + * + * Kernel-side definitions for the CMH QSE ABI. + * All constants and layouts derived from the CMH eSW ABI. 
+ */ + +#ifndef CMH_QSE_ABI_H +#define CMH_QSE_ABI_H + +#include +#include +#include + +/* VCQ layout: header + [SYS_NEW] + QSE_CMD + flush */ +#define QSE_VCQ_CMDS_MIN 3 /* header + cmd + flush */ +#define QSE_VCQ_CMDS_MAX 4 /* header + sys_new + cmd + flush */ + +/* QSE Flags */ +#define QSE_FLAG_USE_REF BIT(0) +#define QSE_FLAG_USE_RNG BIT(1) + +/* QSE Command IDs */ +#define QSE_CMD_ML_KEM_KEYGEN 0x01U +#define QSE_CMD_ML_KEM_ENC 0x02U +#define QSE_CMD_ML_KEM_DEC 0x03U +#define QSE_CMD_ML_DSA_KEYGEN 0x04U +#define QSE_CMD_ML_DSA_SIGN 0x05U +#define QSE_CMD_ML_DSA_VERIFY 0x06U +#define QSE_CMD_ML_KEM_KEYGEN_MASKED 0x07U +#define QSE_CMD_ML_KEM_ENC_MASKED 0x08U +#define QSE_CMD_ML_KEM_DEC_MASKED 0x09U +#define QSE_CMD_ML_DSA_KEYGEN_MASKED 0x0AU +#define QSE_CMD_ML_DSA_SIGN_MASKED 0x0BU + +/* ML-KEM category values */ +#define ML_KEM_K_512 2U +#define ML_KEM_K_768 3U +#define ML_KEM_K_1024 4U + +/* ML-DSA mode values */ +#define ML_DSA_MODE_44 2U +#define ML_DSA_MODE_65 3U +#define ML_DSA_MODE_87 5U + +/* ML-DSA special message length for externalMu (pre-hashed 64-byte input) */ +#define ML_DSA_MLEN_EXTERNAL_MU 0xFFFFFFFFU +#define ML_DSA_EXTMU_LEN 64U /* actual copy size for externalMu */ + +/* ML-DSA maximum message length */ +#define ML_DSA_MAX_MLEN 10240U + +/* Shared secret size */ +#define ML_KEM_SS_LEN 32U +#define ML_KEM_SS_LEN_MASKED 64U + +/* Seed sizes */ +#define QSE_SEED_LEN 32U +#define QSE_SEED_LEN_MASKED 64U + +/* + * ML-KEM size tables -- indexed by (k - 2). + * [0] = ML-KEM-512 (k=2) + * [1] = ML-KEM-768 (k=3) + * [2] = ML-KEM-1024 (k=4) + */ +#define ML_KEM_LEVELS 3U + +#define ML_KEM_EK_SIZE(k) (384U * (k) + 32U) +#define ML_KEM_DK_SIZE(k) (768U * (k) + 96U) +#define ML_KEM_DK_SIZE_MASKED(k) (1152U * (k) + 128U) + +static inline u32 ml_kem_ct_size(u32 k) +{ + u32 du = (k == 4U) ? 11U : 10U; + u32 dv = (k == 4U) ? 
5U : 4U; + + return 32U * (k * du + dv); +} + +#define ML_KEM_CT_SIZE(k) ml_kem_ct_size(k) + +/* + * ML-DSA size tables -- indexed by mode. + * Mode values: 2 (ML-DSA-44), 3 (ML-DSA-65), 5 (ML-DSA-87). + */ +extern const u32 ml_dsa_pk_size[]; +extern const u32 ml_dsa_sk_size[]; +extern const u32 ml_dsa_sk_size_masked[]; +extern const u32 ml_dsa_sig_size[]; + +/* Map ML-DSA mode (2/3/5) -> table index (0/1/2) */ +static inline int ml_dsa_mode_idx(u32 mode) +{ + switch (mode) { + case 2: return 0; + case 3: return 1; + case 5: return 2; + default: return -1; + } +} + +/* Map ML-KEM k (2/3/4) -> table index (0/1/2), or -1 if invalid */ +static inline int ml_kem_k_idx(u32 k) +{ + if (k >= 2U && k <= 4U) + return (int)(k - 2U); + return -1; +} + +/* QSE Command Structures -- match CMH eSW ABI exactly */ + +struct qse_cmd_ml_kem_keygen { + u32 k; + u32 flags; + u64 seed; + u64 z; + u64 ek; + u64 dk; + u32 dk_type; +}; + +struct qse_cmd_ml_kem_enc { + u32 k; + u32 flags; + u64 coin; + u64 ek; + u64 ct; + u64 ss; + u32 ss_type; +}; + +struct qse_cmd_ml_kem_dec { + u32 k; + u32 flags; + u64 ct; + u64 dk; + u64 ss; + u32 ss_type; +}; + +struct qse_cmd_ml_dsa_keygen { + u32 mode; + u32 flags; + u64 seed; + u64 pk; + u64 sk; + u32 sk_type; +}; + +struct qse_cmd_ml_dsa_sign { + u32 mode; + u32 flags; + u64 rnd; + u64 m; + u64 sk; + u64 sig; + u32 mlen; +}; + +struct qse_cmd_ml_dsa_verify { + u32 mode; + u32 flags; + u64 m; + u64 pk; + u64 sig; + u32 mlen; +}; + +union qse_cmd { + struct qse_cmd_ml_kem_keygen cmd_ml_kem_keygen; + struct qse_cmd_ml_kem_enc cmd_ml_kem_enc; + struct qse_cmd_ml_kem_dec cmd_ml_kem_dec; + struct qse_cmd_ml_dsa_keygen cmd_ml_dsa_keygen; + struct qse_cmd_ml_dsa_sign cmd_ml_dsa_sign; + struct qse_cmd_ml_dsa_verify cmd_ml_dsa_verify; +}; + +#endif /* CMH_QSE_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_registers.h b/drivers/crypto/cmh/include/cmh_registers.h new file mode 100644 index 000000000000..9481b30b76d1 --- /dev/null +++ 
b/drivers/crypto/cmh/include/cmh_registers.h @@ -0,0 +1,145 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Hardware Register Definitions + * + * Derived from the CMH hardware register specification. + * All offsets are taken directly from the hardware documentation. + */ + +#ifndef CMH_REGISTERS_H +#define CMH_REGISTERS_H + +#include +#include + +/* MBX Instance Addressing */ + +#define CMH_MBX_INSTANCE_SHIFT 12 +#define CMH_MBX_INSTANCE_SIZE BIT(CMH_MBX_INSTANCE_SHIFT) /* 0x1000 */ +#define CMH_MAX_MBX_INSTANCES 64U + +/* MBX Per-Instance Register Offsets */ + +#define R_MBX_LOCK 0x000U +#define R_MBX_HOST_INFO 0x004U +#define R_MBX_QUEUE_LO 0x008U +#define R_MBX_QUEUE_HI 0x00CU +#define R_MBX_QUEUE_SLOTS 0x010U +#define R_MBX_QUEUE_STRIDE 0x014U +#define R_MBX_QUEUE_HEAD 0x018U +#define R_MBX_QUEUE_TAIL 0x01CU +#define R_MBX_INTERRUPT 0x020U +#define R_MBX_INTERRUPT_MASK 0x024U +#define R_MBX_COMMAND 0x028U +#define R_MBX_STATUS 0x02CU +#define R_MBX_CHILD 0x030U +#define R_MBX_ID 0x034U +#define R_MBX_HOST_CONFIG 0x038U +#define R_MBX_SCRATCH 0x03CU + +#define MBX_QUEUE_ALIGNMENT 0x4U + +/* MBX Interrupt Bits */ + +#define MBX_DONE_IRQ BIT(0) +#define MBX_ERROR_IRQ BIT(1) +#define MBX_IRQ_MASK (MBX_DONE_IRQ | MBX_ERROR_IRQ) + +/* MBX Command Values */ + +#define MBX_COMMAND_RUN 0x000U +#define MBX_COMMAND_PAUSE 0xC2FU +#define MBX_COMMAND_CONTINUE 0x5DBU +#define MBX_COMMAND_RESTART 0xB78U +#define MBX_COMMAND_ABORT 0x6F6U +#define MBX_COMMAND_FLUSH 0x3A5U + +/* MBX Status Values */ + +#define MBX_STATUS_IDLE 0x01U +#define MBX_STATUS_BUSY 0x10U +#define MBX_STATUS_HOLD 0x20U +#define MBX_STATUS_PAUSED 0x28U +#define MBX_STATUS_SUCCESS 0x40U +#define MBX_STATUS_ERROR 0x80U +#define MBX_STATUS_OFFLINE 0x88U /* ERROR | 0x08: offline/stopped */ + +#define MBX_MASK_DONE (MBX_STATUS_IDLE | MBX_STATUS_SUCCESS) +#define MBX_MASK_RUNNING (MBX_STATUS_BUSY | MBX_STATUS_HOLD) +#define 
MBX_MASK_STOPPED MBX_STATUS_OFFLINE + +/* MBX Status Field Extraction */ + +#define MBX_STATUS_CODE(v) ((v) & 0xFFU) +#define MBX_STATUS_CORE_ID(v) (((v) >> 8) & 0xFFU) +#define MBX_STATUS_ERROR_CODE(v) (((v) >> 16) & 0xFFU) +#define MBX_STATUS_CMD_INDEX(v) (((v) >> 24) & 0xFFU) + +/* SIC Register Offsets (relative to SIC base / instance 0 base) */ + +#define R_SIC_BOOT_STATUS 0x100U +#define SIC_BOOT_STATUS_MASK 0x77U +#define SIC_BOOT_STATUS_PASS 0x66U + +#define R_SIC_MBX_AVAILABILITY 0x104U +#define R_SIC_MBX_AVAILABILITY2 0x108U + +#define R_SIC_SW_BOOT_STATUS 0x12CU +#define SIC_SW_BOOT_STATUS_STARTED BIT(0) +#define SIC_SW_BOOT_STATUS_READY BIT(1) +#define SIC_SW_BOOT_STATUS_MISSION BIT(6) + +#define R_SIC_SW_ERROR_INFO 0x130U +#define R_SIC_SW_HEARTBEAT 0x154U + +#define R_SIC_GPINTERRUPT 0x160U + +#define R_SIC_HW_VERSION0 0x200U +#define R_SIC_SW_VERSION 0x218U +#define R_SIC_CORE_ENABLE 0x22CU + +/* Register Access Helpers */ + +static inline u32 cmh_reg_read32(void __iomem *base, u32 offset) +{ + return ioread32((u8 __iomem *)base + offset); +} + +static inline void cmh_reg_write32(u32 value, void __iomem *base, u32 offset) +{ + iowrite32(value, (u8 __iomem *)base + offset); +} + +/* + * 64-bit register access via two 32-bit reads/writes. Only correct for + * register pairs where split access is defined (e.g. QUEUE_LO/HI). + * Do not use for registers requiring atomic 64-bit access. + * + * No explicit barrier between the two halves is needed: ioread32/iowrite32 + * include implicit ordering guarantees on all supported architectures + * (MMIO accessors are strongly ordered with respect to each other). 
+ */ +static inline u64 cmh_reg_read64(void __iomem *base, u32 offset) +{ + u32 lo = ioread32((u8 __iomem *)base + offset); + u32 hi = ioread32((u8 __iomem *)base + offset + 4); + + return ((u64)hi << 32) | lo; +} + +static inline void cmh_reg_write64(u64 value, void __iomem *base, u32 offset) +{ + iowrite32((u32)value, (u8 __iomem *)base + offset); + iowrite32((u32)(value >> 32), (u8 __iomem *)base + offset + 4); +} + +/* Return the ioremap'd base for MBX instance N within the SIC region */ +static inline void __iomem *cmh_mbx_instance_base(void __iomem *sic_mapped, + u32 instance) +{ + return (u8 __iomem *)sic_mapped + + ((unsigned long)instance << CMH_MBX_INSTANCE_SHIFT); +} + +#endif /* CMH_REGISTERS_H */ diff --git a/drivers/crypto/cmh/include/cmh_rh.h b/drivers/crypto/cmh/include/cmh_rh.h new file mode 100644 index 000000000000..b182c203a475 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_rh.h @@ -0,0 +1,93 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Response Handler + * + * IRQ-driven completion processing. Uses request_threaded_irq(): + * - Hardirq: read+clear MBX interrupt registers, wake thread + * - Threaded handler: walk per-MBX transaction queues, + * fire completion callbacks, free transaction objects + * + * The Response Handler consumes transaction_obj entries enqueued + * by the Transaction Manager (cmh_txn.c) on each per-mailbox txq. + */ + +#ifndef CMH_RH_H +#define CMH_RH_H + +#include "cmh_config.h" + +/** + * cmh_rh_init() - Register IRQ handler and start response processing + * @cfg: Global device configuration + * + * Return: 0 on success, negative errno on failure. 
+ */ +int cmh_rh_init(struct cmh_config *cfg); + +/** + * cmh_rh_cleanup() - Free IRQ and stop response processing + * @cfg: Global device configuration + */ +void cmh_rh_cleanup(struct cmh_config *cfg); + +/** + * cmh_rh_suspend() - Quiesce RH for system suspend + * @cfg: Global device configuration + * + * Cancels the watchdog timer and masks MBX interrupts at the hardware + * level. IRQ handlers remain registered (standard PM pattern). + * The threaded IRQ handler stays active so that cmh_tm_quiesce() + * (called after this) can still drain in-flight transactions via + * IRQ-driven completions. + */ +void cmh_rh_suspend(struct cmh_config *cfg); + +/** + * cmh_rh_resume() - Restart RH after system resume + * @cfg: Global device configuration + * + * Re-synchronises per-MBX head tracking with hardware, clears stale + * interrupt bits, re-enables MBX interrupt masks, and re-arms the + * watchdog timer. Must be called before cmh_tm_resume(). + */ +void cmh_rh_resume(struct cmh_config *cfg); + +/* debugfs timeout accessor (debug builds only) */ +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +unsigned int *cmh_rh_timeout_watchdog_ptr(void); +#endif + +/** + * cmh_rh_force_drain_mbx() - FLUSH + drain all pending transactions on a MBX + * @mbx_idx: Mailbox index to drain + * + * Issues MBX_COMMAND_FLUSH, drains all pending transactions with + * -ECANCELED, and resets all recovery bookkeeping (including the + * wedged flag). Safe to call at any time; acquires rh_process_lock. + * Intended for debugfs last-resort recovery. + */ +void cmh_rh_force_drain_mbx(u32 mbx_idx); + +/** + * cmh_rh_mbx_is_wedged() - Check if a mailbox is permanently wedged + * @mbx_idx: Mailbox index to check + * + * Returns true if the mailbox has failed RESTART+FLUSH recovery and + * is offline. Used by the TM to avoid submitting new work to a dead + * mailbox. + * + * Return: true if wedged, false otherwise (including out-of-range idx). 
+ */ +bool cmh_rh_mbx_is_wedged(u32 mbx_idx); + +/** + * cmh_rh_abort_mbx() - Issue MBX_COMMAND_ABORT under rh_process_lock + * @mbx_idx: Mailbox index to abort + * + * Serialises the ABORT write with RESTART/FLUSH commands issued by the + * watchdog, preventing command-register clobber races. + */ +void cmh_rh_abort_mbx(u32 mbx_idx); + +#endif /* CMH_RH_H */ diff --git a/drivers/crypto/cmh/include/cmh_rng.h b/drivers/crypto/cmh/include/cmh_rng.h new file mode 100644 index 000000000000..1a886a0d82c1 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_rng.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- Hardware RNG (DRBG) Driver + * + * Registers a struct hwrng backed by the CMH DRBG core. + * Each .read() builds a VCQ with DRBG_CMD_GENERATE and submits it + * through the Transaction Manager for synchronous completion. + * + * The DRBG must be configured (CONFIG command) by the management host + * before the LKM is loaded -- the LKM only issues GENERATE requests. + * + * CRNG seeding control: + * - Module param "hwrng_quality" (0=disabled, 1-1024=enable) + * - Default: 0 (conservative -- no automatic kernel CRNG seeding) + */ + +#ifndef CMH_RNG_H +#define CMH_RNG_H + +struct platform_device; + +int cmh_rng_register(struct platform_device *pdev); +void cmh_rng_unregister(void); + +/* debugfs timeout accessor (debug builds only) */ +#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG +unsigned int *cmh_rng_timeout_drbg_ptr(void); +#endif + +#endif /* CMH_RNG_H */ diff --git a/drivers/crypto/cmh/include/cmh_sm3_abi.h b/drivers/crypto/cmh/include/cmh_sm3_abi.h new file mode 100644 index 000000000000..cbbe80fe18d6 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_sm3_abi.h @@ -0,0 +1,79 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- SM3 Hash Core ABI Definitions + * + * Kernel-side definitions for the CMH SM3 ABI. 
+ * All constants and layouts derived from the CMH eSW ABI. + */ + +#ifndef CMH_SM3_ABI_H +#define CMH_SM3_ABI_H + +#include + +/* SM3 Commands */ + +#define SM3_CMD_INIT 0x01U +#define SM3_CMD_UPDATE 0x02U +#define SM3_CMD_FINAL 0x03U +#define SM3_CMD_UPDATE2D 0x04U +#define SM3_CMD_GATHER 0x06U +#define SM3_CMD_SAVE 0x07U +#define SM3_CMD_RESTORE 0x08U + +/* SM3 Digest / Block Sizes */ + +#define CMH_SM3_DIGEST_SIZE 32U +#define CMH_SM3_BLOCK_SIZE 64U + +/* SM3 Context (for SAVE/RESTORE) */ + +#define SM3_CONTEXT_WORDS 29U +#define SM3_CONTEXT_SIZE (SM3_CONTEXT_WORDS * 4 + 4) /* ctx[29] + crc */ + +/* SM3 Command Structures */ + +struct sm3_cmd_update { + u64 input; /* DMA physical address of input data */ + u32 inlen; /* input data length in bytes */ +}; + +struct sm3_cmd_final { + u64 digest; /* DMA physical address for output digest */ + u32 outlen; /* digest length in bytes */ +}; + +struct sm3_cmd_update2d { + u64 input; /* DMA source address for input data */ + u64 output; /* DMA destination address for pass-through data */ + u32 iolen; /* input/pass-through data length in bytes */ +}; + +struct sm3_cmd_gather { + u64 lista; /* DMA address of dma_scattergather_item chain */ + u32 sgcmd; /* SM3 sub-command: SM3_CMD_UPDATE or SM3_CMD_UPDATE2D */ +}; + +struct sm3_cmd_save { + u64 output; /* DMA physical address for saved context */ + u32 outlen; /* must be SM3_CONTEXT_SIZE */ +}; + +struct sm3_cmd_restore { + u64 input; /* DMA physical address of saved context */ + u32 inlen; /* must be SM3_CONTEXT_SIZE */ +}; + +/* SM3 Command Union */ + +union sm3_cmd { + struct sm3_cmd_update cmd_update; + struct sm3_cmd_final cmd_final; + struct sm3_cmd_update2d cmd_update2d; + struct sm3_cmd_gather cmd_gather; + struct sm3_cmd_save cmd_save; + struct sm3_cmd_restore cmd_restore; +}; + +#endif /* CMH_SM3_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_sm4_abi.h b/drivers/crypto/cmh/include/cmh_sm4_abi.h new file mode 100644 index 000000000000..a34faea613dc --- 
/dev/null +++ b/drivers/crypto/cmh/include/cmh_sm4_abi.h @@ -0,0 +1,101 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- SM4 Core ABI Definitions + * + * Kernel-side definitions for the CMH SM4 ABI. + * All constants and layouts derived from the CMH eSW ABI. + */ + +#ifndef CMH_SM4_ABI_H +#define CMH_SM4_ABI_H + +#include + +/* SM4 Block Size */ + +#define CMH_SM4_BLOCK_SIZE 16U +#define CMH_SM4_IV_SIZE 16U +#define CMH_SM4_KEY_SIZE 16U /* SM4 always uses 128-bit keys */ + +/* SM4 Modes (per CMH SM4 ABI) */ + +#define SM4_MODE_ECB 1U +#define SM4_MODE_CBC 2U +#define SM4_MODE_CTR 3U +#define SM4_MODE_CFB 5U +#define SM4_MODE_GCM 6U +#define SM4_MODE_CMAC 7U +#define SM4_MODE_CCM 8U +#define SM4_MODE_XTS 9U +#define SM4_MODE_XCBC 10U + +/* SM4 Operations (per CMH SM4 ABI) */ + +#define SM4_OP_DECRYPT 1U +#define SM4_OP_ENCRYPT 2U + +/* SM4 Command IDs */ + +#define SM4_CMD_INIT 0x01U +#define SM4_CMD_AAD_UPDATE 0x02U +#define SM4_CMD_AAD_FINAL 0x03U +#define SM4_CMD_UPDATE 0x04U +#define SM4_CMD_FINAL 0x05U +#define SM4_CMD_SCATTERGATHER 0x06U +#define SM4_CMD_CCM_INIT 0x09U + +/* SM4 Command Structures */ + +struct sm4_cmd_init { + u64 key; /* datastore reference for the key */ + u64 iv; /* DMA address of the IV */ + u32 keylen; /* key length in bytes (16, or 32 for XTS) */ + u32 ivlen; /* IV length in bytes (0..16) */ + u32 mode; /* SM4 mode (SM4_MODE_*) */ + u32 op; /* SM4 operation (SM4_OP_*) */ + u32 aadlen; /* AAD length or 0 */ + u32 iolen; /* plaintext/ciphertext length */ +}; + +struct sm4_cmd_update { + u64 input; /* DMA address of input data */ + u64 output; /* DMA address of output data */ + u32 iolen; /* input/output data length */ +}; + +struct sm4_cmd_final { + u64 input; /* DMA address of last input data */ + u64 output; /* DMA address of last output data */ + u64 tag; /* DMA address of tag (AEAD only) */ + u32 iolen; /* last input/output data length */ + u32 taglen; /* tag length 
(AEAD only) */ +}; + +struct sm4_cmd_aad_final { + u64 data; /* DMA address of AAD data */ + u32 datalen; /* AAD data length */ +}; + +struct sm4_cmd_ccm_init { + u64 key; /* datastore reference for the key */ + u64 nonce; /* DMA address of the nonce */ + u32 keylen; /* key length in bytes (always 16) */ + u32 noncelen; /* nonce length (15 - L) */ + u32 op; /* SM4 operation (SM4_OP_*) */ + u32 aadlen; /* AAD length */ + u32 iolen; /* plaintext/ciphertext length */ + u32 taglen; /* tag length */ +}; + +/* SM4 Command Union */ + +union sm4_cmd { + struct sm4_cmd_init cmd_init; + struct sm4_cmd_update cmd_update; + struct sm4_cmd_final cmd_final; + struct sm4_cmd_aad_final cmd_aad_final; + struct sm4_cmd_ccm_init cmd_ccm_init; +}; + +#endif /* CMH_SM4_ABI_H */ diff --git a/drivers/crypto/cmh/include/cmh_sys_abi.h b/drivers/crypto/cmh/include/cmh_sys_abi.h new file mode 100644 index 000000000000..64110311e552 --- /dev/null +++ b/drivers/crypto/cmh/include/cmh_sys_abi.h @@ -0,0 +1,148 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2026 Cryptography Research, Inc. (CRI). + * CMH LKM -- SYS Core ABI Definitions + * + * Kernel-side definitions for the CMH SYS ABI. + * All constants and layouts derived from the CMH eSW ABI. 
+ */
+
+#ifndef CMH_SYS_ABI_H
+#define CMH_SYS_ABI_H
+
+#include
+#include
+
+/* SYS Commands (per CMH SYS ABI) */
+
+#define SYS_CMD_RUN	0x01U
+#define SYS_CMD_NOP	0x02U
+#define SYS_CMD_IMPORT	0x07U
+#define SYS_CMD_EXPORT	0x08U
+#define SYS_CMD_NEW	0x0AU
+#define SYS_CMD_READ	0x0BU
+#define SYS_CMD_WRITE	0x0CU
+#define SYS_CMD_GRANT	0x0DU
+#define SYS_CMD_LIST	0x0EU
+#define SYS_CMD_FIND	0x0FU
+#define SYS_CMD_DATA	0x11U
+
+/* SYS Reference Constants */
+
+#define SYS_REF_NONE	0x0000000000000000ULL
+#define SYS_REF_TEMP	0x1111111111111111ULL
+#define SYS_REF_LAST	0xFFFFFFFFFFFFFFFFULL
+
+typedef u64 sys_ref_t;
+
+/* SYS CID */
+
+#define SYS_CID_NONE	0x0000000000000000ULL
+
+/* SYS Type Encoding -- bits [7:0] = core_id, bits [23:16] = flags */
+
+#define SYS_TYPE_FLAG_PT	BIT(16)	/* can be read as plaintext */
+#define SYS_TYPE_FLAG_XC	BIT(17)	/* can be exported over XC bus */
+#define SYS_TYPE_FLAG_SCA	BIT(18)	/* SCA key in 2 shares */
+
+#define SYS_TYPE_SET(flags, core) \
+	(((flags) & 0xFF0000UL) | ((core) & 0xFFUL))
+#define SYS_TYPE_CORE(type)	((type) & 0xFFU)
+#define SYS_TYPE_FLAGS(type)	((type) & 0xFF0000U)
+#define SYS_TYPE_NONE	0U	/* DMA output, no DS storage */
+
+#define SYS_WRAP_HDR_SIZE	16	/* sys_read plaintext header */
+
+/* SYS Command Structures */
+
+struct sys_cmd_new {
+	u64 cid;	/* caller id (name) for the object */
+	u64 ref;	/* DMA address -- CMH eSW writes back reference here */
+	u32 len;	/* size of the new object in bytes */
+};
+
+struct sys_cmd_write {
+	u64 ref;	/* object datastore reference */
+	u64 src;	/* DMA source address of key data */
+	u64 key;	/* wrapping key reference (SYS_REF_NONE = plaintext) */
+	u32 len;	/* source buffer length */
+	u32 type;	/* SYS_TYPE_SET(flags, core_id) */
+};
+
+struct sys_cmd_read {
+	u64 ref;	/* object datastore reference */
+	u64 dst;	/* DMA destination for key data */
+	u64 key;	/* wrapping key reference (SYS_REF_NONE = plaintext) */
+	u32 len;	/* destination buffer length */
+};
+
+struct sys_cmd_data {
+	u64 ref;	/* object datastore reference */
+	u64 dst;	/* DMA destination for object data */
+	u32 len;	/* destination buffer length */
+};
+
+struct sys_cmd_find {
+	u64 cid;	/* caller id to search for */
+	u64 dst;	/* DMA destination for struct sys_list_item */
+	u32 len;	/* destination buffer length */
+};
+
+struct sys_cmd_list {
+	u64 ref;	/* starting DS reference (SYS_REF_NONE = first) */
+	u64 dst;	/* DMA destination for struct sys_list_item */
+	u32 len;	/* destination buffer length */
+};
+
+struct sys_cmd_grant {
+	u64 ref;	/* object datastore reference */
+	u64 read;	/* bitfield: allow read for mailboxes */
+	u64 write;	/* bitfield: allow write for mailboxes */
+	u64 execute;	/* bitfield: allow use for mailboxes */
+};
+
+struct sys_cmd_export {
+	u64 cid;	/* caller id for the response */
+	u64 dst;	/* DMA destination for the export blob */
+	u64 key;	/* wrapping key datastore reference */
+	u32 len;	/* destination buffer length */
+};
+
+struct sys_cmd_import {
+	u64 src;	/* DMA source address of import blob */
+	u64 key;	/* wrapping key datastore reference */
+	u32 len;	/* source buffer length */
+};
+
+/* SYS List/Find Response Item */
+
+struct sys_list_item {
+	u64 ref;	/* object datastore reference */
+	u64 cid;	/* caller id */
+	u32 len;	/* object length */
+	u32 type;	/* object type (SYS_TYPE_SET packed) */
+};
+
+/* Wrapped-read header (prepended to SYS_CMD_READ responses) */
+
+struct sys_wrap_hdr {
+	u64 cid;	/* caller id */
+	u32 wrap;	/* wrap data length following this header */
+	u32 len;	/* object data length following wrap data */
+};
+
+/* SYS Command Union */
+
+union sys_cmd {
+	struct sys_cmd_new cmd_new;
+	struct sys_cmd_write cmd_write;
+	struct sys_cmd_read cmd_read;
+	struct sys_cmd_data cmd_data;
+	struct sys_cmd_find cmd_find;
+	struct sys_cmd_list cmd_list;
+	struct sys_cmd_grant cmd_grant;
+	struct sys_cmd_export cmd_export;
+	struct sys_cmd_import cmd_import;
+};
+
+#endif /* CMH_SYS_ABI_H */
diff
--git a/drivers/crypto/cmh/include/cmh_sysfs.h b/drivers/crypto/cmh/include/cmh_sysfs.h
new file mode 100644
index 000000000000..864cf1c8fa00
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_sysfs.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- sysfs Device Attributes
+ */
+
+#ifndef CMH_SYSFS_H
+#define CMH_SYSFS_H
+
+struct attribute_group;
+
+extern const struct attribute_group *cmh_sysfs_groups[];
+
+#endif /* CMH_SYSFS_H */
diff --git a/drivers/crypto/cmh/include/cmh_txn.h b/drivers/crypto/cmh/include/cmh_txn.h
new file mode 100644
index 000000000000..6131f0b2224f
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_txn.h
@@ -0,0 +1,463 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc. (CRI).
+ * CMH LKM -- Transaction Manager
+ *
+ * Dedicated kthread managing concurrent VCQ submissions.
+ *
+ * Callers post command_msg objects into the Command Message Queue (CMQ).
+ * The TM thread dequeues them, selects a mailbox, builds VCQ(s) in the
+ * DMA queue slot, creates a transaction_obj, and rings the doorbell.
+ *
+ * The Response Handler (cmh_rh.c) walks per-mailbox transaction queues
+ * when an IRQ fires and fires completion callbacks.
+ */
+
+#ifndef CMH_TXN_H
+#define CMH_TXN_H
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "cmh_config.h"
+#include "cmh_vcq.h"
+
+/* Command Message (caller -> TM) */
+
+typedef void (*cmh_completion_fn)(void *data, int error);
+
+struct command_msg {
+	struct list_head list;	/* CMQ linked list node */
+	u32 command_id;	/* VCQ_CMD_ID(core, flags, span, cmd) */
+	void *vcq_data;	/* heap-owned copy of VCQ entries */
+	u32 vcq_count;	/* total vcq_cmd entries across all VCQs */
+	u32 num_vcqs;	/* how many VCQs in vcq_data (0 or 1 = single) */
+	s32 target_mbx;	/* MBX index from core affinity, or -1 fallback */
+	s32 actual_mbx;	/* MBX selected by TM thread, -1 until dispatched */
+	cmh_completion_fn complete;	/* completion callback (may be NULL) */
+	void *completion_data;
+	refcount_t refs;	/* submit_sync: 2 = waiter + TM */
+	bool backlog_ok;	/* accept into backlog when CMQ is full */
+	unsigned long timeout_jiffies;/* per-txn async timeout (0 = none) */
+};
+
+/* Transaction Object (TM -> RH) */
+
+/* Per-transaction FSM states for async timeout resolution */
+#define TXN_INFLIGHT	0
+#define TXN_COMPLETE	1
+#define TXN_TIMED_OUT	2
+
+struct transaction_obj {
+	struct list_head list;	/* per-mailbox txn queue node */
+	u32 first_vcq_id;
+	u32 last_vcq_id;
+	u32 mailbox_idx;	/* index into cfg->mailboxes[] */
+	u32 command_id;	/* VCQ_CMD_ID from first payload cmd */
+	int error_code;
+	cmh_completion_fn complete;
+	void *completion_data;
+	atomic_t state;	/* TXN_INFLIGHT / COMPLETE / TIMED_OUT */
+	struct timer_list timeout_timer;	/* per-request async timeout */
+	refcount_t refs;	/* owner + timer (if armed) */
+};
+
+/* Per-Mailbox Transaction Queue */
+
+struct cmh_mbx_txq {
+	struct list_head head;
+	spinlock_t lock;	/* protects head list + depth */
+	u32 depth;	/* number of in-flight transactions */
+	struct mutex dispatch_lock;	/* serialises VCQ dispatch + MBX flush */
+};
+
+/* Public Interface */
+
+/**
+ * cmh_tm_init() - Initialise the Transaction Manager
+ * @cfg: Global device configuration (mailbox layout, IRQ, etc.)
+ *
+ * Starts the TM kthread and initialises per-mailbox transaction queues.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int cmh_tm_init(struct cmh_config *cfg);
+
+/**
+ * cmh_tm_cleanup() - Stop the TM kthread and drain all queues
+ */
+void cmh_tm_cleanup(void);
+
+/**
+ * cmh_tm_quiesce() - Stop TM kthread and drain in-flight transactions
+ *
+ * Stops the TM kthread, rejects new posts, then waits (with a
+ * configurable timeout) for all per-MBX transaction queues to drain.
+ * If the timeout fires, remaining transactions are cancelled with
+ * -ECANCELED.
+ */
+void cmh_tm_quiesce(void);
+
+/**
+ * cmh_tm_resume() - Restart the TM kthread after resume
+ *
+ * Return: 0 on success, negative errno if the kthread fails to start.
+ */
+int cmh_tm_resume(void);
+
+/**
+ * cmh_tm_post_command() - Post a command to the TM for submission
+ * @msg: Command message with pre-built VCQ data and completion callback
+ *
+ * Round-robin selects the next MBX with enough free slots for
+ * msg->num_vcqs VCQs. All VCQs in a message are written to
+ * consecutive slots on the same MBX (back-to-back).
+ * The caller retains ownership of @msg until the completion callback fires.
+ *
+ * Return: 0 on success, -EAGAIN if queue full, -ENODEV if TM stopped.
+ */
+int cmh_tm_post_command(struct command_msg *msg);
+
+/*
+ * Synchronous submit -- post one or more VCQs and wait for completion.
+ *
+ * Combines post_command + refcounted wait + timeout + cancel into one
+ * call. This is the standard pattern for all synchronous crypto ops.
+ *
+ * Context: must be called from a sleepable (task) context.
+ * Performs GFP_KERNEL allocations and sleeps on
+ * wait_for_completion_timeout(). A WARN_ON_ONCE fires
+ * if called from atomic / IRQ / softirq context.
+ *
+ * vcq_cmds: pre-built VCQ array (headers + commands, contiguous)
+ * vcq_count: total number of vcq_cmd entries across all VCQs
+ * num_vcqs: number of VCQs in the array (0 or 1 = single VCQ)
+ *
+ * For multi-VCQ submissions, the array contains multiple VCQs laid
+ * out contiguously, each starting with its own header. All VCQs are
+ * written to consecutive MBX slots and share one transaction object.
+ *
+ * Returns 0 on success, -ETIMEDOUT, or CMH eSW error code.
+ */
+int cmh_tm_submit_sync(struct vcq_cmd *vcq_cmds, u32 vcq_count,
+		       u32 num_vcqs);
+
+/*
+ * Synchronous submit pinned to a specific mailbox.
+ * target_mbx: -1 = round-robin, >= 0 = pin to that MBX index.
+ */
+int cmh_tm_submit_sync_mbx(struct vcq_cmd *vcq_cmds, u32 vcq_count,
+			   u32 num_vcqs, s32 target_mbx);
+
+/*
+ * Synchronous submit with explicit timeout.
+ * timeout_hz: completion timeout in jiffies (use msecs_to_jiffies()).
+ */
+
+/*
+ * Extended timeout for slow crypto operations: RSA keygen, PQC
+ * keygen/sign/verify. Controlled by the slow_op_timeout_ms module
+ * parameter.
+ */
+unsigned long cmh_tm_slow_op_timeout_jiffies(void);
+
+int cmh_tm_submit_sync_tmo(struct vcq_cmd *vcq_cmds, u32 vcq_count,
+			   u32 num_vcqs, s32 target_mbx,
+			   unsigned long timeout_hz);
+
+/*
+ * Synchronous submit that never issues MBX_COMMAND_ABORT on timeout.
+ * Returns -EAGAIN if cancelled from queue, -EINPROGRESS if the VCQ is
+ * left in-flight. On -EINPROGRESS, @orphan_cb(@orphan_data) will be
+ * called when the VCQ eventually completes (RH callback fires and the
+ * last sync_ctx ref drops). Use this to defer DMA cleanup.
+ * Safe for background/kthread callers that must not disrupt other MBX work.
+ */
+int cmh_tm_submit_sync_noabort(struct vcq_cmd *vcq_cmds, u32 vcq_count,
+			       u32 num_vcqs, unsigned long timeout_hz,
+			       void (*orphan_cb)(void *),
+			       void *orphan_data);
+
+/*
+ * Asynchronous submit -- post VCQs and return immediately.
+ *
+ * On successful return (0), the provided @callback may be invoked from
+ * either the RH threaded IRQ context (normal completion path) or the TM
+ * kthread (if VCQ dispatch to the HW ring fails after the message was
+ * posted to the CMQ). The caller must not assume a specific callback
+ * context.
+ *
+ * After a successful post, the caller must NOT touch VCQ buffers --
+ * ownership transfers to the TM. If this function returns non-zero,
+ * the message was not posted, the callback will NOT fire, and the caller
+ * must perform cleanup.
+ *
+ * Uses GFP_ATOMIC internally -- the crypto API may invoke driver ops
+ * from softirq context (e.g. IPsec), so GFP_KERNEL would deadlock.
+ *
+ * If @backlog_ok is true and the CMQ is full, the message is placed on
+ * an overflow backlog queue and -EBUSY is returned. The caller must
+ * treat -EBUSY as "accepted" (like -EINPROGRESS): the callback WILL
+ * fire once the request is promoted from backlog and completes. When
+ * @backlog_ok is false, CMQ-full returns -EAGAIN (caller must clean up).
+ *
+ * Returns: 0 on successful post, -EBUSY (backlogged -- callback will
+ * fire), -ENOMEM, -EINVAL (bad vcq_count), -EAGAIN (CMQ full,
+ * no backlog), -ENODEV.
+ */
+int cmh_tm_submit_async(struct vcq_cmd *vcq_cmds, u32 vcq_count,
+			u32 num_vcqs, s32 target_mbx,
+			cmh_completion_fn callback, void *callback_data,
+			bool backlog_ok, unsigned long timeout_jiffies);
+
+/**
+ * cmh_tm_async_timeout_jiffies() - Default per-request async timeout
+ *
+ * Returns the debugfs-configurable timeout for symmetric data-path
+ * ops (async_timeout_ms converted to jiffies). Akcipher/kpp callers
+ * should pass 0 instead (no per-request timeout; vcq_timeout_ms is the
+ * safety net).
+ */
+unsigned long cmh_tm_async_timeout_jiffies(void);
+
+/**
+ * cmh_tm_flush_mbx() - Issue MBX_COMMAND_FLUSH and wait for completion
+ * @mbx_idx: Mailbox index
+ *
+ * Resets the eSW child mailbox state including the temp stack.
+ * Must be called when no VCQ submission is in progress on @mbx_idx.
+ *
+ * Return: 0 on success, -ETIMEDOUT if eSW does not clear the command,
+ * -EBUSY if a command is already pending.
+ */
+int cmh_tm_flush_mbx(s32 mbx_idx);
+
+/**
+ * cmh_tm_try_cancel_command() - Try to cancel a queued command
+ * @msg: Command message to cancel
+ *
+ * Return: true if removed from CMQ, false if already consumed by the TM thread.
+ */
+bool cmh_tm_try_cancel_command(struct command_msg *msg);
+
+/**
+ * cmh_tm_peek_transaction() - Peek at the oldest transaction on a mailbox
+ * @mbx_idx: Mailbox index
+ *
+ * For use by the Response Handler. Caller must hold txq->lock or call
+ * from a context where no concurrent pop is possible (e.g. threaded IRQ).
+ *
+ * Return: Pointer to the oldest transaction_obj, or NULL if empty.
+ */
+struct transaction_obj *cmh_tm_peek_transaction(u32 mbx_idx);
+
+/**
+ * cmh_tm_pop_transaction() - Remove and return the oldest transaction
+ * @mbx_idx: Mailbox index
+ *
+ * Return: Pointer to the removed transaction_obj, or NULL if empty.
+ */
+struct transaction_obj *cmh_tm_pop_transaction(u32 mbx_idx);
+
+/**
+ * cmh_txn_finish() - Complete a transaction with FSM + timer handling
+ * @txn: Transaction popped from the TXQ
+ * @error: Error code (0 for success, negative errno)
+ *
+ * Resolves the timer-vs-completion race via atomic cmpxchg, cancels
+ * the per-txn timeout timer if still pending, fires the completion
+ * callback (if this path wins the race), and drops the owner reference.
+ * The transaction is freed when the last reference is dropped.
+ *
+ * Called by the Response Handler after popping a completed transaction.
+ */
+void cmh_txn_finish(struct transaction_obj *txn, int error);
+
+/**
+ * cmh_tm_max_cmds_per_vcq() - Max vcq_cmd entries per MBX slot
+ *
+ * Returns the minimum across all configured MBXes so callers can pack
+ * VCQs without knowing which MBX will be selected.
+ *
+ * Return: At least MIN_VCQ_CMDS (2).
+ */
+u32 cmh_tm_max_cmds_per_vcq(void);
+
+/**
+ * cmh_tm_mbx_count() - Return the number of configured mailboxes
+ *
+ * Return: cfg->mbx_count.
+ */
+u32 cmh_tm_mbx_count(void);
+
+/**
+ * cmh_core_default_id() - Return the default core_id for a core type
+ * @type: Logical core type enum
+ *
+ * Returns the core_id of the first (index-0) instance without advancing
+ * the round-robin counter. Intended for callers pinned to a fixed MBX
+ * (e.g. mgmt ioctls on MGMT_MBX) that only need the VCQ core_id field.
+ *
+ * In multi-instance configurations the returned core_id is always that
+ * of instance[0], regardless of which MBX instance[0] is assigned to.
+ * Mgmt callers submit on MGMT_MBX (0) -- the eSW accepts any valid
+ * core_id on any MBX for command dispatch.
+ *
+ * Return: u32 core_id.
+ */
+u32 cmh_core_default_id(enum cmh_core_type type);
+
+/**
+ * cmh_core_select_instance() - Multi-instance core dispatch selection
+ * @type: Logical core type enum
+ *
+ * Returns the next (core_id, mbx_idx) pair for @type using round-robin
+ * across configured instances. On first use for an instance whose MBX
+ * is not pre-assigned, atomically assigns the next available MBX.
+ *
+ * With single-instance defaults, this degenerates to the same behaviour
+ * as the old single-entry core_to_mbx[] table -- one core type, one MBX.
+ *
+ * Return: struct core_dispatch with core_id and mbx_idx.
+ */
+struct core_dispatch cmh_core_select_instance(enum cmh_core_type type);
+
+/**
+ * cmh_core_num_instances() - Return count of configured instances
+ * @type: Logical core type enum
+ *
+ * Return: Number of instances (>= 1) for @type.
+ */
+u32 cmh_core_num_instances(enum cmh_core_type type);
+
+/**
+ * cmh_core_get_instance() - Get a specific instance by index
+ * @type: Logical core type enum
+ * @idx: Instance index (0-based, must be < cmh_core_num_instances())
+ *
+ * Returns (core_id, mbx_idx) for the given instance without advancing
+ * the round-robin counter.
Triggers auto-assign if the instance has
+ * no MBX yet.
+ *
+ * Return: struct core_dispatch with core_id and mbx_idx.
+ */
+struct core_dispatch cmh_core_get_instance(enum cmh_core_type type, u32 idx);
+
+/**
+ * cmh_tm_affinity_reset() - Reset all core-to-MBX assignments
+ *
+ * Called during init and cleanup.
+ */
+void cmh_tm_affinity_reset(void);
+
+/**
+ * cmh_tm_txq_completion_notify() - Wake TM thread after TXQ completion
+ *
+ * Called by the Response Handler after completing a transaction to
+ * unblock the TM thread if it is waiting for a free MBX slot.
+ */
+void cmh_tm_txq_completion_notify(void);
+
+/*
+ * Pack @count payload commands (no headers) into one or more VCQs
+ * respecting the per-slot size limit, then submit synchronously.
+ *
+ * @payload: flat array of vcq_cmd entries (no headers)
+ * @count: number of entries in @payload
+ * @packed: caller-provided scratch buffer for the packed output
+ * @max_packed: size of @packed in vcq_cmd entries
+ * @target_mbx: -1 = round-robin, >= 0 = pin to this MBX index
+ *
+ * Each VCQ gets its own header. All VCQs are submitted as a single
+ * back-to-back transaction on the same MBX.
+ */
+int cmh_vcq_pack_and_submit(const struct vcq_cmd *payload, u32 count,
+			    struct vcq_cmd *packed, u32 max_packed,
+			    s32 target_mbx);
+
+/**
+ * cmh_vcq_pack_and_submit_async() - Pack payload commands and submit async
+ * @payload: Flat array of VCQ command entries (no headers)
+ * @count: Number of entries in @payload
+ * @packed: Caller-provided scratch buffer for packed output
+ * @max_packed: Size of @packed in vcq_cmd entries
+ * @target_mbx: Mailbox index (-1 for round-robin)
+ * @callback: Completion callback
+ * @callback_data: Opaque data passed to @callback
+ * @backlog_ok: If true, accept into backlog when CMQ is full
+ * @timeout_jiffies: Per-request timeout (0 to disable)
+ *
+ * Async variant of cmh_vcq_pack_and_submit().
Returns 0 on successful
+ * post; after a successful post, @callback may run from RH threaded IRQ
+ * context on normal completion, from the TM kthread if VCQ dispatch
+ * fails after posting, or from TM teardown paths such as
+ * cmh_tm_cleanup() / cmh_tm_quiesce() when queued or in-flight work is
+ * cancelled. Callers must not assume a single callback context. On
+ * non-zero return, the callback will NOT fire.
+ *
+ * @payload: flat array of vcq_cmd entries (no headers)
+ * @count: number of entries in @payload
+ * @packed: caller-provided scratch buffer for the packed output
+ * @max_packed: size of @packed in vcq_cmd entries
+ * @target_mbx: -1 = round-robin, >= 0 = pin to this MBX index
+ * @callback: completion callback (may run from IRQ or TM context)
+ * @callback_data: opaque pointer passed to @callback
+ * @backlog_ok: if true, queue the request when all MBXs are busy
+ * @timeout_jiffies: maximum wait time for MBX slot (0 = no wait)
+ *
+ * Return: 0 on successful post, -EBUSY (backlogged), negative errno on failure.
+ */
+int cmh_vcq_pack_and_submit_async(const struct vcq_cmd *payload, u32 count,
+				  struct vcq_cmd *packed, u32 max_packed,
+				  s32 target_mbx,
+				  cmh_completion_fn callback,
+				  void *callback_data,
+				  bool backlog_ok,
+				  unsigned long timeout_jiffies);
+
+/* debugfs timeout accessors (debug builds only) */
+#ifdef CONFIG_CRYPTO_DEV_CMH_DEBUG
+unsigned int *cmh_tm_timeout_async_ptr(void);
+unsigned int *cmh_tm_timeout_vcq_ptr(void);
+unsigned int *cmh_tm_timeout_slow_op_ptr(void);
+unsigned int *cmh_tm_timeout_drain_ptr(void);
+#endif
+
+/* -- Crypto request completion helper ---------------------------------- */
+
+struct device *cmh_dev(void);
+
+/**
+ * cmh_complete() - Complete a crypto request with optional error logging
+ * @req: The async crypto request to complete
+ * @err: Error code (0 = success, -EINPROGRESS = backlog promotion signal)
+ *
+ * Logs a rate-limited diagnostic on genuine errors, then hands the
+ * request back to the crypto framework. -EINPROGRESS is excluded from
+ * logging -- it is the crypto API's backlog promotion notification, not
+ * an error. Centralizes error reporting so individual algorithm drivers
+ * do not need per-callback logging.
+ */
+static inline void cmh_complete(struct crypto_async_request *req, int err)
+{
+	if (err && err != -EINPROGRESS) {
+		/*
+		 * For template instances (e.g. hmac(sha3-512-cmh)) the
+		 * driver name will be the outer template's, not ours.
+		 * Still useful for triage -- identifies the failing tfm.
+		 */
+		dev_dbg_ratelimited(cmh_dev(), "op error: alg=%s err=%d\n",
+				    crypto_tfm_alg_driver_name(req->tfm),
+				    err);
+	}
+	crypto_request_complete(req, err);
+}
+
+#endif /* CMH_TXN_H */
diff --git a/drivers/crypto/cmh/include/cmh_vcq.h b/drivers/crypto/cmh/include/cmh_vcq.h
new file mode 100644
index 000000000000..a9d04635d819
--- /dev/null
+++ b/drivers/crypto/cmh/include/cmh_vcq.h
@@ -0,0 +1,283 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Cryptography Research, Inc.
(CRI).
+ * CMH LKM -- VCQ (Virtual Command Queue) Definitions
+ *
+ * Kernel-side definitions for the CMH VCQ and DMA scatter-gather ABI,
+ * so the LKM can build VCQs without depending on CMH eSW headers.
+ *
+ * All constants and layouts are derived from the CMH eSW ABI.
+ *
+ * Per-core command definitions live in their own ABI headers (cmh_hc_abi.h,
+ * cmh_aes_abi.h, etc.) and are included here to form the hwc_cmd union.
+ */
+
+#ifndef CMH_VCQ_H
+#define CMH_VCQ_H
+
+#include
+#include
+#include
+#include
+
+#include "cmh_hc_abi.h"
+#include "cmh_sm3_abi.h"
+#include "cmh_drbg_abi.h"
+#include "cmh_sys_abi.h"
+#include "cmh_kic_abi.h"
+#include "cmh_aes_abi.h"
+#include "cmh_sm4_abi.h"
+#include "cmh_ccp_abi.h"
+#include "cmh_pke_abi.h"
+#include "cmh_qse_abi.h"
+#include "cmh_hcq_abi.h"
+#include "cmh_eac_abi.h"
+
+/* VCQ Magic Numbers */
+
+#define VCQ_HDR_MAGIC	0x01514356U	/* 'V' 'C' 'Q' 0x01 */
+#define VCQ_CMD_MAGIC	0x01444D43U	/* 'C' 'M' 'D' 0x01 */
+
+/* VCQ Command ID Encoding */
+
+#define VCQ_CMD_MASK	0x000000FFU
+#define VCQ_SPAN_MASK	0x0000FF00U
+#define VCQ_FLAG_MASK	0x00FF0000U
+#define VCQ_CORE_MASK	0xFF000000U
+
+#define VCQ_CMD_ID(core, flags, span, cmd) \
+	(((u32)(core) << 24) | ((flags) & VCQ_FLAG_MASK) | \
+	 (((u32)(span) << 8) & VCQ_SPAN_MASK) | ((cmd) & VCQ_CMD_MASK))
+
+/* Core IDs (per CMH hardware specification) */
+
+#define CORE_ID_SYS	0x00U
+#define CORE_ID_DMA	0x01U
+#define CORE_ID_HC	0x02U
+#define CORE_ID_AES	0x03U
+#define CORE_ID_SM4	0x04U
+#define CORE_ID_SM3	0x05U
+#define CORE_ID_XC	0x07U
+#define CORE_ID_HCQ	0x08U
+#define CORE_ID_QSE	0x09U
+#define CORE_ID_PKE	0x0AU
+#define CORE_ID_TIC	0x0BU
+#define CORE_ID_KIC	0x0CU
+#define CORE_ID_MPU	0x0EU
+#define CORE_ID_DRBG	0x0FU
+#define CORE_ID_EMC	0x11U
+#define CORE_ID_CCP	0x18U
+#define CORE_ID_EAC	0x1EU
+#define CORE_ID_NUM	0x1FU	/* eSW g_drvs[] array size sentinel */
+#define CORE_ID_MAX	0xFFU	/* VCQ encoding limit (8-bit field) */
+
+/**
+ * enum cmh_core_type -
Logical core type for multi-instance dispatch
+ * @CMH_CORE_HC: Hash / HMAC / CSHAKE / KMAC (CORE_ID_HC)
+ * @CMH_CORE_AES: AES (CORE_ID_AES)
+ * @CMH_CORE_SM4: SM4 (CORE_ID_SM4)
+ * @CMH_CORE_SM3: SM3 (CORE_ID_SM3)
+ * @CMH_CORE_CCP: ChaCha20 / Poly1305 (CORE_ID_CCP)
+ * @CMH_CORE_PKE: RSA / ECDSA / ECDH / EdDSA / SM2 (CORE_ID_PKE)
+ * @CMH_CORE_QSE: ML-KEM / ML-DSA (CORE_ID_QSE)
+ * @CMH_CORE_HCQ: SLH-DSA / LMS / XMSS (CORE_ID_HCQ)
+ * @CMH_NUM_CORE_TYPES: Number of core types (array sizing sentinel)
+ *
+ * Algorithm drivers use this enum (not raw CORE_ID_* constants) for
+ * MBX selection and VCQ dispatch. Each value indexes into a config
+ * table that maps to one or more (core_id, mbx) pairs.
+ *
+ * Raw CORE_ID_* defines remain for:
+ * - SYS_TYPE_SET() key-type tags in datastore operations
+ * - DT child node ``reg`` values (hardware core identity for config lookup)
+ * - Singleton system cores (SYS, KIC, DRBG, EAC) not in this enum
+ */
+enum cmh_core_type {
+	CMH_CORE_HC = 0,
+	CMH_CORE_AES,
+	CMH_CORE_SM4,
+	CMH_CORE_SM3,
+	CMH_CORE_CCP,
+	CMH_CORE_PKE,
+	CMH_CORE_QSE,
+	CMH_CORE_HCQ,
+	CMH_NUM_CORE_TYPES
+};
+
+/**
+ * struct core_dispatch - VCQ dispatch target returned by core selection
+ * @core_id: Hardware core ID to encode in VCQ_CMD_ID()
+ * @mbx_idx: Mailbox index to submit the VCQ to
+ */
+struct core_dispatch {
+	u32 core_id;
+	s32 mbx_idx;
+};
+
+/* Common VCQ Command (per CMH VCQ ABI) */
+
+#define VCQ_CMD_FLUSH	0xFFU
+
+/**
+ * struct vcq_hdr - VCQ header occupying the first slot of every VCQ
+ * @cmds: Total number of commands including the header itself
+ * @rsvd: Reserved -- used internally by CMH eSW firmware
+ */
+struct vcq_hdr {
+	u32 cmds;
+	u32 rsvd[13];
+};
+
+/* DMA Scatter-Gather Item (per CMH DMAC hardware specification) */
+
+/**
+ * struct dma_scattergather_item - DMA scatter-gather descriptor node
+ * @lli: Next descriptor address (0 = end of list)
+ * @src: Source address for input particle
+ * @dst: Destination
address for output particle
+ * @len: Particle length (low 32 bits used by hardware)
+ *
+ * Linked-list node walked by the DMAC hardware. @lli chains to the
+ * next item or is zero for end-of-list.
+ */
+struct dma_scattergather_item {
+	u64 lli;
+	u64 src;
+	u64 dst;
+	u64 len;
+};
+
+/* Unified HWC Command Union */
+/*
+ * Each per-core ABI header defines a union _cmd.
+ * Add new cores here as they are implemented.
+ */
+
+union hwc_cmd {
+	struct vcq_hdr hdr;
+	union hc_cmd hc;
+	union sm3_cmd sm3;
+	union drbg_cmd drbg;
+	union sys_cmd sys;
+	union kic_cmd kic;
+	union aes_cmd aes;
+	union sm4_cmd sm4;
+	union ccp_cmd ccp;
+	union pke_cmd pke;
+	union qse_cmd qse;
+	union hcq_cmd hcq;
+	union eac_cmd eac;
+};
+
+/**
+ * struct vcq_cmd - Single VCQ command entry (always 64 bytes)
+ * @magic: VCQ_HDR_MAGIC for the header slot, VCQ_CMD_MAGIC for commands
+ * @id: Encoded command ID built via VCQ_CMD_ID(core, flags, span, cmd)
+ * @hwc: Per-core command payload union
+ */
+struct vcq_cmd {
+	u32 magic;
+	u32 id;
+	union hwc_cmd hwc;
+};
+
+static_assert(sizeof(struct vcq_cmd) == 64,
+	      "struct vcq_cmd must be exactly 64 bytes (one VCQ slot)");
+
+/**
+ * vcq_set_header() - Write the standard VCQ header at slot[0]
+ * @slot: Pointer to the first VCQ slot
+ * @total_cmds: Total number of commands including the header
+ */
+static inline void vcq_set_header(struct vcq_cmd *slot, u32 total_cmds)
+{
+	memset(slot, 0, sizeof(*slot));
+	slot->magic = VCQ_HDR_MAGIC;
+	slot->id = VCQ_CMD_ID(CORE_ID_SYS, 0, 1, SYS_CMD_RUN);
+	slot->hwc.hdr.cmds = total_cmds;
+}
+
+/* VCQ Command Limits */
+
+#define MIN_VCQ_CMDS	2U	/* header + at least one command */
+#define MAX_VCQ_CMDS	15U	/* including the header */
+#define MAX_VCQ_SIZE	(MAX_VCQ_CMDS * sizeof(struct vcq_cmd))
+
+/**
+ * vcq_add_inline_data() - Pack inline data into consecutive VCQ slots
+ * @slot: Pointer to the command slot preceding the inline data
+ * @data: Source data to copy into subsequent slots
+ *
@data_len: Length of @data in bytes
+ *
+ * Appends data starting at slot+1 and updates the span field in
+ * slot->id. The caller must ensure enough slots are reserved.
+ *
+ * Return: Total number of slots consumed (1 + inline slots).
+ */
+static inline u32 vcq_add_inline_data(struct vcq_cmd *slot,
+				      const void *data, u32 data_len)
+{
+	u32 inline_slots, total_span;
+
+	if (!data_len)
+		return 1;
+
+	inline_slots = (data_len + sizeof(struct vcq_cmd) - 1) /
+		       sizeof(struct vcq_cmd);
+	total_span = 1 + inline_slots;
+
+	/* Zero the inline slots, then copy data */
+	memset(slot + 1, 0, inline_slots * sizeof(struct vcq_cmd));
+	memcpy(slot + 1, data, data_len);
+
+	/* Update span in the command's id field */
+	slot->id = (slot->id & ~VCQ_SPAN_MASK) |
+		   (((u32)total_span << 8) & VCQ_SPAN_MASK);
+
+	return total_span;
+}
+
+/**
+ * vcq_add_flush() - Build a generic VCQ_CMD_FLUSH command
+ * @slot: Pointer to the VCQ slot to populate
+ * @core_id: Hardware core ID for the flush command
+ */
+static inline void vcq_add_flush(struct vcq_cmd *slot, u32 core_id)
+{
+	memset(slot, 0, sizeof(*slot));
+	slot->magic = VCQ_CMD_MAGIC;
+	slot->id = VCQ_CMD_ID(core_id, 0, 1, VCQ_CMD_FLUSH);
+}
+
+/* Shared HC VCQ Builders -- used by hash, hmac, cshake, kmac drivers */
+
+static inline void vcq_add_hc_init(struct vcq_cmd *slot, u32 core_id,
+				   u32 algo)
+{
+	memset(slot, 0, sizeof(*slot));
+	slot->magic = VCQ_CMD_MAGIC;
+	slot->id = VCQ_CMD_ID(core_id, 0, 1, HC_CMD_INIT);
+	slot->hwc.hc.cmd_init.algo = algo;
+}
+
+static inline void vcq_add_hc_final(struct vcq_cmd *slot, u32 core_id,
+				    u64 digest_phys, u32 outlen)
+{
+	memset(slot, 0, sizeof(*slot));
+	slot->magic = VCQ_CMD_MAGIC;
+	slot->id = VCQ_CMD_ID(core_id, 0, 1, HC_CMD_FINAL);
+	slot->hwc.hc.cmd_final.digest = digest_phys;
+	slot->hwc.hc.cmd_final.outlen = outlen;
+}
+
+static inline void vcq_add_hc_gather(struct vcq_cmd *slot, u32 core_id,
+				     u64 lista_phys, u32 sgcmd)
+{
+	memset(slot, 0,
sizeof(*slot));
+	slot->magic = VCQ_CMD_MAGIC;
+	slot->id = VCQ_CMD_ID(core_id, 0, 1, HC_CMD_GATHER);
+	slot->hwc.hc.cmd_gather.lista = lista_phys;
+	slot->hwc.hc.cmd_gather.sgcmd = sgcmd;
+}
+
+#endif /* CMH_VCQ_H */
--
2.43.7
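
The command ID packing and the span update done by vcq_add_inline_data()
in cmh_vcq.h can be tried outside the kernel. The following is only a
user-space sketch under stated assumptions, not driver code: the masks,
VCQ_CMD_MAGIC, CORE_ID_SM4 and SM4_CMD_INIT are copied from the patch,
while struct sketch_slot, sketch_span(), sketch_add_inline_data() and the
SKETCH_MAIN guard are hypothetical names introduced here. uint32_t stands
in for the kernel u32, and an opaque 56-byte payload stands in for
union hwc_cmd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Values copied from cmh_vcq.h and cmh_sm4_abi.h in the patch. */
#define VCQ_CMD_MAGIC	0x01444D43U
#define VCQ_CMD_MASK	0x000000FFU
#define VCQ_SPAN_MASK	0x0000FF00U
#define VCQ_FLAG_MASK	0x00FF0000U
#define CORE_ID_SM4	0x04U
#define SM4_CMD_INIT	0x01U

/* Same layout as the patch: core[31:24] flags[23:16] span[15:8] cmd[7:0]. */
#define VCQ_CMD_ID(core, flags, span, cmd) \
	(((uint32_t)(core) << 24) | ((flags) & VCQ_FLAG_MASK) | \
	 (((uint32_t)(span) << 8) & VCQ_SPAN_MASK) | ((cmd) & VCQ_CMD_MASK))

/* Hypothetical stand-in for struct vcq_cmd: one 64-byte slot. */
struct sketch_slot {
	uint32_t magic;
	uint32_t id;
	uint8_t payload[56];	/* opaque placeholder for union hwc_cmd */
};

_Static_assert(sizeof(struct sketch_slot) == 64, "slot must be 64 bytes");

/* Extract the span field (bits [15:8]) from an encoded command ID. */
uint32_t sketch_span(uint32_t id)
{
	return (id & VCQ_SPAN_MASK) >> 8;
}

/*
 * Mirrors the arithmetic of vcq_add_inline_data(): round data_len up to
 * whole slots, zero them, copy the data in after the command slot, and
 * rewrite the span field to cover the command plus its inline slots.
 * The caller must have reserved enough slots after @slot.
 */
uint32_t sketch_add_inline_data(struct sketch_slot *slot,
				const void *data, uint32_t data_len)
{
	uint32_t inline_slots, total_span;

	if (!data_len)
		return 1;

	inline_slots = (uint32_t)((data_len + sizeof(*slot) - 1) /
				  sizeof(*slot));
	total_span = 1 + inline_slots;

	memset(slot + 1, 0, inline_slots * sizeof(*slot));
	memcpy(slot + 1, data, data_len);

	slot->id = (slot->id & ~VCQ_SPAN_MASK) |
		   ((total_span << 8) & VCQ_SPAN_MASK);

	return total_span;
}

#ifdef SKETCH_MAIN
int main(void)
{
	struct sketch_slot q[3];	/* command slot + two inline slots */
	uint8_t data[100];		/* 100 bytes need two 64-byte slots */
	uint32_t used;

	memset(q, 0, sizeof(q));
	memset(data, 0xAB, sizeof(data));

	q[0].magic = VCQ_CMD_MAGIC;
	q[0].id = VCQ_CMD_ID(CORE_ID_SM4, 0, 1, SM4_CMD_INIT);

	used = sketch_add_inline_data(&q[0], data, sizeof(data));
	assert(used == 3);
	assert(sketch_span(q[0].id) == 3);

	printf("slots used: %u, id: 0x%08x\n", used, q[0].id);
	return 0;
}
#endif
```

Compiling with -DSKETCH_MAIN builds a small self-checking program. The
point of the example is that the span field counts the command slot
itself, so a caller advancing through a VCQ would step by the returned
value rather than by one.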