Date: Tue, 11 Jun 2024 13:18:58 -0700
From: Eric Biggers
To: Herbert Xu
Cc: Ard Biesheuvel, Steffen Klassert, netdev@vger.kernel.org, linux-crypto@vger.kernel.org, fsverity@lists.linux.dev, dm-devel@lists.linux.dev, x86@kernel.org, linux-arm-kernel@lists.infradead.org, Sami Tolvanen, Bart Van Assche, Tim Chen
Subject: Re: [PATCH v4 6/8] fsverity: improve performance by using multibuffer hashing
Message-ID: <20240611201858.GA128642@sol.localdomain>
References: <20240606052801.GA324380@sol.localdomain> <20240610164258.GA3269@sol.localdomain>

On Tue, Jun 11, 2024 at 11:21:43PM +0800, Herbert Xu wrote:
>
> BTW, I found an old Intel paper that claims through their multi-buffer
> strategy they were able to make AES-CBC-XCBC beat AES-GCM.
> I wonder if we could still replicate this today:
>
> https://github.com/intel/intel-ipsec-mb/wiki/doc/fast-multi-buffer-ipsec-implementations-ia-processors-paper.pdf

No, not even close. Even assuming that the lack of parallelizability in AES-CBC and AES-XCBC can be entirely compensated for via multibuffer crypto (which really it can't -- consider single packets, for example), doing AES twice is much more expensive than doing AES and GHASH. GHASH is a universal hash function, and computing a universal hash function is inherently cheaper than computing a cryptographic hash function. But also modern Intel CPUs have very fast carryless multiplication, and it uses a different execution port from what AES uses. So the overhead of AES + GHASH over AES alone is very small. By doing AES twice, you'd be entirely bottlenecked by the ports that can execute the AES instructions, while the other ports go nearly unused. So it would probably be approaching twice as slow as AES-GCM.

Westmere (2010) through Ivy Bridge (2012) are the only Intel CPUs where multibuffer AES-CBC-XCBC could plausibly be faster than AES-GCM (given a sufficiently large number of messages at once), due to the very slow pclmulqdq instruction on those CPUs. This is long since fixed, as pclmulqdq became much faster in Haswell (2013), and faster still in Broadwell. This is exactly what that Intel paper shows; they show AES-GCM becoming fastest in "Gen 4", i.e. Haswell. The paper is from 2012, so of course they don't show anything after that. But AES-GCM has only pulled ahead even more since then.

In theory something like AES-CBC + SHA-256 could be slightly more competitive than AES-CBC + AES-XCBC. But it would still be worse than simply doing AES-GCM -- which again, doesn't need multibuffer, and my recent patches have already fully optimized for recent x86_64 CPUs.

- Eric
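
A rough sketch of the port-bottleneck argument above, as a toy model only. It assumes AES and carryless-multiply work run on separate execution ports and overlap perfectly, so the busier port sets the pace. It also grants AES-CBC + AES-XCBC ideal multibuffer parallelism, which the message notes is not achievable in practice. The function names and all cost values are hypothetical units chosen for illustration, not measurements of any CPU.

```python
def time_per_block(aes_port_work, clmul_port_work):
    """Toy model: work on different ports overlaps, so the busiest port dominates."""
    return max(aes_port_work, clmul_port_work)


def compare(aes_cost, ghash_cost):
    """Return (AES-GCM time, AES-CBC + AES-XCBC time) per block in the toy model."""
    # AES-GCM: one AES pass on the AES port, GHASH on the carryless-multiply port.
    gcm = time_per_block(aes_cost, ghash_cost)
    # AES-CBC + AES-XCBC: two AES passes, both on the AES port; the other port idles.
    cbc_xcbc = time_per_block(2 * aes_cost, 0)
    return gcm, cbc_xcbc


# Hypothetical costs: carryless multiply cheap relative to AES.
fast_clmul = compare(aes_cost=1.0, ghash_cost=0.5)

# Hypothetical costs: carryless multiply very slow relative to AES.
slow_clmul = compare(aes_cost=1.0, ghash_cost=3.0)

print("fast clmul (GCM, CBC+XCBC):", fast_clmul)
print("slow clmul (GCM, CBC+XCBC):", slow_clmul)
```

With the cheap-multiply inputs the model puts AES-CBC + AES-XCBC at twice the cost of AES-GCM. Only when the multiply cost exceeds two AES passes does the ordering flip, which is the shape of the Westmere through Ivy Bridge exception described above.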