Date: Tue, 18 Nov 2025 18:28:26 +0100
From: "Jason A. Donenfeld"
To: Ryan Roberts
Cc: Arnd Bergmann, Kees Cook, Ard Biesheuvel, Jeremy Linton, Will Deacon,
	Catalin Marinas, Mark Rutland, linux-arm-kernel@lists.infradead.org,
	Linux Kernel Mailing List, ebiggers@kernel.org
Subject: Re: [DISCUSSION] kstack offset randomization: bugs and performance
References: <66c4e2a0-c7fb-46c2-acce-8a040a71cd8e@arm.com>

On Tue, Nov 18, 2025 at 05:21:17PM +0000, Ryan Roberts wrote:
> On 18/11/2025 17:15, Jason A. Donenfeld wrote:
> > On Mon, Nov 17, 2025 at 05:47:05PM +0100, Arnd Bergmann wrote:
> >> On Mon, Nov 17, 2025, at 12:31, Ryan Roberts wrote:
> >>> On 17/11/2025 11:30, Ryan Roberts wrote:
> >>>> Hi All,
> >>>>
> >>>> Over the last few years we have had a few complaints that syscall
> >>>> performance on arm64 is slower than on x86. Most recently, it was
> >>>> observed that a certain Java benchmark that does a lot of fstat and
> >>>> lseek spends ~10% of its time in get_random_u16(). Cue a bit of
> >>>> digging, which led me to [1] and also to some new ideas about how
> >>>> performance could be improved.
> >>
> >>
> >>>> I believe this helps the mean latency significantly without
> >>>> sacrificing any strength. But it doesn't reduce the tail latency,
> >>>> because we still have to call into the crng eventually.
> >>>>
> >>>> So here's another idea: could we use siphash to generate some random
> >>>> bits? We would generate the secret key at boot using the crng, then
> >>>> generate a 64-bit siphash of (cntvct_el0 ^ tweak) (where tweak
> >>>> increments every time we generate a new hash). As long as the key
> >>>> remains secret, the hash is unpredictable. (Perhaps we don't even
> >>>> need the timer value.) Every hash gives us 64 bits, so that would
> >>>> last for 10 syscalls at 6 bits per call. We would still have to call
> >>>> siphash every 10 syscalls, so there would still be a tail, but from
> >>>> my experiments it's much smaller than with the crng:
> >>
> >> IIRC, Jason argued against creating another type of prng inside of the
> >> kernel for a special purpose.
>
> > Yes indeed... I'm really not a fan of adding bespoke crypto willy-nilly
> > like that. Let's make get_random_u*() faster. If you're finding that the
> > issue with it is the locking, and that you're calling this from irq
> > context anyway, then your proposal (if I read this discussion correctly)
> > to add a raw_get_random_u*() seems like it could be sensible. Those
> > functions are generated via macro anyway, so it wouldn't be too much to
> > add the raw overloads. Feel free to send a patch to my random.git tree
> > if you'd like to give that a try.
>
> Thanks Jason; that's exactly what I did, and it helps. But I think
> ultimately the get_random_uXX() slow path is too slow; that's the part
> that causes the tail latency problem. I doubt there are options for
> speeding that up?
>
> Anyway, I'm currently prototyping a few options and getting clear
> performance numbers. I'll be back in a couple of days and we can
> continue the discussion in light of the data.
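For concreteness, the batching scheme described above could look roughly
like the sketch below. This is only an illustration, not a proposed patch:
the kstack_rnd_* names are invented here, while siphash_1u64(),
get_random_bytes(), the per-CPU helpers and __arch_counter_get_cntvct()
are existing kernel interfaces; details such as preemption safety and crng
readiness at early boot are glossed over.

/*
 * Sketch only: batch siphash output and hand out 6 bits per syscall.
 * The kstack_rnd_* names are made up for illustration.
 */
#include <linux/init.h>
#include <linux/percpu.h>
#include <linux/random.h>
#include <linux/siphash.h>
#include <linux/types.h>
#include <asm/arch_timer.h>

static siphash_key_t kstack_rnd_key;

struct kstack_rnd_batch {
	u64 bits;	/* leftover hash output */
	u8 count;	/* 6-bit chunks remaining */
	u64 tweak;	/* incremented per refill so inputs never repeat */
};
static DEFINE_PER_CPU(struct kstack_rnd_batch, kstack_rnd_batch);

static int __init kstack_rnd_init(void)
{
	/* Secret key comes from the crng once, at boot. */
	get_random_bytes(&kstack_rnd_key, sizeof(kstack_rnd_key));
	return 0;
}
late_initcall(kstack_rnd_init);

/* Returns 6 pseudorandom bits; calls siphash only every ~10th invocation. */
static u16 kstack_rnd_6bits(void)
{
	/* Assumes the caller already prevents migration (syscall entry). */
	struct kstack_rnd_batch *b = this_cpu_ptr(&kstack_rnd_batch);
	u16 ret;

	if (!b->count) {
		/* Timer value is optional; secrecy rests on the key alone. */
		u64 in = __arch_counter_get_cntvct() ^ b->tweak++;

		b->bits = siphash_1u64(in, &kstack_rnd_key);
		b->count = 64 / 6;
	}
	ret = b->bits & 0x3f;
	b->bits >>= 6;
	b->count--;
	return ret;
}

The tweak only guarantees distinct siphash inputs; as noted above, the
unpredictability argument rests entirely on the key staying secret.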
Interesting... I would be curious to see what sorts of stable numbers you
find, because most of the time get_random_uXX() should just be copying
memory. Does the unlikely slower case really matter that much? I suspect
it doesn't matter for anything real.

On the other hand, it's probably possible to improve the slow path on ARM
a bit by using the pure-ARM assembly ChaCha implementation that we use in
the vDSO:

https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/tree/arch/arm64/kernel/vdso/vgetrandom-chacha.S

Or by using the non-generic code already provided by lib/crypto from
random.c.

Jason
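As a rough sketch of the fast/slow split discussed in this thread, a
lock-free raw_get_random_u16() along the proposed lines might look
something like the following. This is not the actual drivers/char/random.c
code: the real per-CPU batches are refilled straight from ChaCha and carry
locking and reseed-generation tracking, all omitted here. It only shows why
the common case is a plain memory copy.

/*
 * Illustration only -- not the real drivers/char/random.c implementation.
 * Shows the shape of a lockless per-CPU batch for callers that are
 * already in irq context with interrupts off (the "raw_" idea above).
 */
#include <linux/kernel.h>
#include <linux/percpu.h>
#include <linux/random.h>
#include <linux/types.h>

struct u16_batch {
	u16 entropy[48];	/* pre-generated random output */
	unsigned int count;	/* values still unused; 0 means refill */
};
static DEFINE_PER_CPU(struct u16_batch, raw_u16_batch);

static u16 raw_get_random_u16(void)
{
	/* Caller runs with irqs off, so this_cpu_ptr() needs no locking. */
	struct u16_batch *batch = this_cpu_ptr(&raw_u16_batch);

	/*
	 * Rare case (the latency tail): refill the whole batch. The real
	 * code fills this from ChaCha output, not get_random_bytes().
	 */
	if (unlikely(!batch->count)) {
		get_random_bytes(batch->entropy, sizeof(batch->entropy));
		batch->count = ARRAY_SIZE(batch->entropy);
	}

	/* Common case: just copy out the next pre-generated value. */
	return batch->entropy[--batch->count];
}

The point being made above is that the common case is a single per-CPU
array read; only the occasional refill ever touches the slow path.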