Date: Tue, 18 Nov 2025 18:15:17 +0100
From: "Jason A. Donenfeld"
To: Arnd Bergmann
Cc: Ryan Roberts, Kees Cook, Ard Biesheuvel, Jeremy Linton, Will Deacon,
 Catalin Marinas, Mark Rutland, linux-arm-kernel@lists.infradead.org,
 Linux Kernel Mailing List
Subject: Re: [DISCUSSION] kstack offset randomization: bugs and performance

On Mon, Nov 17, 2025 at 05:47:05PM +0100, Arnd Bergmann wrote:
> On Mon, Nov 17, 2025, at 12:31, Ryan Roberts wrote:
> > On 17/11/2025 11:30, Ryan Roberts wrote:
> >> Hi All,
> >>
> >> Over the last few years we have had a few complaints that syscall
> >> performance on arm64 is slower than on x86. Most recently, it was
> >> observed that a certain Java benchmark that does a lot of fstat and
> >> lseek is spending ~10% of its time in get_random_u16(). Cue a bit of
> >> digging, which led me to [1] and also to some new ideas about how
> >> performance could be improved.
> >
> >> I believe this helps the mean latency significantly without
> >> sacrificing any strength. But it doesn't reduce the tail latency,
> >> because we still have to call into the crng eventually.
> >>
> >> So here's another idea: could we use siphash to generate some random
> >> bits? We would generate the secret key at boot using the crng.
> >> Then generate a 64-bit siphash of (cntvct_el0 ^ tweak), where tweak
> >> increments every time we generate a new hash. As long as the key
> >> remains secret, the hash is unpredictable. (Perhaps we don't even
> >> need the timer value.) Every hash gives us 64 bits, which at 6 bits
> >> per call would last for 10 syscalls. We would still have to call
> >> siphash every 10 syscalls, so there would still be a tail, but from
> >> my experiments it's much smaller than with the crng:
>
> IIRC, Jason argued against creating another type of prng inside of the
> kernel for a special purpose.

Yes indeed... I'm really not a fan of adding bespoke crypto willy-nilly
like that. Let's make get_random_u*() faster.

If you're finding that the issue with it is the locking, and that you're
calling this from irq context anyway, then your proposal (if I read this
discussion correctly) to add a raw_get_random_u*() seems like it could
be sensible. Those functions are generated via a macro anyway, so it
wouldn't be too much work to add the raw overloads. Feel free to send a
patch to my random.git tree if you'd like to give that a try.

Jason
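[Editor's note: for readers following the thread, here is a minimal userspace
sketch of the buffering scheme Ryan describes above: hash (timer ^ tweak)
under a boot-time secret key, then hand out 6 bits per syscall so the hash
only needs recomputing every 10 calls. The kernel already has a real
implementation of this primitive (siphash_1u64() in <linux/siphash.h>); the
inlined SipHash-2-4 here just makes the example self-contained, and the
struct and function names are made up for illustration, not proposed API.]

```c
#include <assert.h>
#include <stdint.h>

#define ROTL64(x, b) ((uint64_t)(((x) << (b)) | ((x) >> (64 - (b)))))

#define SIPROUND do {                                                 \
        v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
        v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2;                      \
        v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0;                      \
        v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
    } while (0)

/* SipHash-2-4 of one 64-bit word; stands in for the kernel's siphash_1u64(). */
static uint64_t sip_1u64(uint64_t data, uint64_t k0, uint64_t k1)
{
    uint64_t v0 = 0x736f6d6570736575ULL ^ k0;
    uint64_t v1 = 0x646f72616e646f6dULL ^ k1;
    uint64_t v2 = 0x6c7967656e657261ULL ^ k0;
    uint64_t v3 = 0x7465646279746573ULL ^ k1;
    uint64_t len = 8ULL << 56;   /* 8-byte message, length in the top byte */

    v3 ^= data; SIPROUND; SIPROUND; v0 ^= data;
    v3 ^= len;  SIPROUND; SIPROUND; v0 ^= len;
    v2 ^= 0xff; SIPROUND; SIPROUND; SIPROUND; SIPROUND;
    return v0 ^ v1 ^ v2 ^ v3;
}

/* Hypothetical per-cpu state; key would be seeded once from the crng at boot. */
struct kstack_rng {
    uint64_t key[2];
    uint64_t tweak;  /* incremented per refill so hash inputs never repeat */
    uint64_t pool;   /* buffered hash output */
    unsigned bits;   /* unconsumed bits left in pool */
};

/* Hand out 6 bits per syscall; rehash only once the pool runs dry (10 calls). */
static unsigned kstack_rng_next6(struct kstack_rng *r, uint64_t timer)
{
    unsigned out;

    if (r->bits < 6) {
        r->pool = sip_1u64(timer ^ r->tweak++, r->key[0], r->key[1]);
        r->bits = 64;
    }
    out = r->pool & 0x3f;
    r->pool >>= 6;
    r->bits -= 6;
    return out;
}
```

In-kernel, the state would be per-cpu with preemption disabled around the
fast path, which is presumably where the raw_get_random_u*() locking
discussion above comes in.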