From: Sebastian Andrzej Siewior
To: Thomas Gleixner, Florian Albertz
Cc: mingo@redhat.com, linux-kernel@vger.kernel.org, Peter Zijlstra
Subject: Re: PROBLEM: Kernel 6.17 newly deadlocks futex
Date: Fri, 9 Jan 2026 17:56:28 +0100
Message-ID: <20260109165628.Lt2MGP7M@linutronix.de>
References: <1d9fe0eb-11a0-4f8e-a8e7-57e1756193d3@app.fastmail.com> <873456b5hq.ffs@tglx>
In-Reply-To: <873456b5hq.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2025-12-19 21:07:13 [+0100], Thomas Gleixner wrote:
> On Fri, Dec 19 2025 at 11:02, Florian Albertz wrote:
…
> >   clone(child, malloc(STACK_SIZE) + STACK_SIZE, CLONE_VM, NULL, NULL, NULL);
> >
> >   // And now this futex wait never wakes from kernel 6.17 onwards.
> >   syscall(SYS_futex, fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
> > }
>
> The below should fix that. It's not completely correct because the
> resulting hash sizing looks at current->signal->threads. As signal is
> not shared each resulting process accounts for their own threads. Fixing
> that needs some more thoughts.

I'm not sure if I'm mixing things up, or if it was based on an earlier
version where things were different, but if I remember correctly PeterZ
said that if someone uses CLONE_VM without CLONE_THREAD then he can keep
the pieces.

Using only CLONE_VM is okay (well, it is not, but it is not what causes
the problem here). Using CLONE_VM for some clone() invocations and
CLONE_VM + CLONE_THREAD for others is what causes the problem. Who is
doing this? Some exotic early container runtime?

CLONE_VM without CLONE_THREAD is common with CLONE_VFORK, and in this
case we don't want to create the private hash.

I'm not sure if it is worth the effort. The wrong or inaccurate
get_nr_threads() value shouldn't be a problem given the situation. I
would suggest limiting it to "CLONE_THREAD | CLONE_VM" or
"!CLONE_THREAD && CLONE_VM" if we really want to support this.

> Thanks,
>
>         tglx

Sebastian
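
To make the two flag conditions above concrete, here is a minimal userspace
sketch of how they could be written as predicates. This is not kernel code
and not the proposed patch: the helper names is_thread_clone() and
is_vm_only_clone() are hypothetical, and excluding CLONE_VFORK in the second
one is my assumption, based on the remark that no private hash is wanted in
that case.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>

/* Fallbacks with the Linux UAPI values, in case <sched.h> does not expose them. */
#ifndef CLONE_VM
#define CLONE_VM	0x00000100
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK	0x00004000
#endif
#ifndef CLONE_THREAD
#define CLONE_THREAD	0x00010000
#endif

/*
 * Hypothetical helper, not kernel code: the "CLONE_THREAD | CLONE_VM"
 * option, i.e. only a real thread of the same process qualifies.
 */
static bool is_thread_clone(unsigned long flags)
{
	const unsigned long mask = CLONE_THREAD | CLONE_VM;

	return (flags & mask) == mask;
}

/*
 * Hypothetical helper: the "!CLONE_THREAD && CLONE_VM" option, i.e. a
 * shared mm without CLONE_THREAD as in the report. Leaving out
 * CLONE_VFORK is an assumption, see the text above.
 */
static bool is_vm_only_clone(unsigned long flags)
{
	return (flags & CLONE_VM) &&
	       !(flags & (CLONE_THREAD | CLONE_VFORK));
}
```

With these, the clone() call from the report (CLONE_VM alone) matches only
is_vm_only_clone(), a regular thread creation matches only
is_thread_clone(), and a vfork-style CLONE_VM | CLONE_VFORK matches neither.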