From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2a93eae0c4b364d44fffe3840dcbb0a60a1c6114.camel@gmail.com>
Subject: Re: Regression on linux-next (next-20260324 )
From: "mikhail.v.gavrilov@gmail.com"
To: Peter Zijlstra, John Stultz
Cc: K Prateek Nayak, "Borah, Chaitanya Kumar", willy@infradead.org,
	linux-kernel@vger.kernel.org, "intel-gfx@lists.freedesktop.org",
	"intel-xe@lists.freedesktop.org", "Kurmi, Suresh Kumar",
	"Saarinen, Jani", ravitejax.veesam@intel.com
Date: Wed, 22 Apr 2026 20:52:08 +0500
In-Reply-To: <20260422092335.GH3102924@noisy.programming.kicks-ass.net>
References: <20260330195037.GW2872@noisy.programming.kicks-ass.net>
	<20260420130318.GD3102924@noisy.programming.kicks-ass.net>
	<20260421101521.GO3102624@noisy.programming.kicks-ass.net>
	<95651a71-1adf-45ba-83eb-5744bc6d4a52@amd.com>
	<20260421143752.GD1064669@noisy.programming.kicks-ass.net>
	<20260421205647.GL3126523@noisy.programming.kicks-ass.net>
	<20260422092335.GH3102924@noisy.programming.kicks-ass.net>
Content-Type: text/plain; charset="UTF-8"
User-Agent: Evolution 3.60.1 (3.60.1-1.fc45)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0

On Wed, 2026-04-22 at 11:23 +0200, Peter Zijlstra wrote:
> 
> How's this? It 'passes' the ww_mutex selftest thing in so far as that I
> get the same:
> 
> [    2.312369] Beginning ww (wound) mutex selftests
> [    4.853240] stress (stress_inorder_work) failed with -35
> [    9.379572] Beginning ww (die) mutex selftests
> [   16.435831] All ww mutex selftests passed
> 
> before the offending commit and after this patch.
> 
> ---
> Subject: locking/mutex: Fix ww_mutex wait_list operations
> From: Peter Zijlstra
> Date: Wed Apr 22 10:38:41 CEST 2026
> 
> Chaitanya and John reported commit 25500ba7e77c ("locking/mutex: Remove the
> list_head from struct mutex") wrecked ww_mutex.
> 
> Specifically there were 2 issues:
> 
>  - __ww_waiter_prev() had the termination condition wrong; it would terminate
>    when the previous entry was the first, which results in a truncated
>    iteration: W3, W2, (no W1).
> 
>  - __mutex_add_waiter(@pos != NULL), as used by __ww_waiter_add() /
>    __ww_mutex_add_waiter(); this inserts @waiter before @pos (which is what
>    list_add_tail() does). But this should then also update lock->first_waiter.
> 
> Much thanks to Prateek for spotting the __mutex_add_waiter() issue!
> 
> Fixes: 25500ba7e77c ("locking/mutex: Remove the list_head from struct mutex")
> Reported-by: "Borah, Chaitanya Kumar"
> Closes: https://lore.kernel.org/r/af005996-05e9-4336-8450-d14ca652ba5d%40intel.com
> Reported-by: John Stultz
> Closes: https://lore.kernel.org/r/CANDhNCq%3Doizzud3hH3oqGzTrcjB8OwGeineJ3mwZuGdDWG8fRQ%40mail.gmail.com
> Debugged-by: K Prateek Nayak
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  kernel/locking/mutex.c    |   40 +++++++++++++++++++++++++++-------------
>  kernel/locking/ww_mutex.h |   34 ++++++++++++++++++++++++++++++++--
>  2 files changed, 59 insertions(+), 15 deletions(-)
> 
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -198,27 +198,43 @@ static inline void __mutex_clear_flag(st
>  }
>  
>  /*
> - * Add @waiter to a given location in the lock wait_list and set the
> - * FLAG_WAITERS flag if it's the first waiter.
> + * Add @waiter to the @lock wait_list and set the FLAG_WAITERS flag if it's
> + * the first waiter.
> + *
> + * When @pos, @waiter is added before the waiter indicated by @pos. Otherwise
> + * @waiter will be added to the tail of the list.
>   */
>  static void
>  __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
> -		   struct mutex_waiter *first)
> +		   struct mutex_waiter *pos)
>  	__must_hold(&lock->wait_lock)
>  {
> +	struct mutex_waiter *first = lock->first_waiter;
> +
>  	hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX);
>  	debug_mutex_add_waiter(lock, waiter, current);
>  
> -	if (!first)
> -		first = lock->first_waiter;
> +	if (pos) {
> +		/*
> +		 * Insert @waiter before @pos.
> +		 */
> +		list_add_tail(&waiter->list, &pos->list);
> +		/*
> +		 * If @pos == @first, then @waiter will be the new first.
> +		 */
> +		if (pos == first)
> +			lock->first_waiter = waiter;
> +		return;
> +	}
>  
>  	if (first) {
>  		list_add_tail(&waiter->list, &first->list);
> -	} else {
> -		INIT_LIST_HEAD(&waiter->list);
> -		lock->first_waiter = waiter;
> -		__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
> +		return;
>  	}
> +
> +	INIT_LIST_HEAD(&waiter->list);
> +	lock->first_waiter = waiter;
> +	__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
>  }
>  
>  static void
> @@ -229,10 +245,8 @@ __mutex_remove_waiter(struct mutex *lock
>  		__mutex_clear_flag(lock, MUTEX_FLAGS);
>  		lock->first_waiter = NULL;
>  	} else {
> -		if (lock->first_waiter == waiter) {
> -			lock->first_waiter = list_first_entry(&waiter->list,
> -							      struct mutex_waiter, list);
> -		}
> +		if (lock->first_waiter == waiter)
> +			lock->first_waiter = list_next_entry(waiter, list);
>  		list_del(&waiter->list);
>  	}
>  
> --- a/kernel/locking/ww_mutex.h
> +++ b/kernel/locking/ww_mutex.h
> @@ -6,6 +6,19 @@
>  #define MUTEX_WAITER	mutex_waiter
>  #define WAIT_LOCK	wait_lock
>  
> +/*
> + *           +--------+
> + *           | first  |
> + *           +--------+
> + *                |
> + *                v
> + *  +----+     +----+     +----+
> + *  | W3 | <-> | W1 | <-> | W2 |
> + *  +----+     +----+     +----+
> + *    ^                     ^
> + *    +---------------------+
> + */
> +
>  static inline struct mutex_waiter *
>  __ww_waiter_first(struct mutex *lock)
>  	__must_hold(&lock->wait_lock)
> @@ -13,26 +26,43 @@ __ww_waiter_first(struct mutex *lock)
>  	return lock->first_waiter;
>  }
>  
> +/*
> + * for (cur = __ww_waiter_first(); cur; cur = __ww_waiter_next())
> + *
> + * Should iterate like: W1, W2, W3
> + */
>  static inline struct mutex_waiter *
>  __ww_waiter_next(struct mutex *lock, struct mutex_waiter *w)
>  	__must_hold(&lock->wait_lock)
>  {
>  	w = list_next_entry(w, list);
> +	/*
> +	 * Terminate if the next entry is the first again, that has already
> +	 * been observed.
> +	 */
>  	if (lock->first_waiter == w)
>  		return NULL;
>  
>  	return w;
>  }
>  
> +/*
> + * for (cur = __ww_waiter_last(); cur; cur = __ww_waiter_prev())
> + *
> + * Should iterate like: W3, W2, W1
> + */
>  static inline struct mutex_waiter *
>  __ww_waiter_prev(struct mutex *lock, struct mutex_waiter *w)
>  	__must_hold(&lock->wait_lock)
>  {
> -	w = list_prev_entry(w, list);
> +	/*
> +	 * Terminate at the first entry, the previous entry of first is the
> +	 * last and that has already been observed.
> +	 */
>  	if (lock->first_waiter == w)
>  		return NULL;
>  
> -	return w;
> +	return list_prev_entry(w, list);
>  }
>  
>  static inline struct mutex_waiter *

Confirmed on an independent userspace-visible reproducer: Resident Evil
2/3/4/9 under Proton on AMD Zen4 + RX 7900 XTX, which hangs
deterministically during level load on current master (main thread
parked in futex_waitv). With this patch applied on top of master, both
RE2 and RE9 complete a full playthrough with save-resume on two
independent workstations (ASUS and ASRock B650). No hang, no splats.

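For anyone following the list manipulation without the kernel tree at
hand, here is a minimal userspace sketch of the two behaviours the
quoted patch describes: inserting before @pos has to move the first
pointer when @pos is the first waiter, and the backward walk has to stop
at the first entry rather than one entry early. This is only an
illustration under assumptions, not the kernel code: struct waiter,
struct lock, link_before() and the walk_*() helpers are hypothetical
stand-ins for struct mutex_waiter, struct mutex and the list_head
helpers, and wait_lock, the flag bits and the debug hooks are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's types. */
struct waiter {
	struct waiter *prev, *next;	/* circular, no separate list head */
	int id;
};

struct lock {
	struct waiter *first;		/* NULL when nobody is waiting */
};

/* Link @w immediately before @pos (what list_add_tail() does). */
void link_before(struct waiter *w, struct waiter *pos)
{
	w->prev = pos->prev;
	w->next = pos;
	pos->prev->next = w;
	pos->prev = w;
}

/* Add @w before @pos, or at the tail when @pos is NULL. */
void add_waiter(struct lock *l, struct waiter *w, struct waiter *pos)
{
	assert(l && w);
	if (pos) {
		link_before(w, pos);
		if (pos == l->first)	/* @w is now the new first */
			l->first = w;
		return;
	}
	if (l->first) {
		link_before(w, l->first);	/* before first == tail */
		return;
	}
	w->prev = w->next = w;
	l->first = w;
}

struct waiter *waiter_last(struct lock *l)
{
	return l->first ? l->first->prev : NULL;
}

/* Stop once the walk wraps around to the first entry again. */
struct waiter *waiter_next(struct lock *l, struct waiter *w)
{
	w = w->next;
	return w == l->first ? NULL : w;
}

/* Stop *at* the first entry; its prev is the last, already visited. */
struct waiter *waiter_prev(struct lock *l, struct waiter *w)
{
	if (w == l->first)
		return NULL;
	return w->prev;
}

/* Collect ids first-to-last; returns the number of waiters visited. */
int walk_forward(struct lock *l, int *out, int max)
{
	int n = 0;

	for (struct waiter *w = l->first; w && n < max; w = waiter_next(l, w))
		out[n++] = w->id;
	return n;
}

/* Collect ids last-to-first; returns the number of waiters visited. */
int walk_backward(struct lock *l, int *out, int max)
{
	int n = 0;

	for (struct waiter *w = waiter_last(l); w && n < max; w = waiter_prev(l, w))
		out[n++] = w->id;
	return n;
}
```

With W1, W2 and W3 added at the tail, walk_backward() visits all three
instead of dropping W1; adding a fourth waiter before W1 makes it the
new first, and walk_forward() then starts from it.
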
Symptom details and third bisect log are in the separate thread at
https://lore.kernel.org/r/CABXGCsO5fKq2nD9nO8yO1z50ZzgCPWqueNXHANjntaswoOh2Dg@mail.gmail.com

Tested-by: Mikhail Gavrilov

-- 
Thanks,
Mikhail