From: "mikhail.v.gavrilov@gmail.com" <mikhail.v.gavrilov@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>, John Stultz <jstultz@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
	"Borah, Chaitanya Kumar"	 <chaitanya.kumar.borah@intel.com>,
	willy@infradead.org,  linux-kernel@vger.kernel.org,
	"intel-gfx@lists.freedesktop.org"	
	<intel-gfx@lists.freedesktop.org>,
	"intel-xe@lists.freedesktop.org"	
	<intel-xe@lists.freedesktop.org>,
	"Kurmi, Suresh Kumar"	 <suresh.kumar.kurmi@intel.com>,
	"Saarinen, Jani" <jani.saarinen@intel.com>,
	ravitejax.veesam@intel.com
Subject: Re: Regression on linux-next (next-20260324 )
Date: Wed, 22 Apr 2026 20:52:08 +0500
Message-ID: <2a93eae0c4b364d44fffe3840dcbb0a60a1c6114.camel@gmail.com>
In-Reply-To: <20260422092335.GH3102924@noisy.programming.kicks-ass.net>

On Wed, 2026-04-22 at 11:23 +0200, Peter Zijlstra wrote:
> 
> How's this? It 'passes' the ww_mutex selftest thing in so far as that I
> get the same:
> 
> [    2.312369] Beginning ww (wound) mutex selftests
> [    4.853240] stress (stress_inorder_work) failed with -35
> [    9.379572] Beginning ww (die) mutex selftests
> [   16.435831] All ww mutex selftests passed
> 
> before the offending commit and after this patch.
> 
> ---
> Subject: locking/mutex: Fix ww_mutex wait_list operations
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Apr 22 10:38:41 CEST 2026
> 
> Chaitanya and John reported commit 25500ba7e77c ("locking/mutex: Remove the
> list_head from struct mutex") wrecked ww_mutex.
> 
> Specifically there were 2 issues:
> 
>  - __ww_waiter_prev() had the termination condition wrong; it would terminate
>    when the previous entry was the first, which results in a truncated
>    iteration: W3, W2, (no W1).
> 
>  - __mutex_add_waiter(@pos != NULL), as used by __ww_waiter_add() /
>    __ww_mutex_add_waiter(); this inserts @waiter before @pos (which is what
>    list_add_tail() does). But this should then also update lock->first_waiter.
> 
> Much thanks to Prateek for spotting the __mutex_add_waiter() issue!
> 
> Fixes: 25500ba7e77c ("locking/mutex: Remove the list_head from struct mutex")
> Reported-by: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com>
> Closes: https://lore.kernel.org/r/af005996-05e9-4336-8450-d14ca652ba5d%40intel.com
> Reported-by: John Stultz <jstultz@google.com>
> Closes: https://lore.kernel.org/r/CANDhNCq%3Doizzud3hH3oqGzTrcjB8OwGeineJ3mwZuGdDWG8fRQ%40mail.gmail.com
> Debugged-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/locking/mutex.c    |   40 +++++++++++++++++++++++++++-------------
>  kernel/locking/ww_mutex.h |   34 ++++++++++++++++++++++++++++++++--
>  2 files changed, 59 insertions(+), 15 deletions(-)
> 
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -198,27 +198,43 @@ static inline void __mutex_clear_flag(st
>  }
>  
>  /*
> - * Add @waiter to a given location in the lock wait_list and set the
> - * FLAG_WAITERS flag if it's the first waiter.
> + * Add @waiter to the @lock wait_list and set the FLAG_WAITERS flag if it's
> + * the first waiter.
> + *
> + * When @pos, @waiter is added before the waiter indicated by @pos. Otherwise
> + * @waiter will be added to the tail of the list.
>   */
>  static void
>  __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
> -		   struct mutex_waiter *first)
> +		   struct mutex_waiter *pos)
>  	__must_hold(&lock->wait_lock)
>  {
> +	struct mutex_waiter *first = lock->first_waiter;
> +
>  	hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX);
>  	debug_mutex_add_waiter(lock, waiter, current);
>  
> -	if (!first)
> -		first = lock->first_waiter;
> +	if (pos) {
> +		/*
> +		 * Insert @waiter before @pos.
> +		 */
> +		list_add_tail(&waiter->list, &pos->list);
> +		/*
> +		 * If @pos == @first, then @waiter will be the new first.
> +		 */
> +		if (pos == first)
> +			lock->first_waiter = waiter;
> +		return;
> +	}
>  
>  	if (first) {
>  		list_add_tail(&waiter->list, &first->list);
> -	} else {
> -		INIT_LIST_HEAD(&waiter->list);
> -		lock->first_waiter = waiter;
> -		__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
> +		return;
>  	}
> +
> +	INIT_LIST_HEAD(&waiter->list);
> +	lock->first_waiter = waiter;
> +	__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
>  }
>  
>  static void
> @@ -229,10 +245,8 @@ __mutex_remove_waiter(struct mutex *lock
>  		__mutex_clear_flag(lock, MUTEX_FLAGS);
>  		lock->first_waiter = NULL;
>  	} else {
> -		if (lock->first_waiter == waiter) {
> -			lock->first_waiter = list_first_entry(&waiter->list,
> -							      struct mutex_waiter, list);
> -		}
> +		if (lock->first_waiter == waiter)
> +			lock->first_waiter = list_next_entry(waiter, list);
>  		list_del(&waiter->list);
>  	}
>  
> --- a/kernel/locking/ww_mutex.h
> +++ b/kernel/locking/ww_mutex.h
> @@ -6,6 +6,19 @@
>  #define MUTEX_WAITER	mutex_waiter
>  #define WAIT_LOCK	wait_lock
>  
> +/*
> + *           +--------+
> + *           | first  |
> + *           +--------+
> + *                |
> + *                v
> + *  +----+     +----+     +----+
> + *  | W3 | <-> | W1 | <-> | W2 |
> + *  +----+     +----+     +----+
> + *    ^                     ^
> + *    +---------------------+
> + */
> +
>  static inline struct mutex_waiter *
>  __ww_waiter_first(struct mutex *lock)
>  	__must_hold(&lock->wait_lock)
> @@ -13,26 +26,43 @@ __ww_waiter_first(struct mutex *lock)
>  	return lock->first_waiter;
>  }
>  
> +/*
> + * for (cur = __ww_waiter_first(); cur; cur = __ww_waiter_next())
> + *
> + * Should iterate like: W1, W2, W3
> + */
>  static inline struct mutex_waiter *
>  __ww_waiter_next(struct mutex *lock, struct mutex_waiter *w)
>  	__must_hold(&lock->wait_lock)
>  {
>  	w = list_next_entry(w, list);
> +	/*
> +	 * Terminate if the next entry is the first again, that has already
> +	 * been observed.
> +	 */
>  	if (lock->first_waiter == w)
>  		return NULL;
>  
>  	return w;
>  }
>  
> +/*
> + * for (cur = __ww_waiter_last(); cur; cur = __ww_waiter_prev())
> + *
> + * Should iterate like: W3, W2, W1
> + */
>  static inline struct mutex_waiter *
>  __ww_waiter_prev(struct mutex *lock, struct mutex_waiter *w)
>  	__must_hold(&lock->wait_lock)
>  {
> -	w = list_prev_entry(w, list);
> +	/*
> +	 * Terminate at the first entry, the previous entry of first is the
> +	 * last and that has already been observed.
> +	 */
>  	if (lock->first_waiter == w)
>  		return NULL;
>  
> -	return w;
> +	return list_prev_entry(w, list);
>  }
>  
>  static inline struct mutex_waiter *
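
For anyone skimming the thread, here is a minimal userspace sketch of the
two list invariants the patch restores. It is not the kernel code: struct
waiter, struct lock and all helper names are hypothetical stand-ins for
struct mutex_waiter, struct mutex and the __ww_waiter_*() /
__mutex_add_waiter() helpers, and waiter_last() assumes the last waiter is
simply first->prev. The FLAG_WAITERS handling and locking are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mutex_waiter on a circular list. */
struct waiter {
	struct waiter *prev, *next;
	int id;
};

/* Hypothetical stand-in for struct mutex: only the first-waiter pointer. */
struct lock {
	struct waiter *first;
};

/* Insert @w before @pos; with @pos == NULL, append at the tail. */
static void add_waiter(struct lock *l, struct waiter *w, struct waiter *pos)
{
	struct waiter *at = pos ? pos : l->first;

	if (!at) {
		/* Empty list: @w links to itself and becomes first. */
		w->prev = w->next = w;
		l->first = w;
		return;
	}
	/* Link @w between at->prev and at, as list_add_tail() would. */
	w->prev = at->prev;
	w->next = at;
	at->prev->next = w;
	at->prev = w;
	/* Inserting before the current first makes @w the new first. */
	if (pos && pos == l->first)
		l->first = w;
}

static struct waiter *waiter_first(struct lock *l)
{
	return l->first;
}

/* Assumption: the last waiter is the one just before first. */
static struct waiter *waiter_last(struct lock *l)
{
	return l->first ? l->first->prev : NULL;
}

/* Forward step: stop once we wrap around to first again. */
static struct waiter *waiter_next(struct lock *l, struct waiter *w)
{
	w = w->next;
	return w == l->first ? NULL : w;
}

/* Backward step (fixed): stop when @w itself is first. */
static struct waiter *waiter_prev(struct lock *l, struct waiter *w)
{
	if (w == l->first)
		return NULL;
	return w->prev;
}

/* Backward step (old behaviour): steps first, then tests, so it skips W1. */
static struct waiter *waiter_prev_buggy(struct lock *l, struct waiter *w)
{
	w = w->prev;
	return w == l->first ? NULL : w;
}
```

With W1, W2, W3 appended in that order, walking from waiter_last() with
waiter_prev() visits W3, W2, W1, while waiter_prev_buggy() stops after W2.
Inserting a new waiter before W1 moves lock.first to the new waiter, which
is the update the old __mutex_add_waiter() was missing.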


Confirmed on an independent userspace-visible reproducer: Resident
Evil 2/3/4/9 under Proton on AMD Zen4 + RX 7900 XTX, which hangs
deterministically during level load on current master (main thread
parked in futex_waitv). With this patch applied on top of master,
both RE2 and RE9 complete a full playthrough with save-resume on two
independent workstations (ASUS and ASRock B650). No hang, no splats.

Symptom details and a third bisect log are in the separate thread at
https://lore.kernel.org/r/CABXGCsO5fKq2nD9nO8yO1z50ZzgCPWqueNXHANjntaswoOh2Dg@mail.gmail.com

Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>

-- 
Thanks,
Mikhail
