All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: "Ingo Molnar" <mingo@redhat.com>, "Will Deacon" <will@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	linux-kernel@vger.kernel.org, john.p.donnelly@oracle.com,
	"Hillf Danton" <hdanton@sina.com>,
	"Mukesh Ojha" <quic_mojha@quicinc.com>,
	"Ting11 Wang 王婷" <wangting11@xiaomi.com>
Subject: Re: [PATCH v7 4/4] locking/rwsem: Enable direct rwsem lock handoff
Date: Mon, 13 Feb 2023 13:31:59 +0100	[thread overview]
Message-ID: <Y+otv+QGyMpHAFO1@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20230126003628.365092-5-longman@redhat.com>

On Wed, Jan 25, 2023 at 07:36:28PM -0500, Waiman Long wrote:

> @@ -609,6 +618,12 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>  
>  	lockdep_assert_held(&sem->wait_lock);
>  
> +	if (!waiter->task) {
> +		/* Write lock handed off */
> +		smp_acquire__after_ctrl_dep();
> +		return true;
> +	}
> +
>  	count = atomic_long_read(&sem->count);
>  	do {
>  		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
> @@ -754,6 +769,10 @@ rwsem_spin_on_owner(struct rw_semaphore *sem)
>  
>  	owner = rwsem_owner_flags(sem, &flags);
>  	state = rwsem_owner_state(owner, flags);
> +
> +	if (owner == current)
> +		return OWNER_NONSPINNABLE;	/* Handoff granted */
> +
>  	if (state != OWNER_WRITER)
>  		return state;
>  
> @@ -1168,21 +1186,23 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  		 * without sleeping.
>  		 */
>  		if (waiter.handoff_set) {
> -			enum owner_state owner_state;
> -
> -			owner_state = rwsem_spin_on_owner(sem);
> -			if (owner_state == OWNER_NULL)
> -				goto trylock_again;
> +			rwsem_spin_on_owner(sem);
> +			if (!READ_ONCE(waiter.task)) {
> +				/* Write lock handed off */
> +				smp_acquire__after_ctrl_dep();
> +				set_current_state(TASK_RUNNING);
> +				goto out;
> +			}
>  		}
>  
>  		schedule_preempt_disabled();
>  		lockevent_inc(rwsem_sleep_writer);
>  		set_current_state(state);
> -trylock_again:
>  		raw_spin_lock_irq(&sem->wait_lock);
>  	}
>  	__set_current_state(TASK_RUNNING);
>  	raw_spin_unlock_irq(&sem->wait_lock);
> +out:
>  	lockevent_inc(rwsem_wlock);
>  	trace_contention_end(sem, 0);
>  	return sem;
> @@ -1190,6 +1210,11 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  out_nolock:
>  	__set_current_state(TASK_RUNNING);
>  	raw_spin_lock_irq(&sem->wait_lock);
> +	if (!waiter.task) {
> +		smp_acquire__after_ctrl_dep();
> +		raw_spin_unlock_irq(&sem->wait_lock);
> +		goto out;
> +	}
>  	rwsem_del_wake_waiter(sem, &waiter, &wake_q);
>  	lockevent_inc(rwsem_wlock_fail);
>  	trace_contention_end(sem, -EINTR);
> @@ -1202,14 +1227,41 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>   */
>  static struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem)
>  {
> -	unsigned long flags;
>  	DEFINE_WAKE_Q(wake_q);
> +	unsigned long flags;
> +	unsigned long count;
>  
>  	raw_spin_lock_irqsave(&sem->wait_lock, flags);
>  
> -	if (!list_empty(&sem->wait_list))
> -		rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
> +	if (list_empty(&sem->wait_list))
> +		goto unlock_out;
> +
> +	/*
> +	 * If the rwsem is free and handoff flag is set with wait_lock held,
> +	 * no other CPUs can take an active lock.
> +	 */
> +	count = atomic_long_read(&sem->count);
> +	if (!(count & RWSEM_LOCK_MASK) && (count & RWSEM_FLAG_HANDOFF)) {
> +		/*
> +		 * Since rwsem_mark_wake() will handle the handoff to reader
> +		 * properly, we don't need to do anything extra for reader.
> +		 * Special handoff processing will only be needed for writer.
> +		 */
> +		struct rwsem_waiter *waiter = rwsem_first_waiter(sem);
> +		long adj = RWSEM_WRITER_LOCKED - RWSEM_FLAG_HANDOFF;
> +
> +		if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
> +			atomic_long_set(&sem->owner, (long)waiter->task);
> +			atomic_long_add(adj, &sem->count);
> +			wake_q_add(&wake_q, waiter->task);
> +			rwsem_del_waiter(sem, waiter);
> +			waiter->task = NULL;	/* Signal the handoff */
> +			goto unlock_out;
> +		}
> +	}
> +	rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
>  
> +unlock_out:
>  	raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
>  	wake_up_q(&wake_q);
>  

I am once again confused...

*WHY* are you changing the writer wake-up path? The comments added here
don't clarify anything.

If we set handoff, we terminate/disallow the spinning/stealing. The
direct consequence is that the slowpath/wait-list becomes the only way
forward.

Since we don't take wait_lock on up, we fundamentally have a race
condition. But *WHY* do you insist on handling that in rwsem_wake()?
Delaying all that until rwsem_try_write_lock()? Doing so would render
pretty much all of the above pointless, no?

After all, rwsem_mark_wake() already wakes the writer if it is first,
no? Why invent yet another special way to wake up the writer.


Also; and I asked this last time around; why do we care about the
handoff to writer *at*all* ? It is the readers that set HANDOFF.


  parent reply	other threads:[~2023-02-13 12:32 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-26  0:36 [PATCH v7 0/4] lockinig/rwsem: Fix rwsem bugs & enable true lock handoff Waiman Long
2023-01-26  0:36 ` [PATCH v7 1/4] locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath Waiman Long
2023-01-26 11:38   ` [tip: locking/core] " tip-bot2 for Waiman Long
2023-01-26  0:36 ` [PATCH v7 2/4] locking/rwsem: Disable preemption at all down_read*() and up_read() code paths Waiman Long
2023-01-26 11:38   ` [tip: locking/core] locking/rwsem: Disable preemption in " tip-bot2 for Waiman Long
2023-01-26  0:36 ` [PATCH v7 3/4] locking/rwsem: Disable preemption at all down_write*() and up_write() " Waiman Long
2023-01-26 11:38   ` [tip: locking/core] locking/rwsem: Disable preemption in " tip-bot2 for Waiman Long
2023-01-26  0:36 ` [PATCH v7 4/4] locking/rwsem: Enable direct rwsem lock handoff Waiman Long
2023-02-12 13:33   ` kernel test robot
2023-02-13 12:31   ` Peter Zijlstra [this message]
2023-02-13 17:14     ` Waiman Long
2023-02-13 21:52       ` Peter Zijlstra
2023-02-14  2:03         ` Waiman Long
2023-02-23  1:44   ` kernel test robot
2023-01-26 12:46 ` [PATCH v7 0/4] lockinig/rwsem: Fix rwsem bugs & enable true " Ingo Molnar
2023-01-26 13:17   ` Waiman Long
     [not found] ` <20230126100441.4823-1-hdanton@sina.com>
2023-01-26 12:59   ` [PATCH v7 4/4] locking/rwsem: Enable direct rwsem " Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+otv+QGyMpHAFO1@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=boqun.feng@gmail.com \
    --cc=hdanton@sina.com \
    --cc=john.p.donnelly@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=quic_mojha@quicinc.com \
    --cc=wangting11@xiaomi.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.