All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neeraj Upadhyay <neeraju@codeaurora.org>
To: paulmck@kernel.org
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, mingo@kernel.org, jiangshanlai@gmail.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
	fweisbec@gmail.com, oleg@redhat.com, joel@joelfernandes.org
Subject: Re: [PATCH RFC tip/core/rcu 4/5] srcu: Provide polling interfaces for Tiny SRCU grace periods
Date: Mon, 23 Nov 2020 10:04:23 +0530	[thread overview]
Message-ID: <3e2dceb0-5128-28c0-454f-2a60bd5ea4e5@codeaurora.org> (raw)
In-Reply-To: <20201122180105.GA1437@paulmck-ThinkPad-P72>



On 11/22/2020 11:31 PM, Paul E. McKenney wrote:
> On Sun, Nov 22, 2020 at 07:57:26PM +0530, Neeraj Upadhyay wrote:
>> On 11/21/2020 5:43 AM, Paul E. McKenney wrote:
>>> On Fri, Nov 20, 2020 at 05:28:32PM +0530, Neeraj Upadhyay wrote:
>>>> Hi Paul,
>>>>
>>>> On 11/17/2020 6:10 AM, paulmck@kernel.org wrote:
>>>>> From: "Paul E. McKenney" <paulmck@kernel.org>
>>>>>
>>>>> There is a need for a polling interface for SRCU grace
>>>>> periods, so this commit supplies get_state_synchronize_srcu(),
>>>>> start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this
>>>>> purpose.  The first can be used if future grace periods are inevitable
>>>>> (perhaps due to a later call_srcu() invocation), the second if future
>>>>> grace periods might not otherwise happen, and the third to check if a
>>>>> grace period has elapsed since the corresponding call to either of the
>>>>> first two.
>>>>>
>>>>> As with get_state_synchronize_rcu() and cond_synchronize_rcu(),
>>>>> the return value from either get_state_synchronize_srcu() or
>>>>> start_poll_synchronize_srcu() must be passed in to a later call to
>>>>> poll_state_synchronize_srcu().
>>>>>
>>>>> Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/
>>>>> Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
>>>>> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ]
>>>>> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>>>>> ---
>>>>>     include/linux/rcupdate.h |  2 ++
>>>>>     include/linux/srcu.h     |  3 +++
>>>>>     include/linux/srcutiny.h |  1 +
>>>>>     kernel/rcu/srcutiny.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++++--
>>>>>     4 files changed, 56 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
>>>>> index de08264..e09c0d8 100644
>>>>> --- a/include/linux/rcupdate.h
>>>>> +++ b/include/linux/rcupdate.h
>>>>> @@ -33,6 +33,8 @@
>>>>>     #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
>>>>>     #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
>>>>>     #define ulong2long(a)		(*(long *)(&(a)))
>>>>> +#define USHORT_CMP_GE(a, b)	(USHRT_MAX / 2 >= (unsigned short)((a) - (b)))
>>>>> +#define USHORT_CMP_LT(a, b)	(USHRT_MAX / 2 < (unsigned short)((a) - (b)))
>>>>>     /* Exported common interfaces */
>>>>>     void call_rcu(struct rcu_head *head, rcu_callback_t func);
>>>>> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
>>>>> index e432cc9..a0895bb 100644
>>>>> --- a/include/linux/srcu.h
>>>>> +++ b/include/linux/srcu.h
>>>>> @@ -60,6 +60,9 @@ void cleanup_srcu_struct(struct srcu_struct *ssp);
>>>>>     int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
>>>>>     void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
>>>>>     void synchronize_srcu(struct srcu_struct *ssp);
>>>>> +unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
>>>>> +unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
>>>>> +bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);
>>>>>     #ifdef CONFIG_DEBUG_LOCK_ALLOC
>>>>> diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
>>>>> index fed4a2d..e9bd6fb 100644
>>>>> --- a/include/linux/srcutiny.h
>>>>> +++ b/include/linux/srcutiny.h
>>>>> @@ -16,6 +16,7 @@
>>>>>     struct srcu_struct {
>>>>>     	short srcu_lock_nesting[2];	/* srcu_read_lock() nesting depth. */
>>>>>     	unsigned short srcu_idx;	/* Current reader array element in bit 0x2. */
>>>>> +	unsigned short srcu_idx_max;	/* Furthest future srcu_idx request. */
>>>>>     	u8 srcu_gp_running;		/* GP workqueue running? */
>>>>>     	u8 srcu_gp_waiting;		/* GP waiting for readers? */
>>>>>     	struct swait_queue_head srcu_wq;
>>>>> diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
>>>>> index 3bac1db..b405811 100644
>>>>> --- a/kernel/rcu/srcutiny.c
>>>>> +++ b/kernel/rcu/srcutiny.c
>>>>> @@ -34,6 +34,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp)
>>>>>     	ssp->srcu_gp_running = false;
>>>>>     	ssp->srcu_gp_waiting = false;
>>>>>     	ssp->srcu_idx = 0;
>>>>> +	ssp->srcu_idx_max = 0;
>>>>>     	INIT_WORK(&ssp->srcu_work, srcu_drive_gp);
>>>>>     	INIT_LIST_HEAD(&ssp->srcu_work.entry);
>>>>>     	return 0;
>>>>> @@ -114,7 +115,7 @@ void srcu_drive_gp(struct work_struct *wp)
>>>>>     	struct srcu_struct *ssp;
>>>>>     	ssp = container_of(wp, struct srcu_struct, srcu_work);
>>>>> -	if (ssp->srcu_gp_running || !READ_ONCE(ssp->srcu_cb_head))
>>>>> +	if (ssp->srcu_gp_running || USHORT_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
>>>>>     		return; /* Already running or nothing to do. */
>>>>>     	/* Remove recently arrived callbacks and wait for readers. */
>>>>> @@ -147,14 +148,19 @@ void srcu_drive_gp(struct work_struct *wp)
>>>>>     	 * straighten that out.
>>>>>     	 */
>>>>>     	WRITE_ONCE(ssp->srcu_gp_running, false);
>>>>> -	if (READ_ONCE(ssp->srcu_cb_head))
>>>>> +	if (USHORT_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
>>>>
>>>> Should this be USHORT_CMP_LT ?
>>>
>>> I believe that you are correct.  As is, it works but does needless
>>> grace periods.
>>>
>>>>>     		schedule_work(&ssp->srcu_work);
>>>>>     }
>>>>>     EXPORT_SYMBOL_GPL(srcu_drive_gp);
>>>>>     static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
>>>>>     {
>>>>> +	unsigned short cookie;
>>>>> +
>>>>>     	if (!READ_ONCE(ssp->srcu_gp_running)) {
>>>>> +		cookie = get_state_synchronize_srcu(ssp);
>>>>> +		if (USHORT_CMP_LT(READ_ONCE(ssp->srcu_idx_max), cookie))
>>>>> +			WRITE_ONCE(ssp->srcu_idx_max, cookie);
>>>>
>>>> I was thinking of a case which might break with this.
>>>>
>>>> Consider a scenario, where GP is in progress and kworker is right
>>>> before below point, after executing callbacks:
>>>>
>>>> void srcu_drive_gp(struct work_struct *wp) {
>>>>     <snip>
>>>
>>> We updated ->srcu_idx up here, correct?  So it has bottom bit zero.
>>>
>>>>     while (lh) {
>>>>     <cb execution loop>
>>>>     }
>>>>     >>> CURRENT EXECUTION POINT
>>>
>>> Keeping in mind that Tiny SRCU always runs !PREEMPT, this must be
>>> due to an interrupt.
>>>
>> Looking more, issue can happen, even when kworker is waiting for GP
>> completion @
>>
>> swait_event_exclusive(ssp->srcu_wq,
>> !READ_ONCE(ssp->srcu_lock_nesting[idx]));
>>
>> Other process can call call_srcu() and skip srcu_idx_max update, as
>> ssp->srcu_gp_running is true.
> 
> Good point!  Does this mean that additional changes are required,
> or does the fix below cover this situation as well?
> 
> 							Thanx, Paul
> 

I think the current fix covers this. Just wanted to higlight that
the window is not small and a rcutorture test case might be able to 
uncover the issue?


Thanks
Neeraj

>> Thanks
>> Neeraj
>>
>>>>     WRITE_ONCE(ssp->srcu_gp_running, false);
>>>>
>>>>     if (USHORT_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
>>>>       schedule_work(&ssp->srcu_work);
>>>> }
>>>>
>>>> Now, at this instance, srcu_gp_start_if_needed() runs and samples
>>>> srcu_gp_running and returns, without updating srcu_idx_max
>>>>
>>>> static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
>>>> {
>>>>     if (!READ_ONCE(ssp->srcu_gp_running)) returns true
>>>>     <snip>
>>>> }
>>>
>>> This could happen in an interrupt handler, so with you thus far.
>>>
>>>> kworker running srcu_drive_gp() resumes and returns without queueing a new
>>>> schedule_work(&ssp->srcu_work); for new GP?
>>>>
>>>> Prior to this patch, call_srcu() enqueues a cb before entering
>>>> srcu_gp_start_if_needed(), and srcu_drive_gp() observes this
>>>> queuing, and schedule a work for the new GP, for this scenario.
>>>>
>>>>
>>>>    	WRITE_ONCE(ssp->srcu_gp_running, false);
>>>> -	if (READ_ONCE(ssp->srcu_cb_head))
>>>> +	if (USHORT_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
>>>>    		schedule_work(&ssp->srcu_work);
>>>>
>>>> So, should the "cookie" calculation and "srcu_idx_max" setting be moved
>>>> outside of ssp->srcu_gp_running check and maybe return the same cookie to
>>>> caller and use that as the returned cookie from
>>>> start_poll_synchronize_srcu() ?
>>>>
>>>> srcu_gp_start_if_needed()
>>>>     cookie = get_state_synchronize_srcu(ssp);
>>>>     if (USHORT_CMP_LT(READ_ONCE(ssp->srcu_idx_max), cookie))
>>>>        WRITE_ONCE(ssp->srcu_idx_max, cookie);
>>>>     if (!READ_ONCE(ssp->srcu_gp_running)) {
>>>>     <snip>
>>>> }
>>>
>>> I believe that you are quite correct, thank you!
>>>
>>> But rcutorture does have a call_srcu() (really a ->call, but same if SRCU)
>>> in a timer handler.  The race window is quite narrow, so testing it might
>>> be a challenge...
>>>
>>> This is what I end up with:
>>>
>>> 	static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
>>> 	{
>>> 		unsigned short cookie;
>>>
>>> 		cookie = get_state_synchronize_srcu(ssp);
>>> 		if (USHORT_CMP_LT(READ_ONCE(ssp->srcu_idx_max), cookie))
>>> 			WRITE_ONCE(ssp->srcu_idx_max, cookie);
>>> 		if (!READ_ONCE(ssp->srcu_gp_running)) {
>>> 			if (likely(srcu_init_done))
>>> 				schedule_work(&ssp->srcu_work);
>>> 			else if (list_empty(&ssp->srcu_work.entry))
>>> 				list_add(&ssp->srcu_work.entry, &srcu_boot_list);
>>> 		}
>>> 	}
>>>
>>> Does that look plausible?
>>
>> Looks good.
>>
>>>
>>> 							Thanx, Paul
>>>
>>
>> -- 
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of
>> the Code Aurora Forum, hosted by The Linux Foundation

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member of the Code Aurora Forum, hosted by The Linux Foundation

  reply	other threads:[~2020-11-23  4:34 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-17  0:40 [PATCH tip/core/rcu 0/5] Provide SRCU polling grace-period interfaces Paul E. McKenney
2020-11-17  0:40 ` [PATCH RFC tip/core/rcu 1/5] srcu: Make Tiny SRCU use multi-bit grace-period counter paulmck
2020-11-19  8:14   ` Neeraj Upadhyay
2020-11-19 18:00     ` Paul E. McKenney
2020-11-17  0:40 ` [PATCH RFC tip/core/rcu 2/5] srcu: Provide internal interface to start a Tiny SRCU grace period paulmck
2020-11-20 11:36   ` Neeraj Upadhyay
2020-11-17  0:40 ` [PATCH RFC tip/core/rcu 3/5] srcu: Provide internal interface to start a Tree " paulmck
2020-11-20 11:36   ` Neeraj Upadhyay
2020-11-21  0:37     ` Paul E. McKenney
2020-11-17  0:40 ` [PATCH RFC tip/core/rcu 4/5] srcu: Provide polling interfaces for Tiny SRCU grace periods paulmck
2020-11-20 11:58   ` Neeraj Upadhyay
2020-11-21  0:13     ` Paul E. McKenney
2020-11-22 14:27       ` Neeraj Upadhyay
2020-11-22 18:01         ` Paul E. McKenney
2020-11-23  4:34           ` Neeraj Upadhyay [this message]
2020-11-23 21:07             ` Paul E. McKenney
2020-11-17  0:40 ` [PATCH RFC tip/core/rcu 5/5] srcu: Provide polling interfaces for Tree " paulmck
2020-11-20 12:01   ` Neeraj Upadhyay
2020-11-21  0:16     ` Paul E. McKenney
2020-11-22 14:22       ` Neeraj Upadhyay
2020-11-21  0:58 ` [PATCH tip/core/rcu 0/5] Provide SRCU polling grace-period interfaces Paul E. McKenney
2020-11-21  1:05   ` Steven Rostedt
2020-11-21  1:12     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3e2dceb0-5128-28c0-454f-2a60bd5ea4e5@codeaurora.org \
    --to=neeraju@codeaurora.org \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.