All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Parri <parri.andrea@gmail.com>
To: Prateek Sood <prsood@codeaurora.org>,
	peterz@infradead.org, mingo@redhat.com
Cc: sramana@codeaurora.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] osq_lock: fix osq_lock queue corruption
Date: Thu, 10 Aug 2017 06:40:05 +0200	[thread overview]
Message-ID: <20170810044005.GA2907@andrea> (raw)
In-Reply-To: <1501521890-11661-1-git-send-email-prsood@codeaurora.org>

On Mon, Jul 31, 2017 at 10:54:50PM +0530, Prateek Sood wrote:
> Fix ordering of link creation between node->prev and prev->next in
> osq_lock(). A case in which the status of optimistic spin queue is
> CPU6->CPU2 in which CPU6 has acquired the lock.
> 
>         tail
>           v
>   ,-. <- ,-.
>   |6|    |2|
>   `-' -> `-'
> 
> At this point if CPU0 comes in to acquire osq_lock, it will update the
> tail count.
> 
>   CPU2			CPU0
>   ----------------------------------
> 
> 				       tail
> 				         v
> 			  ,-. <- ,-.    ,-.
> 			  |6|    |2|    |0|
> 			  `-' -> `-'    `-'
> 
> After tail count update if CPU2 starts to unqueue itself from
> optimistic spin queue, it will find updated tail count with CPU0 and
> update CPU2 node->next to NULL in osq_wait_next().
> 
>   unqueue-A
> 
> 	       tail
> 	         v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
> 
>   unqueue-B
> 
>   ->tail != curr && !node->next
> 
> If reordering of following stores happen then
> prev->next where prev being CPU2 would be updated to point to CPU0 node:
> 
> 				       tail
> 				         v
> 			  ,-. <- ,-.    ,-.
> 			  |6|    |2|    |0|
> 			  `-' -> `-' -> `-'
> 
>   osq_wait_next()
>     node->next <- 0
>     xchg(node->next, NULL)
> 
> 	       tail
> 	         v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
> 
>   unqueue-C
> 
> At this point if next instruction
> WRITE_ONCE(next->prev, prev);
> in CPU2 path is committed before the update of CPU0 node->prev = prev then
> CPU0 node->prev will point to CPU6 node.
> 
> 	       tail
>     V----------. v
>   ,-. <- ,-.    ,-.
>   |6|    |2|    |0|
>   `-'    `-'    `-'
>      `----------^
> 
> At this point if CPU0 path's node->prev = prev is committed resulting
> in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL
> currently,
> 
> 				       tail
> 			                 v
> 			  ,-. <- ,-. <- ,-.
> 			  |6|    |2|    |0|
> 			  `-'    `-'    `-'
> 			     `----------^
> 
> so if CPU0 gets into unqueue path of osq_lock it will keep spinning
> in infinite loop as condition prev->next == node will never be true.
> 
> Signed-off-by: Prateek Sood <prsood@codeaurora.org>
> ---
>  kernel/locking/osq_lock.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index a316794..9f4afa3 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -109,6 +109,19 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>  
>  	prev = decode_cpu(old);
>  	node->prev = prev;
> +
> +	/*
> +	 * osq_lock()			unqueue
> +	 *
> +	 * node->prev = prev            osq_wait_next()
> +	 * WMB                          MB
> +	 * prev->next = node            next->prev = prev //unqueue-C
> +	 *
> +	 * Here 'node->prev' and 'next->prev' are the same variable and we need
> +	 * to ensure these stores happen in-order to avoid corrupting the list.
> +	 */

The interested pattern/behavior remains somehow implicit in this snippet
(for example, as you described above, a load "reading from" the store to
prev->next is implicit in that osq_wait_next()); however I was unable to
come up with an alternative solution without complicating the comment.

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>

  Andrea


> +	smp_wmb();
> +
>  	WRITE_ONCE(prev->next, node);
>  
>  	/*
> -- 
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
> 

  parent reply	other threads:[~2017-08-10  4:40 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-31 17:24 [PATCH] osq_lock: fix osq_lock queue corruption Prateek Sood
2017-08-07  5:06 ` Prateek Sood
2017-08-10  4:40 ` Andrea Parri [this message]
  -- strict thread matches above, loose matches on Subject: below --
2017-07-14 13:47 Prateek Sood
2017-07-18 11:25 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170810044005.GA2907@andrea \
    --to=parri.andrea@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=prsood@codeaurora.org \
    --cc=sramana@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.