From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754146AbbCFOfb (ORCPT <rfc822;w@1wt.eu>);
	Fri, 6 Mar 2015 09:35:31 -0500
Received: from mail.efficios.com ([78.47.125.74]:58276 "EHLO mail.efficios.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751437AbbCFOfa (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Fri, 6 Mar 2015 09:35:30 -0500
Date: Fri, 6 Mar 2015 14:35:23 +0000 (UTC)
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Huang Ying <ying.huang@intel.com>,
        Lai Jiangshan <laijs@cn.fujitsu.com>,
        Lai Jiangshan <eag0628@gmail.com>,
        Peter Zijlstra <peterz@infradead.org>,
        LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Steven Rostedt <rostedt@goodmis.org>
Message-ID: <402283060.238174.1425652523685.JavaMail.zimbra@efficios.com>
In-Reply-To: <CA+55aFzn+ikDZ=Zi6EegH0ji6Wc1o-k6f1DmPGjzaJ+K3udc5g@mail.gmail.com>
References: <995381344.227770.1425597484864.JavaMail.zimbra@efficios.com> <950470583.228004.1425599331454.JavaMail.zimbra@efficios.com> <CA+55aFzn+ikDZ=Zi6EegH0ji6Wc1o-k6f1DmPGjzaJ+K3udc5g@mail.gmail.com>
Subject: Re: Possible lock-less list race in scheduler_ipi()
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [173.246.22.116]
X-Mailer: Zimbra 8.0.7_GA_6021 (ZimbraWebClient - FF36 (Linux)/8.0.7_GA_6021)
Thread-Topic: Possible lock-less list race in scheduler_ipi()
Thread-Index: guifqV1y0peTUmWN8XR0DQeCKHZdhg==
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
> From: "Linus Torvalds" <torvalds@linux-foundation.org>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, "Huang Ying" <ying.huang@intel.com>, "Lai Jiangshan"
> <laijs@cn.fujitsu.com>, "Lai Jiangshan" <eag0628@gmail.com>, "Peter Zijlstra" <peterz@infradead.org>, "LKML"
> <linux-kernel@vger.kernel.org>, "Ingo Molnar" <mingo@kernel.org>, "Steven Rostedt" <rostedt@goodmis.org>
> Sent: Thursday, March 5, 2015 8:02:06 PM
> Subject: Re: Possible lock-less list race in scheduler_ipi()
> 
> On Thu, Mar 5, 2015 at 3:48 PM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > llist_next() is pretty simple:
> >
> > static inline struct llist_node *llist_next(struct llist_node *node)
> > {
> >         return node->next;
> > }
> >
> > It is so simple that I wonder if the compiler would be
> > within its rights to reorder the load of node->next
> > after some operations within ttwu_do_activate(), thus
> > causing corruption of this linked-list due to a
> > concurrent try_to_wake_up() performed by another core.
> >
> > Am I too paranoid about the possible compiler mishaps
> > there, or are my concerns justified ?
> 
> I *think* you are too paranoid, because that would be a major compiler
> bug anyway - gcc cannot reorder the load against anything that might
> be changing the value.  Which obviously includes calling non-inlined
> functions.
> 
> At least the code generation I see doesn't seem to say that gcc gets this
> wrong:
> 
>         ...
>         leaq    -32(%rbx), %rsi #, p
>         movq    (%rbx), %rbx    # MEM[(struct llist_node
> *)__mptr_19].next, __mptr
>         movq    %r12, %rdi      # tcp_ptr__,
>         call    ttwu_do_activate.constprop.85   #
>         ...
> 
> that "movq (%rbx), %rbx" is the "llist = llist_next(llist);" thing.

Indeed, the compiler should never reorder loads and stores to the
same memory location with respect to program order. What I had in mind
is a bit more far-fetched though: the compiler would have to reorder
this load after a store to a different memory location, which would
in turn allow another execution context (interrupt or thread) to
corrupt the list.

The assembly snippet you show above appears to be OK. However, another
compiler may choose to inline ttwu_do_activate(), leaving room for
more aggressive optimizations.
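
To make the scenario concrete, here is a minimal userspace sketch, not
the actual scheduler code. The struct, LOAD_ONCE_PTR(),
compiler_barrier() and activate() are hypothetical names used only for
illustration, and the barrier assumes a GCC-compatible compiler. It
shows one way to pin the load of ->next before the node is handed off,
so that even an inlined callee could not have its stores hoisted above
that load:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list node; not the kernel's llist_node. */
struct node {
	struct node *next;
	int woken;
};

/* Volatile access: the compiler must perform this load exactly once. */
#define LOAD_ONCE_PTR(p) (*(struct node * volatile *)&(p))

/* Compiler-only barrier (assumes GCC or Clang); emits no instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Stand-in for the activation step. Once a node is handed off, another
 * context may re-queue it; that is modeled here by clobbering ->next.
 */
static void activate(struct node *n)
{
	n->woken = 1;
	n->next = NULL;
}

/* Walk a detached list, fetching ->next before handing the node off. */
static int walk_and_activate(struct node *head)
{
	int count = 0;

	while (head) {
		struct node *cur = head;

		head = LOAD_ONCE_PTR(cur->next);
		compiler_barrier();	/* keep later stores after the load */
		activate(cur);
		count++;
	}
	return count;
}

/* Builds a three-node list a -> b -> c and walks it. */
static int demo(void)
{
	struct node c = { NULL, 0 };
	struct node b = { &c, 0 };
	struct node a = { &b, 0 };
	int n = walk_and_activate(&a);

	assert(a.woken && b.woken && c.woken);
	return n;
}
```

If the load of ->next were performed after activate() in this model,
the walk would stop after the first node. Whether the real code needs
anything like this is exactly the open question; the sketch only shows
the ordering I am worried about.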

But I agree that it is rather far-fetched.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com