From: Mathieu Desnoyers
To: Linus Torvalds
Cc: "Paul E. McKenney", Huang Ying, Lai Jiangshan, Lai Jiangshan, Peter Zijlstra, LKML, Ingo Molnar, Steven Rostedt
Date: Fri, 6 Mar 2015 14:35:23 +0000 (UTC)
Subject: Re: Possible lock-less list race in scheduler_ipi()
Message-ID: <402283060.238174.1425652523685.JavaMail.zimbra@efficios.com>
References: <995381344.227770.1425597484864.JavaMail.zimbra@efficios.com> <950470583.228004.1425599331454.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
> From: "Linus Torvalds"
> To: "Mathieu Desnoyers"
> Cc: "Paul E. McKenney", "Huang Ying", "Lai Jiangshan", "Lai Jiangshan", "Peter Zijlstra", "LKML", "Ingo Molnar", "Steven Rostedt"
> Sent: Thursday, March 5, 2015 8:02:06 PM
> Subject: Re: Possible lock-less list race in scheduler_ipi()
>
> On Thu, Mar 5, 2015 at 3:48 PM, Mathieu Desnoyers wrote:
> >
> > llist_next() is pretty simple:
> >
> > static inline struct llist_node *llist_next(struct llist_node *node)
> > {
> >         return node->next;
> > }
> >
> > It is so simple that I wonder if the compiler would be
> > within its rights to reorder the load of node->next
> > after some operations within ttwu_do_activate(), thus
> > causing corruption of this linked-list due to a
> > concurrent try_to_wake_up() performed by another core.
> >
> > Am I too paranoid about the possible compiler mishaps
> > there, or are my concerns justified ?
>
> I *think* you are too paranoid, because that would be a major compiler
> bug anyway - gcc cannot reorder the load against anything that might
> be changing the value. Which obviously includes calling non-inlined
> functions.
>
> At least the code generation I see doesn't seem to say that gcc gets this
> wrong:
>
>         ...
>         leaq    -32(%rbx), %rsi #, p
>         movq    (%rbx), %rbx    # MEM[(struct llist_node *)__mptr_19].next, __mptr
>         movq    %r12, %rdi      # tcp_ptr__,
>         call    ttwu_do_activate.constprop.85   #
>         ...
>
> that "movq (%rbx), %rbx" is the "llist = llist_next(llist);" thing.

Indeed, the compiler should never reorder loads/stores from/to the
same memory location from a program order POV. What I had in mind
is a bit more far-fetched though: it would involve having the
compiler reorder this load after a store to another memory location,
which would in turn allow another execution context (interrupt or
thread) to corrupt the list.

The assembly snippet you show above appears to be OK. However, another
compiler may choose to inline ttwu_do_activate(), leaving room for more
aggressive optimizations. But I agree that it is rather far-fetched.
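
To make the concern concrete, here is a minimal userspace sketch (not
kernel code) of one way the ordering could be pinned down explicitly if
one did not want to rely on the callee staying out of line. Everything
in it is an assumption made for illustration: LOAD_ONCE(),
compiler_barrier(), struct task, activate(), drain() and demo_drain()
are hypothetical names, activate() merely stands in for
ttwu_do_activate(), and the macros use GCC/Clang extensions. It only
constrains the compiler; it says nothing about CPU-level reordering.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct llist_node. */
struct llist_node {
	struct llist_node *next;
};

/*
 * Hypothetical helper: a volatile load, similar in spirit to the
 * kernel's ACCESS_ONCE(). The compiler must emit exactly one load here.
 */
#define LOAD_ONCE(x)		(*(volatile __typeof__(x) *)&(x))

/* Compiler-only barrier (GCC/Clang syntax); emits no instruction. */
#define compiler_barrier()	__asm__ __volatile__("" ::: "memory")

struct task {
	struct llist_node wake_entry;
	int activated;
};

/*
 * Stand-in for ttwu_do_activate(). Once the task is activated, another
 * context may re-queue it; that is simulated here by clobbering the link.
 */
static void activate(struct task *t)
{
	t->activated = 1;
	t->wake_entry.next = NULL;
}

/* Walk the list, fetching ->next before the node can be reused. */
static int drain(struct llist_node *list)
{
	int handled = 0;

	while (list) {
		struct task *t = (struct task *)
			((char *)list - offsetof(struct task, wake_entry));
		/* Load the successor first... */
		struct llist_node *next = LOAD_ONCE(list->next);

		/* ...and forbid the compiler from sinking that load below. */
		compiler_barrier();
		activate(t);

		list = next;
		handled++;
	}
	return handled;
}

/* Build a three-entry list, drain it, and report how many were handled. */
static int demo_drain(void)
{
	struct task tasks[3] = { { { NULL }, 0 }, { { NULL }, 0 }, { { NULL }, 0 } };
	int i, handled;

	tasks[0].wake_entry.next = &tasks[1].wake_entry;
	tasks[1].wake_entry.next = &tasks[2].wake_entry;

	handled = drain(&tasks[0].wake_entry);
	for (i = 0; i < 3; i++)
		assert(tasks[i].activated == 1);
	return handled;
}
```

Calling demo_drain() from a main() walks all three entries even though
activate() clobbers each link. If the load of ->next were moved after
activate(), the walk would stop after the first entry.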

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com