From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754025AbbCJRbK (ORCPT );
	Tue, 10 Mar 2015 13:31:10 -0400
Received: from mx1.redhat.com ([209.132.183.28]:51880 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753115AbbCJRbH (ORCPT );
	Tue, 10 Mar 2015 13:31:07 -0400
Date: Tue, 10 Mar 2015 18:28:16 +0100
From: Oleg Nesterov
To: Linus Torvalds
Cc: Peter Zijlstra,
	"linux-tip-commits@vger.kernel.org",
	Davidlohr Bueso, Peter Anvin, Sasha Levin, Thomas Gleixner,
	Linux Kernel Mailing List, Jason Low, Michel Lespinasse,
	Andrew Morton, Ingo Molnar, Paul McKenney, Dave Jones,
	Ming Lei, Tim Chen, Kirill Tkhai
Subject: Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
Message-ID: <20150310172816.GA9058@redhat.com>
References: <1425714331.2475.388.camel@j-VirtualBox>
	<20150307171347.GA30365@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/10, Linus Torvalds wrote:
>
> On Sat, Mar 7, 2015 at 9:13 AM, Oleg Nesterov wrote:
> >> +    /*
> >> +     * Ensure we emit the owner->on_cpu, dereference _after_
> >> +     * checking sem->owner still matches owner, if that fails,
> >> +     * owner might point to free()d memory, if it still matches,
> >> +     * the rcu_read_lock() ensures the memory stays valid.
> >              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Yes, this is another case when we wrongly assume this.
> >
> > Peter, should I resend
> >
> >     [PATCH 3/3] introduce task_rcu_dereference()
> >     http://marc.info/?l=linux-kernel&m=141443631413914
> >
> > ? or should we add another call_rcu() in finish_task_switch() (like -rt does)
> > to make this true?
>
> I think we should just make 'task_struct_cachep' have SLAB_DESTROY_BY_RCU.

This is what I initially suggested too, but then I tried to argue against
it. It seems I have lost that argument if you too prefer SLAB_DESTROY_BY_RCU.

Yes, SLAB_DESTROY_BY_RCU will work in this case, because we recheck ->owner
in a loop and because task->on_cpu is just a word we can safely read.

But this won't fix other problems we might have. For example, suppose that
we need get_task_struct(owner) in this code; that won't work.

Or, as Kirill pointed out, let's look at "tsk = ACCESS_ONCE(cpu_rq(cpu)->curr)"
in task_numa_group(). Even if this is "fixed" by SLAB_DESTROY_BY_RCU, this
code won't be correct anyway, even though (I think) it will be safe to
dereference ->numa_group as well.

But OK, I won't argue.

Oleg.
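
To make the "recheck ->owner in a loop" argument concrete, here is a minimal
userspace sketch, not kernel code. It assumes that SLAB_DESTROY_BY_RCU can be
modelled as a pool whose slots are recycled but never unmapped. All names
(struct task, struct sem, pool, task_alloc(), task_free(), spin_step()) are
hypothetical and exist only for this illustration. It shows a stale owner
pointer staying readable while the recheck against sem->owner rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct task {
	int pid;
	int on_cpu;	/* a single word, readable even after the slot is recycled */
	bool in_use;
};

struct sem {
	struct task *owner;
};

#define POOL_SIZE 4

/*
 * Type-stable pool: a freed slot may be reused at once for another task,
 * but the storage never goes away, so a stale pointer still points at
 * *some* struct task. This is the assumption standing in for
 * SLAB_DESTROY_BY_RCU under rcu_read_lock().
 */
static struct task pool[POOL_SIZE];

static struct task *task_alloc(int pid)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i] = (struct task){ .pid = pid, .on_cpu = 1, .in_use = true };
			return &pool[i];
		}
	}
	return NULL;	/* pool exhausted */
}

static void task_free(struct task *t)
{
	assert(t >= pool && t < pool + POOL_SIZE);
	t->on_cpu = 0;
	t->in_use = false;
}

/*
 * One iteration of the spin loop; true means "keep spinning".
 * Reading owner->on_cpu is only trusted while sem->owner still matches.
 */
static bool spin_step(const struct sem *sem, const struct task *owner)
{
	if (sem->owner != owner)
		return false;	/* owner changed: the cached pointer is stale */
	return owner->on_cpu != 0;
}
```

If a task is allocated, installed as sem->owner, then released and freed, the
next task_alloc() in this model hands back the same slot. A spinner holding
the old pointer can still read ->on_cpu safely, and spin_step() returns false
because sem->owner no longer matches. The same pointer now names a different
task, though, which is how I read the get_task_struct(owner) objection: taking
a reference through it would pin whatever task happens to occupy the slot.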