From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753072AbcELL55 (ORCPT ); Thu, 12 May 2016 07:57:57 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:50761 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752992AbcELL5y (ORCPT ); Thu, 12 May 2016 07:57:54 -0400 Date: Thu, 12 May 2016 13:57:45 +0200 From: Peter Zijlstra To: Michal Hocko Cc: Tetsuo Handa , LKML , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , "David S. Miller" , Tony Luck , Andrew Morton , Chris Zankel , Max Filippov , Davidlohr Bueso , Waiman Long Subject: [PATCH] locking, rwsem: Fix down_write_killable() Message-ID: <20160512115745.GP3192@twins.programming.kicks-ass.net> References: <20160510123806.GB3193@twins.programming.kicks-ass.net> <20160511072357.GC16677@dhcp22.suse.cz> <20160511082853.GF16677@dhcp22.suse.cz> <20160511084401.GH3193@twins.programming.kicks-ass.net> <20160511090442.GH16677@dhcp22.suse.cz> <20160511091733.GC3192@twins.programming.kicks-ass.net> <20160511093127.GI16677@dhcp22.suse.cz> <20160511094128.GB3190@twins.programming.kicks-ass.net> <20160511135938.GA19577@dhcp22.suse.cz> <20160511180345.GA27728@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160511180345.GA27728@dhcp22.suse.cz> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 11, 2016 at 08:03:46PM +0200, Michal Hocko wrote: > On Wed 11-05-16 15:59:38, Michal Hocko wrote: > > On Wed 11-05-16 11:41:28, Peter Zijlstra wrote: > > > On Wed, May 11, 2016 at 11:31:27AM +0200, Michal Hocko wrote: > > > > > > > Care to cook up a full patch? > > > > > > compile tested only, if someone could please test it? > > > > I have tried to run the test case from Tetsuo[1] with a small printk to > > show the interrupted writer case: > > [ 2753.596678] XXX: Writer interrupted. Woken waiters:0 > > [ 2998.266978] XXX: Writer interrupted. Woken waiters:0 > > > > which means rwsem_atomic_update(-RWSEM_WAITING_BIAS, sem) path which is > > the problematic case. oom_reaper was always able to succeed so I guess > > the patch works as expected. I will leave the test run for longer to be > > sure. > > And just for the reference I am able to reproduce the lockup without the > patch applied and the same test case and a debugging patch > > [ 1522.036379] XXX interrupted. list_is_singular:1 > [ 1523.040462] oom_reaper: unable to reap pid:3736 (tgid=3736) Here the complete patch. Ingo could you make it appear in the right tip branch? --- Subject: locking, rwsem: Fix down_write_killable() From: Peter Zijlstra Date: Wed, 11 May 2016 11:41:28 +0200 The new signal_pending exit path in __rwsem_down_write_failed_common() was fingered as breaking his kernel by Tetsuo Handa. Upon inspection it was found that there are two things wrong with it; - it forgets to remove WAITING_BIAS if it leaves the list empty, or - it forgets to wake further waiters that were blocked on the now removed waiter. Especially the first issue causes new lock attempts to block and stall indefinitely, as the code assumes that pending waiters mean there is an owner that will wake when it releases the lock. Cc: "David S. Miller" Cc: Waiman Long Cc: Chris Zankel Cc: Ingo Molnar Cc: Andrew Morton Cc: Thomas Gleixner Cc: Davidlohr Bueso Cc: Tony Luck Cc: "H. Peter Anvin" Cc: Max Filippov Reported-by: Tetsuo Handa Tested-by: Tetsuo Handa Tested-by: Michal Hocko Signed-off-by: Peter Zijlstra (Intel) --- kernel/locking/rwsem-xadd.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -487,23 +487,32 @@ __rwsem_down_write_failed_common(struct /* Block until there are no active lockers. */ do { - if (signal_pending_state(state, current)) { - raw_spin_lock_irq(&sem->wait_lock); - ret = ERR_PTR(-EINTR); - goto out; - } + if (signal_pending_state(state, current)) + goto out_nolock; + schedule(); set_current_state(state); } while ((count = sem->count) & RWSEM_ACTIVE_MASK); raw_spin_lock_irq(&sem->wait_lock); } -out: __set_current_state(TASK_RUNNING); list_del(&waiter.list); raw_spin_unlock_irq(&sem->wait_lock); return ret; + +out_nolock: + __set_current_state(TASK_RUNNING); + raw_spin_lock_irq(&sem->wait_lock); + list_del(&waiter.list); + if (list_empty(&sem->wait_list)) + rwsem_atomic_update(-RWSEM_WAITING_BIAS, sem); + else + __rwsem_do_wake(sem, RWSEM_WAKE_ANY); + raw_spin_unlock_irq(&sem->wait_lock); + + return ERR_PTR(-EINTR); } __visible struct rw_semaphore * __sched