From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752250AbcHHQUs (ORCPT ); Mon, 8 Aug 2016 12:20:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35320 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752123AbcHHQUr (ORCPT ); Mon, 8 Aug 2016 12:20:47 -0400 Date: Mon, 8 Aug 2016 18:20:38 +0200 From: Oleg Nesterov To: Bart Van Assche Cc: Peter Zijlstra , "mingo@kernel.org" , Andrew Morton , Johannes Weiner , Neil Brown , Michael Shaver , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs Message-ID: <20160808162038.GA25927@redhat.com> References: <20160803181128.GH6879@twins.programming.kicks-ass.net> <11007730-3fa5-139a-8091-655743894ae8@sandisk.com> <20160803213006.GA11712@redhat.com> <17b65ff9-215f-ab74-9f5f-15dbd308d054@sandisk.com> <20160804140938.GB24652@twins.programming.kicks-ass.net> <16207b90-2e6c-fe23-1b4b-3763e5cf0384@sandisk.com> <20160808102213.GA6879@twins.programming.kicks-ass.net> <4091e252-18d9-1795-de63-9fbc678aa6b1@acm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4091e252-18d9-1795-de63-9fbc678aa6b1@acm.org> User-Agent: Mutt/1.5.18 (2008-05-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 08 Aug 2016 16:20:46 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 08/08, Bart Van Assche wrote: > > This is the sequence of which I think that it leads to the missed wakeup: > > Task 1 Task 2 Task 3 Task 4 > > lock_page() > ... > lock_page_killable() > __lock_page_killable() > __wait_on_bit_lock() > bit_wait_io() > io_schedule() > ... > lock_page() > __lock_page() > __wait_on_bit_lock() > bit_wait_io() > io_schedule() > ... > > > (signal delivery to task 2) > try_to_wake_up(task2, ..., ...) > (try_to_wake_up() returns 1) > > unlock_page() > wake_up_page() > __wake_up_bit() > __wake_up(wq, TASK_NORMAL, 1, &key) > __wake_up_common(wq, mode=TASK_NORMAL, nr_exclusive=1, 0, key) > wake_bit_function() > autoremove_wake_function() > default_wake_function() > try_to_wake_up() <- skips task 2 because task 3 already changed > the task state of task 2 > (autoremove_wake_function() does not do > list_del_init(&wait->task_list)) Yes. But since it skips task2, __wake_up_common() doesn't decrement nr_exclusive, doesn't stop. It continues the list_for_each_entry_safe() loop, and finds the sleeping task4, and wakes it up, > bit_wait_io() returns -EINTR > abort_exclusive_wait() is called by __wait_on_bit_lock() > > > In the above sequence task 1 does not remove task 2 from the waitqueue > because task 3 had already woken up task 2. The result is that when task 2 > calls abort_exclusive_wait() that task 2 is still on the waitqueue. Yes, but this is fine, > With the > current implementation of abort_exclusive_wait() in the above scenario task > 4 is not woken up although it should be woken up. See above, it must be already woken by __wake_up_common(). So far _I think_ that the bug is somewhere else... Say, someone clears PG_locked without wake_up(). Then SIGKILL sent to the task sleeping in sys_read() "adds" the necessary wakeup... Do you use external modules during the testing? Oleg.