Date: Thu, 1 Sep 2016 21:08:58 +0200
From: Peter Zijlstra
To: Oleg Nesterov
Cc: Ingo Molnar, Al Viro, Bart Van Assche, Johannes Weiner, Linus Torvalds, Neil Brown, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] sched/wait: avoid abort_exclusive_wait() in __wait_on_bit_lock()
Message-ID: <20160901190858.GI10168@twins.programming.kicks-ass.net>
References: <20160826124453.GA28894@redhat.com> <20160826124552.GB28904@redhat.com> <20160901190141.GJ10138@twins.programming.kicks-ass.net>
In-Reply-To: <20160901190141.GJ10138@twins.programming.kicks-ass.net>

On Thu, Sep 01, 2016 at 09:01:41PM +0200, Peter Zijlstra wrote:
> > test_and_set_bit() implies mb() so
> > the lockless list_empty_careful() case is fine, we can not miss the
> > condition if we race with unlock_page().
>
> You're talking about this ordering?:
>
>	finish_wait()			clear_bit_unlock();
>	  list_empty_careful()
>
>	/* MB implied */		smp_mb__after_atomic();
>	test_and_set_bit()		wake_up_page()
>					  ...
>					    autoremove_wake_function()
>					      list_del_init();
>
>
> That could do with spelling out I feel.. :-)

This ^^^

> > I am not sure we even want to conditionalize both finish_wait()'s,
> > we could simply call it unconditionally and once before test_and_set(),
> > the spurious wakeup is unlikely case.
>
>
>	ret = 0;
>
>	for (;;) {
>		prepare_to_wait_exclusive(wq, &q->wait, mode);
>
>		if (test_bit(&q->key.bit_nr, &q->key.flag))
>			ret = action(&q->key, mode);
>
>		if (!test_and_set_bit(&q->key.bit_nr, &q->key.flag)) {
>			/* we got the lock anyway, ignore the signal */
>			ret = 0;
>			break;
>		}
>
>		if (ret)
>			break;
>	}
>	finish_wait(wq, &q->wait);
>
>	return ret;
>
>
> Would not that work too?

Nope, because we need to do that finish_wait() before
test_and_set_bit()..

Also the problem with doing finish_wait() unconditionally would be
destroying the FIFO order. With a bit of bad luck you'd get starvation
cases :/
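
To make the FIFO point concrete, here is a toy, single-threaded model in
plain C. It is not kernel code: every toy_* name is made up for
illustration, the queue is a small array rather than a locked list, and
the only behaviour it assumes is that prepare_to_wait_exclusive() queues
a waiter at the tail when it is not already queued. Waiters A, B and C
queue up in order, then A takes one spurious wakeup while the bit is
still set and goes around the loop again.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_WAITERS 8

/* Toy stand-in for a wait queue: waiter ids in FIFO order, head at index 0. */
struct toy_queue {
	char ids[TOY_MAX_WAITERS];
	int len;
};

/* Return the position of a waiter in the queue, or -1 if it is not queued. */
static int toy_find(const struct toy_queue *q, char id)
{
	for (int i = 0; i < q->len; i++)
		if (q->ids[i] == id)
			return i;
	return -1;
}

/* Models prepare_to_wait_exclusive(): add at the tail only if not queued yet. */
static void toy_prepare_to_wait_exclusive(struct toy_queue *q, char id)
{
	if (toy_find(q, id) >= 0)
		return;
	assert(q->len < TOY_MAX_WAITERS);
	q->ids[q->len++] = id;
}

/* Models finish_wait(): remove the waiter if it is still queued. */
static void toy_finish_wait(struct toy_queue *q, char id)
{
	int i = toy_find(q, id);

	if (i < 0)
		return;
	memmove(&q->ids[i], &q->ids[i + 1], (size_t)(q->len - i - 1));
	q->len--;
}

/*
 * Queue A, B, C, then let A take one spurious wakeup with the bit still set.
 * The resulting queue order is written to out as a NUL-terminated string.
 */
void toy_spurious_wakeup(int unconditional_finish_wait,
			 char out[TOY_MAX_WAITERS + 1])
{
	struct toy_queue q = { .len = 0 };

	toy_prepare_to_wait_exclusive(&q, 'A');
	toy_prepare_to_wait_exclusive(&q, 'B');
	toy_prepare_to_wait_exclusive(&q, 'C');

	if (unconditional_finish_wait)
		toy_finish_wait(&q, 'A');	/* A gives up its slot */
	toy_prepare_to_wait_exclusive(&q, 'A');	/* next loop iteration */

	memcpy(out, q.ids, (size_t)q.len);
	out[q.len] = '\0';
}
```

With unconditional_finish_wait == 0, A never left the queue and the
order stays A, B, C. With unconditional_finish_wait == 1, A is requeued
behind B and C; a waiter that keeps doing this can in principle keep
losing its turn, which is the starvation case.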