From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754183AbbBATpG (ORCPT ); Sun, 1 Feb 2015 14:45:06 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46241 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753099AbbBATpC (ORCPT ); Sun, 1 Feb 2015 14:45:02 -0500 Date: Sun, 1 Feb 2015 20:43:06 +0100 From: Oleg Nesterov To: Peter Zijlstra Cc: Linus Torvalds , "Rafael J. Wysocki" , Ingo Molnar , Peter Hurley , Davidlohr Bueso , Bruno =?iso-8859-1?Q?Pr=E9mont?= , Linux Kernel Mailing List , Thomas Gleixner , Ilya Dryomov , Mike Galbraith Subject: Re: Linux 3.19-rc5 Message-ID: <20150201194306.GA29993@redhat.com> References: <1421878320.4903.17.camel@stgolabs.net> <54C02E08.4080405@hurleysoftware.com> <1861286.x5DC37NGWz@vostro.rjw.lan> <20150130154742.GA10547@redhat.com> <20150131201601.GZ2896@worktop.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150131201601.GZ2896@worktop.programming.kicks-ass.net> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/31, Peter Zijlstra wrote: > > On Sat, Jan 31, 2015 at 10:32:23AM -0800, Linus Torvalds wrote: > > On Fri, Jan 30, 2015 at 7:47 AM, Oleg Nesterov wrote: > > > > > > Perhaps sched_annotate_sleep() shouldn't depend on CONFIG_DEBUG_ATOMIC_SLEEP > > > too... > > > > Ugh. That thing is horrible. The naming doesn't make it obvious at all > > that it's actually making sure that we have state set to TASK_RUNNING, > > and I could easily imagine that it would cause similar "busy-loops > > while scheduling" issues if anybody ever uses it in the wrong context. > > The alternative was putting unconditional __set_task_state(TASK_RUNNING) > things in a few code paths. I didn't want to cause the extra code in > case we didn't need them. Particularly the include/net/sock.h one, as I > know the network people are cycle counters. And personally I agree. sched_annotate_sleep() looks self-documented, it is clear that it is used to suppress the warning. Still. Can't we avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds? Oleg. --- x/include/linux/kernel.h +++ x/include/linux/kernel.h @@ -176,7 +176,7 @@ extern int _cond_resched(void); */ # define might_sleep() \ do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0) -# define sched_annotate_sleep() __set_current_state(TASK_RUNNING) +# define sched_annotate_sleep() (current->task_state_change = 0) #else static inline void ___might_sleep(const char *file, int line, int preempt_offset) { } --- x/kernel/sched/core.c +++ x/kernel/sched/core.c @@ -7292,7 +7292,7 @@ void __might_sleep(const char *file, int * since we will exit with TASK_RUNNING make sure we enter with it, * otherwise we will destroy state. */ - if (WARN_ONCE(current->state != TASK_RUNNING, + if (WARN_ONCE(current->state != TASK_RUNNING && current->task_state_change, "do not call blocking ops when !TASK_RUNNING; " "state=%lx set at [<%p>] %pS\n", current->state,