From mboxrd@z Thu Jan 1 00:00:00 1970
From: Guenter Roeck
Subject: Re: linux-next: tracebacks in workqueue.c/__flush_work()
Date: Wed, 6 Feb 2019 08:23:59 -0800
Message-ID: <20190206162359.GA30699@roeck-us.net>
References: <72e7d782-85f2-b499-8614-9e3498106569@i-love.sakura.ne.jp> <87munc306z.fsf@rustcorp.com.au> <201902060631.x166V9J8014750@www262.sakura.ne.jp> <20190206143625.GA25998@roeck-us.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Tetsuo Handa
Cc: Rusty Russell, Chris Metcalf, linux-kernel, Tejun Heo, linux-mm, linux-arch
List-Id: linux-arch.vger.kernel.org

On Wed, Feb 06, 2019 at 11:57:45PM +0900, Tetsuo Handa wrote:
> On 2019/02/06 23:36, Guenter Roeck wrote:
> > On Wed, Feb 06, 2019 at 03:31:09PM +0900, Tetsuo Handa wrote:
> >> (Adding linux-arch ML.)
> >>
> >> Rusty Russell wrote:
> >>> Tetsuo Handa writes:
> >>>> (Adding Chris Metcalf and Rusty Russell.)
> >>>>
> >>>> If NR_CPUS == 1 due to CONFIG_SMP=n, for_each_cpu(cpu, &has_work) loop does not
> >>>> evaluate "struct cpumask has_work" modified by cpumask_set_cpu(cpu, &has_work) at
> >>>> previous for_each_online_cpu() loop. Guenter Roeck found a problem among three
> >>>> commits listed below.
> >>>>
> >>>> Commit 5fbc461636c32efd ("mm: make lru_add_drain_all() selective")
> >>>> expects that has_work is evaluated by for_each_cpu().
> >>>>
> >>>> Commit 2d3854a37e8b767a ("cpumask: introduce new API, without changing anything")
> >>>> assumes that for_each_cpu() does not need to evaluate has_work.
> >>>>
> >>>> Commit 4d43d395fed12463 ("workqueue: Try to catch flush_work() without INIT_WORK().")
> >>>> expects that has_work is evaluated by for_each_cpu().
> >>>>
> >>>> What should we do? Do we explicitly evaluate has_work if NR_CPUS == 1 ?
> >>>
> >>> No, fix the API to be least-surprise. Fix 2d3854a37e8b767a too.
> >>>
> >>> Doing anything else would be horrible, IMHO.
> >>>
> >>
> >> Fixing 2d3854a37e8b767a might involve subtle changes. If we do
> >>
> >
> > Why not fix the macros ?
> >
> > #define for_each_cpu(cpu, mask) \
> > 	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
> >
> > does not really make sense since it does not evaluate mask.
> >
> > #define for_each_cpu(cpu, mask) \
> > 	for ((cpu) = 0; (cpu) < 1 && cpumask_test_cpu((cpu), (mask)); (cpu)++)
> >
> > or something similar might do it.
>
> Fixing macros is fine, The problem is that "mask" becomes evaluated
> which might be currently undefined or unassigned if CONFIG_SMP=n.
> Evaluating "mask" generates expected behavior for lru_add_drain_all()
> case. But there might be cases where evaluating "mask" generate
> unexpected behavior/results.

Interesting notion. I would have assumed that passing a parameter
to a function or macro implies that this parameter may be used.

This makes me wonder - what is the point of ", (mask)" in the current
macros ? It doesn't make sense to me.

Anyway, I agree that fixing the macro might result in some failures.
However, I would argue that those failures would actually be bugs,
hidden by the buggy macros. But of course that is just my opinion.

Guenter
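
The difference between the two macro shapes quoted above can be reproduced in userspace. The following is a minimal sketch, not kernel code: `struct up_cpumask`, `up_cpumask_set_cpu()`, `up_cpumask_test_cpu()`, and the `visits_*()` helpers are hypothetical stand-ins invented for illustration, and the two macros are renamed copies of the ones in the thread. It only shows that the first shape runs the loop body once even when the mask is empty, while the second skips the body unless CPU 0 is set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct cpumask on a CONFIG_SMP=n build. */
struct up_cpumask {
	unsigned long bits;
};

/* Hypothetical equivalent of cpumask_set_cpu(): mark one CPU in the mask. */
static inline void up_cpumask_set_cpu(int cpu, struct up_cpumask *m)
{
	m->bits |= 1UL << cpu;
}

/* Hypothetical equivalent of cpumask_test_cpu(): is this CPU in the mask? */
static inline bool up_cpumask_test_cpu(int cpu, const struct up_cpumask *m)
{
	return (m->bits >> cpu) & 1UL;
}

/* Shape of the first macro quoted in the thread: mask is never read. */
#define for_each_cpu_current(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* Shape of the suggested replacement: the body runs only if CPU 0 is set. */
#define for_each_cpu_checked(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1 && up_cpumask_test_cpu((cpu), (mask)); (cpu)++)

/* Count how many times the loop body runs with the mask-ignoring macro. */
int visits_current(const struct up_cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_current(cpu, mask)
		n++;
	return n;
}

/* Count how many times the loop body runs with the mask-checking macro. */
int visits_checked(const struct up_cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_checked(cpu, mask)
		n++;
	return n;
}
```

With an empty mask, `visits_current()` still reports one iteration and `visits_checked()` reports none; with CPU 0 set, both report one. That matches the case described in the thread, where a for_each_cpu(cpu, &has_work) loop runs its body on a uniprocessor build even though nothing was set in has_work.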