Date: Wed, 30 Nov 2016 04:03:33 -0800
From: "Paul E. McKenney"
To: Guenter Roeck
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, sparclinux@vger.kernel.org, davem@davemloft.net
Subject: Re: next: Commit 'mm: Prevent __alloc_pages_nodemask() RCU CPU stall ...' causing hang on sparc32 qemu
Reply-To: paulmck@linux.vnet.ibm.com
References: <20161129212308.GA12447@roeck-us.net> <20161130012817.GH3924@linux.vnet.ibm.com> <20161130070212.GM3924@linux.vnet.ibm.com> <929f6b29-461a-6e94-fcfd-710c3da789e9@roeck-us.net>
In-Reply-To: <929f6b29-461a-6e94-fcfd-710c3da789e9@roeck-us.net>
Message-Id: <20161130120333.GQ3924@linux.vnet.ibm.com>

On Wed, Nov 30, 2016 at 02:52:11AM -0800, Guenter Roeck wrote:
> On 11/29/2016 11:02 PM, Paul E. McKenney wrote:
> >On Tue, Nov 29, 2016 at 08:32:51PM -0800, Guenter Roeck wrote:
> >>On 11/29/2016 05:28 PM, Paul E. McKenney wrote:
> >>>On Tue, Nov 29, 2016 at 01:23:08PM -0800, Guenter Roeck wrote:
> >>>>Hi Paul,
> >>>>
> >>>>most of my qemu tests for sparc32 targets started to fail in next-20161129.
> >>>>The problem is only seen in SMP builds; non-SMP builds are fine.
> >>>>Bisect points to commit 2d66cccd73436 ("mm: Prevent __alloc_pages_nodemask()
> >>>>RCU CPU stall warnings"); reverting that commit fixes the problem.

And I have dropped this patch.  Michal Hocko showed me the error of
my ways with this patch.

                                                        Thanx, Paul

> >>>>Test scripts are available at:
> >>>>	https://github.com/groeck/linux-build-test/tree/master/rootfs/sparc
> >>>>Test results are at:
> >>>>	https://github.com/groeck/linux-build-test/tree/master/rootfs/sparc
> >>>>
> >>>>Bisect log is attached.
> >>>>
> >>>>Please let me know if there is anything I can do to help tracking down the
> >>>>problem.
> >>>
> >>>Apologies!!!  Does the patch below help?
> >>>
> >>No, sorry, it doesn't make a difference.
> >
> >Interesting...  Could you please send me the build failure messages?
> >
>
> There is no failure message; it just hangs until I abort the qemu session.
>
> http://kerneltests.org/builders/qemu-sparc-next/builds/532/steps/qemubuildcommand/logs/stdio
>
> Guenter
>
> >                                                     Thanx, Paul
> >
> >>Guenter
> >>
> >>>                                                   Thanx, Paul
> >>>
> >>>------------------------------------------------------------------------
> >>>
> >>>commit 97708e737e2a55fed4bdbc005bf05ea909df6b73
> >>>Author: Paul E. McKenney
> >>>Date:   Tue Nov 29 11:06:05 2016 -0800
> >>>
> >>>    rcu: Allow boot-time use of cond_resched_rcu_qs()
> >>>
> >>>    The cond_resched_rcu_qs() macro is used to force RCU quiescent states into
> >>>    long-running in-kernel loops.  However, some of these loops can execute
> >>>    during early boot when interrupts are disabled, and during which time
> >>>    it is therefore illegal to enter the scheduler.  This commit therefore
> >>>    makes cond_resched_rcu_qs() be a no-op during early boot.
> >>>
> >>>    Signed-off-by: Paul E. McKenney
> >>>
> >>>diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> >>>index 525ca34603b7..b6944cc19a07 100644
> >>>--- a/include/linux/rcupdate.h
> >>>+++ b/include/linux/rcupdate.h
> >>>@@ -423,7 +423,7 @@ extern struct srcu_struct tasks_rcu_exit_srcu;
> >>>  */
> >>> #define cond_resched_rcu_qs() \
> >>> do { \
> >>>-	if (!cond_resched()) \
> >>>+	if (!is_idle_task(current) && !cond_resched()) \
> >>> 		rcu_note_voluntary_context_switch(current); \
> >>> } while (0)
> >>>
> >>>diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> >>>index 7232d199a81c..20f5990deeee 100644
> >>>--- a/include/linux/rcutiny.h
> >>>+++ b/include/linux/rcutiny.h
> >>>@@ -228,6 +228,7 @@ static inline void exit_rcu(void)
> >>> extern int rcu_scheduler_active __read_mostly;
> >>> void rcu_scheduler_starting(void);
> >>> #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> >>>+#define rcu_scheduler_active false
> >>> static inline void rcu_scheduler_starting(void)
> >>> {
> >>> }
> >>>
> >>>
> >>
> >
> >
>