Date: Thu, 19 May 2016 21:05:27 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Santosh Shilimkar
Cc: linux-kernel@vger.kernel.org, Sasha Levin
Subject: Re: [rcu_sched stall] regression/miss-config ?
Message-ID: <20160520040527.GZ3528@linux.vnet.ibm.com>
In-Reply-To: <140d3624-7167-3e72-3ef2-a4da47ce986c@oracle.com>
References: <20160516120329.GB3528@linux.vnet.ibm.com>
 <3d5a2847-86d2-cd15-a7e8-8f4b2ee5a64d@oracle.com>
 <20160516173401.GG3528@linux.vnet.ibm.com>
 <67eb4bf6-c3d2-b9af-30ff-713a6d75e773@oracle.com>
 <20160517005820.GI3528@linux.vnet.ibm.com>
 <20160517191529.GK3528@linux.vnet.ibm.com>
 <140d3624-7167-3e72-3ef2-a4da47ce986c@oracle.com>

On Thu, May 19, 2016 at 04:45:26PM -0700, Santosh Shilimkar wrote:
> Hi Paul,
>
> On 5/17/2016 12:15 PM, Paul E. McKenney wrote:
> >On Tue, May 17, 2016 at 06:46:22AM -0700, santosh.shilimkar@oracle.com wrote:
> >>On 5/16/16 5:58 PM, Paul E. McKenney wrote:
> >>>On Mon, May 16, 2016 at 12:49:41PM -0700, Santosh Shilimkar wrote:
> >>>>On 5/16/2016 10:34 AM, Paul E. McKenney wrote:
> >>>>>On Mon, May 16, 2016 at 09:33:57AM -0700, Santosh Shilimkar wrote:
> >>
> >>[...]
> >>
> >>>>>Are you running CONFIG_NO_HZ_FULL=y?  If so, the problem might be that
> >>>>>you need more housekeeping CPUs than you currently have configured.
> >>>>>
> >>>>Yes, CONFIG_NO_HZ_FULL=y.  Do you mean "CONFIG_NO_HZ_FULL_ALL=y" for
> >>>>housekeeping?  It seems that without it, the clock-event code will just
> >>>>use CPU0 for things like broadcasting, which might become a bottleneck.
> >>>>This could explain the hrtimer_interrupt() path getting slowed down
> >>>>because of the housekeeping bottleneck.
> >>>>
> >>>>$ cat .config | grep NO_HZ
> >>>>CONFIG_NO_HZ_COMMON=y
> >>>># CONFIG_NO_HZ_IDLE is not set
> >>>>CONFIG_NO_HZ_FULL=y
> >>>># CONFIG_NO_HZ_FULL_ALL is not set
> >>>># CONFIG_NO_HZ_FULL_SYSIDLE is not set
> >>>>CONFIG_NO_HZ=y
> >>>># CONFIG_RCU_FAST_NO_HZ is not set
> >>>
> >>>Yes, CONFIG_NO_HZ_FULL_ALL=y would give you only one CPU for all
> >>>housekeeping tasks, including the RCU grace-period kthreads.  So you are
> >>>booting without any nohz_full boot parameter?  You can end up with the
> >>>same problem with CONFIG_NO_HZ_FULL=y and the nohz_full boot parameter
> >>>as you can with CONFIG_NO_HZ_FULL_ALL=y.
> >>>
> >>I see.  Yes, the systems are booting without the nohz_full boot parameter.
> >>I will try to add more housekeeping CPUs and update the thread after
> >>verification, since it takes time to reproduce the issue.
> >>
> >>Thanks for the discussion so far, Paul.  It has been very insightful for me.
> >
> >Please let me know how things go with further testing, especially with
> >the priority setting.
> >
> Sorry for the delay.  I managed to get information about the XEN use case's
> custom config as discussed above.  To reduce variables, I disabled
> CONFIG_NO_HZ_FULL altogether.
> So the effective setting was:
>
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_TREE_RCU_TRACE=y
> CONFIG_RCU_KTHREAD_PRIO=1
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_TRACE=y
>
> Unfortunately, the XEN test still failed; the log is at the end of this
> email.  This test is a bit peculiar, though, since it is a database
> running in a VM with 1 or 2 CPUs.  One suspicion is that because the
> database's RT processes are hogging the CPU(s), the kernel's RCU kthread
> is not getting a chance to run, which eventually results in a stall.
> Does that make sense?
>
> Please note that it is a non-preempt kernel running RT processes. ;-)

If you have enough real-time processes to consume all CPUs, you will
indeed starve the grace-period kthread, so what you see below would then
be expected behavior.  Try setting CONFIG_RCU_KTHREAD_PRIO to be larger
than the real-time priority of your processes.

							Thanx, Paul

> # cat .config | grep PREEMPT
> CONFIG_PREEMPT_NOTIFIERS=y
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set
>
> Regards,
> Santosh
>
> ...
> ....
>
> rcu_sched kthread starved for 399032 jiffies!
> INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0,
>  t=462037 jiffies, g=118888, c=118887, q=0)
> All QSes seen, last rcu_sched kthread activity 462037
>  (4296277632-4295815595), jiffies_till_next_fqs=3, root ->qsmask 0x0
> ocssd.bin     R  running task    0 15375     1 0x00000000
>  0000000000000000 ffff8800ec003bc8 ffffffff810a8581 ffffffff81abf980
>  000000000001d068 ffff8800ec003c28 ffffffff810e9c98 0000000000000000
>  0000000000000086 0000000000000000 0000000000000086 0000000000000082
> Call Trace:
>  [] sched_show_task+0xb1/0x120
>  [] print_other_cpu_stall+0x288/0x2d0
>  [] __rcu_pending+0x180/0x230
>  [] rcu_check_callbacks+0x95/0x140
>  [] update_process_times+0x42/0x70
>  [] tick_sched_handle+0x39/0x80
>  [] tick_sched_timer+0x52/0xa0
>  [] __run_hrtimer+0x74/0x1d0
>  [] ? tick_nohz_handler+0xc0/0xc0
>  [] hrtimer_interrupt+0x102/0x240
>  [] xen_timer_interrupt+0x2e/0x130
>  [] ? add_interrupt_randomness+0x3a/0x1f0
>  [] ? store_cursor_blink+0xc0/0xc0
>  [] handle_irq_event_percpu+0x54/0x1b0
>  [] handle_percpu_irq+0x47/0x70
>  [] generic_handle_irq+0x27/0x40
>  [] evtchn_2l_handle_events+0x25a/0x260
>  [] ? __do_softirq+0x191/0x2f0
>  [] __xen_evtchn_do_upcall+0x4f/0x90
>  [] xen_evtchn_do_upcall+0x34/0x50
>  [] xen_hvm_callback_vector+0x6e/0x80
>
> rcu_sched kthread starved for 462037 jiffies!
>
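[As a concrete sketch of the CONFIG_RCU_KTHREAD_PRIO suggestion above:
the value 50 is illustrative, not taken from this thread; it simply needs
to exceed the real-time priority of the database processes, and the exact
Kconfig default and range depend on RCU_BOOST/RCU_EXPERT on a given
kernel version.]

    # Build-time fix: raise the RT priority of the RCU grace-period
    # kthreads in the kernel .config (value 50 is an assumed example):
    CONFIG_RCU_KTHREAD_PRIO=50

    # Runtime alternative without rebuilding, as root (assumes chrt from
    # util-linux and a kthread named rcu_sched, as in the stall log):
    chrt -f -p 50 $(pgrep -x rcu_sched)

Later kernels expose the same knob as the rcutree.kthread_prio= boot
parameter, which avoids the rebuild entirely.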