From: Chris Metcalf
Date: Thu, 4 Jun 2015 13:25:49 -0400
To: Frederic Weisbecker, Don Zickus
CC: Andrew Morton, Ingo Molnar, Andrew Jones, Ulrich Obergfell, Fabian Frederick, Aaron Tomlin, Ben Zhang, Christoph Lameter, Gilad Ben-Yossef, Steven Rostedt, Jonathan Corbet, Thomas Gleixner, Peter Zijlstra
Subject: Re: [PATCH v10 1/3] smpboot: allow excluding cpus from the smpboot threads
Message-ID: <55708A1D.70707@ezchip.com>
In-Reply-To: <20150501212329.GA4179@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/01/2015 05:23 PM, Frederic Weisbecker wrote:
> On Fri, May 01, 2015 at 03:57:51PM -0400, Chris Metcalf wrote:
>> On 05/01/2015 04:53 AM, Frederic Weisbecker wrote:
>>>> +    /* Unpark any threads that were voluntarily parked. */
>>>> +    for_each_cpu_not(cpu, &ht->cpumask) {
>>>> +        if (cpu_online(cpu)) {
>>>> +            struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
>>>> +            if (tsk)
>>>> +                kthread_unpark(tsk);
>>> I'm still not clear why we are doing that. kthread_stop() should be able
>>> to handle parked kthreads, otherwise it needs to be fixed.
>> Checking without the unpark, it's actually only a problem with nohz_full.
>> In a system without nohz_full, the kthreads are able to stop even when
>> they are parked; it's only in the nohz_full case that things wedge.
>> For example, booting with only cpu 0 as a housekeeping core (and
>> therefore all watchdogs 1-35 on my 36-core tilegx are parked), and
>> immediately doing "echo 0 > /proc/sys/kernel/watchdog", I see
>> (via SysRq ^O-l) the first parked watchdog, on cpu 1, hung with:
>>
>>   frame 0: 0xfffffff7000f2928 lock_hrtimer_base+0xb8/0xc0
>>   frame 1: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>>   frame 2: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>>   frame 3: 0xfffffff7000f2b98 hrtimer_cancel+0x40/0x68
>>   frame 4: 0xfffffff70014cce0 watchdog_disable+0x50/0x70
>>   frame 5: 0xfffffff70008c2d0 smpboot_thread_fn+0x350/0x438
>>   frame 6: 0xfffffff700084b28 kthread+0x160/0x178

I finally had some time to look into this issue some more.
With PROVE_LOCKING enabled (after a fix I'll send to LKML shortly),
we get no warnings, and ^O-d to print locks shows:

Showing all locks held in the system:
3 locks held by watchdog/1/15:
 #0:  (&(&hp->lock)->rlock){-.....}, at: [] hvc_poll+0xb8/0x4b8
 #1:  (rcu_read_lock){......}, at: [] __handle_sysrq+0x0/0x440
 #2:  (tasklist_lock){.+.+..}, at: [] debug_show_all_locks+0xc0/0x350
3 locks held by sh/1732:
 #0:  (sb_writers#4){.+.+.+}, at: [] vfs_write+0x268/0x2c0
 #1:  (watchdog_proc_mutex){+.+.+.}, at: [] proc_watchdog_common+0x78/0x1c8
 #2:  (smpboot_threads_lock){+.+.+.}, at: [] smpboot_unregister_percpu_thread+0x48/0x88

All the watchdog/1/15 locks are attributable to the fact that it's
running on the same core that ended up handling the "^O-d" request
from SysRq.

The sh process from which I ran the echo eventually shows up as
"blocked for more than 120 seconds" and pretty much where you'd
expect it to be, waiting on a completion in kthread_stop() at
kthread.c:473.

I instrumented lock_hrtimer_base(), and timer->base is null, and
never gets set non-null, so the loop spins forever. Perhaps something
in nohz is preventing the timer->base from being set? I'm happy to
keep debugging this but I'm not really clear on what could be going
wrong here.
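
To make the failure mode concrete, here is a minimal user-space sketch of
the retry pattern described above: read timer->base, and keep retrying for
as long as it is NULL. It is not the kernel source. The types and the
function model_lock_timer_base() are hypothetical stand-ins, the "lock" is
a plain flag in a single-threaded model, and the max_spins bound is an
assumption added only so the sketch terminates (the loop I instrumented
has no such bound). The re-check of the base after locking reflects my
understanding of the kernel loop and should be read as an assumption too.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
struct model_base {
    int locked;                 /* 1 while "held" in this single-threaded model */
};

struct model_timer {
    struct model_base *base;    /* NULL models the state reported above */
};

/*
 * Retry until the timer's base is non-NULL and unchanged after locking.
 * max_spins is an assumption of this sketch so that it terminates.
 * Returns the locked base, or NULL if the bound was reached.
 * *spins_used receives the number of failed attempts.
 */
struct model_base *model_lock_timer_base(struct model_timer *timer,
                                         unsigned long max_spins,
                                         unsigned long *spins_used)
{
    unsigned long spins;

    for (spins = 0; spins < max_spins; spins++) {
        struct model_base *base = timer->base;

        if (base != NULL) {
            base->locked = 1;               /* take the base lock */
            if (base == timer->base) {      /* still the same base? */
                *spins_used = spins;
                return base;
            }
            base->locked = 0;               /* base changed: drop and retry */
        }
        /* the kernel loop would relax the CPU here before retrying */
    }

    *spins_used = spins;
    return NULL;
}
```

With a non-NULL base the function returns on the first pass; with a base
that stays NULL it uses up every attempt, which is the unbounded spin seen
on cpu 1 once the bound is taken away.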
Any ideas?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com