From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753880AbbFDR0L (ORCPT <rfc822;w@1wt.eu>);
	Thu, 4 Jun 2015 13:26:11 -0400
Received: from mail-am1on0056.outbound.protection.outlook.com ([157.56.112.56]:32448
	"EHLO emea01-am1-obe.outbound.protection.outlook.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1752239AbbFDR0I (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 4 Jun 2015 13:26:08 -0400
Authentication-Results: spf=none (sender IP is )
 smtp.mailfrom=cmetcalf@ezchip.com; 
Message-ID: <55708A1D.70707@ezchip.com>
Date: Thu, 4 Jun 2015 13:25:49 -0400
From: Chris Metcalf <cmetcalf@ezchip.com>
User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Frederic Weisbecker <fweisbec@gmail.com>, Don Zickus <dzickus@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>, Ingo Molnar <mingo@kernel.org>,
        Andrew Jones <drjones@redhat.com>,
        Ulrich Obergfell <uobergfe@redhat.com>,
        Fabian Frederick <fabf@skynet.be>, Aaron Tomlin <atomlin@redhat.com>,
        Ben Zhang <benzh@chromium.org>, Christoph Lameter <cl@linux.com>,
        Gilad Ben-Yossef <gilad@benyossef.com>,
        Steven Rostedt <rostedt@goodmis.org>, <linux-kernel@vger.kernel.org>,
        Jonathan Corbet <corbet@lwn.net>, <linux-doc@vger.kernel.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v10 1/3] smpboot: allow excluding cpus from the smpboot
 threads
References: <1430422766-19703-1-git-send-email-cmetcalf@ezchip.com> <1430422766-19703-2-git-send-email-cmetcalf@ezchip.com> <20150501085356.GA14149@lerouge> <5543DABF.4060303@ezchip.com> <20150501212329.GA4179@lerouge>
In-Reply-To: <20150501212329.GA4179@lerouge>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [12.216.194.146]
X-ClientProxiedBy: CY1PR0801CA0035.namprd08.prod.outlook.com (25.163.136.173)
 To AM2PR02MB0772.eurprd02.prod.outlook.com (25.163.146.16)
X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:AM2PR02MB0772;
X-Microsoft-Antispam-PRVS: <AM2PR02MB0772206EC382006CF7C26919AFB30@AM2PR02MB0772.eurprd02.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:;
X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(520003)(5005006)(3002001);SRVR:AM2PR02MB0772;BCL:0;PCL:0;RULEID:;SRVR:AM2PR02MB0772;
X-Forefront-PRVS: 0597911EE1
X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10009020)(6049001)(6009001)(479174004)(189002)(51704005)(377454003)(199003)(24454002)(65956001)(66066001)(189998001)(64706001)(5001960100002)(47776003)(40100003)(68736005)(50466002)(65806001)(87976001)(54356999)(76176999)(50986999)(65816999)(86362001)(33656002)(101416001)(77156002)(46102003)(92566002)(15975445007)(62966003)(77096005)(122386002)(2950100001)(93886004)(64126003)(106356001)(36756003)(42186005)(4001350100001)(4001540100001)(105586002)(81156007)(97736004)(5001770100001)(19580395003)(23746002)(5001860100001)(5001830100001)(83506001)(18886065003);DIR:OUT;SFP:1101;SCL:1;SRVR:AM2PR02MB0772;H:[10.7.0.41];FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en;
X-OriginatorOrg: ezchip.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Jun 2015 17:26:00.7996 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM2PR02MB0772
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/01/2015 05:23 PM, Frederic Weisbecker wrote:
> On Fri, May 01, 2015 at 03:57:51PM -0400, Chris Metcalf wrote:
>> On 05/01/2015 04:53 AM, Frederic Weisbecker wrote:
>>>> +	/* Unpark any threads that were voluntarily parked. */
>>>> +	for_each_cpu_not(cpu, &ht->cpumask) {
>>>> +		if (cpu_online(cpu)) {
>>>> +			struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
>>>> +			if (tsk)
>>>> +				kthread_unpark(tsk);
>>> I'm still not clear why we are doing that. kthread_stop() should be able
>>> to handle parked kthreads, otherwise it needs to be fixed.
>> Checking without the unpark, it's actually only a problem with nohz_full.
>> In a system without nohz_full, the kthreads are able to stop even when
>> they are parked; it's only in the nohz_full case that things wedge.
>> For example, booting with only cpu 0 as a housekeeping core (and
>> therefore all watchdogs 1-35 on my 36-core tilegx are parked), and
>> immediately doing "echo 0 > /proc/sys/kernel/watchdog", I see
>> (via SysRq ^O-l) the first parked watchdog, on cpu 1, hung with:
>>
>>    frame 0: 0xfffffff7000f2928 lock_hrtimer_base+0xb8/0xc0
>>    frame 1: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>>    frame 2: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>>    frame 3: 0xfffffff7000f2b98 hrtimer_cancel+0x40/0x68
>>    frame 4: 0xfffffff70014cce0 watchdog_disable+0x50/0x70
>>    frame 5: 0xfffffff70008c2d0 smpboot_thread_fn+0x350/0x438
>>    frame 6: 0xfffffff700084b28 kthread+0x160/0x178

I finally had some time to look into this issue some more.

With PROVE_LOCKING enabled (after a fix I'll send to LKML shortly), we
get no warnings, and using ^O-d to print the held locks shows:

Showing all locks held in the system:
3 locks held by watchdog/1/15:
  #0:  (&(&hp->lock)->rlock){-.....}, at: [<fffffff700620740>] hvc_poll+0xb8/0x4b8
  #1:  (rcu_read_lock){......}, at: [<fffffff70061d710>] __handle_sysrq+0x0/0x440
  #2:  (tasklist_lock){.+.+..}, at: [<fffffff7000d7310>] debug_show_all_locks+0xc0/0x350
3 locks held by sh/1732:
  #0:  (sb_writers#4){.+.+.+}, at: [<fffffff70022f6b8>] vfs_write+0x268/0x2c0
  #1:  (watchdog_proc_mutex){+.+.+.}, at: [<fffffff70016f368>] proc_watchdog_common+0x78/0x1c8
  #2:  (smpboot_threads_lock){+.+.+.}, at: [<fffffff700093558>] smpboot_unregister_percpu_thread+0x48/0x88

All the watchdog/1/15 locks are attributable to the fact that it's running
on the same core that ended up handling the "^O-d" request from SysRq.

The sh process from which I ran the echo eventually shows up as "blocked
for more than 120 seconds", pretty much where you'd expect it to be:
waiting on a completion in kthread_stop() at kthread.c:473.

I instrumented lock_hrtimer_base() and found that timer->base is NULL and
never becomes non-NULL, so the retry loop spins forever.  Perhaps something
in nohz is preventing timer->base from being set?
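
To make the failure mode concrete, here is a minimal userspace sketch of
the retry pattern as I understand it.  It is not the kernel source: the
names model_base, model_timer and model_lock_base are hypothetical, the
"locked" flag only stands in for taking the per-cpu base lock, and the
max_spins bound is an assumption added so the model terminates (the real
loop has no such bound).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only what the model needs. */
struct model_base {
	int locked;			/* stands in for the per-cpu base lock */
};

struct model_timer {
	struct model_base *base;	/* NULL models the state I observed */
};

/*
 * Model of the retry loop: read timer->base, lock it, and re-check that
 * the timer still points at the same base.  Returns the locked base, or
 * NULL once max_spins retries have elapsed.  *spins reports how many
 * retries were taken.
 */
struct model_base *model_lock_base(const struct model_timer *timer,
				   unsigned long max_spins,
				   unsigned long *spins)
{
	*spins = 0;
	while (*spins < max_spins) {
		struct model_base *base = timer->base;

		if (base != NULL) {
			base->locked = 1;	/* "take" the lock */
			if (base == timer->base)
				return base;	/* base unchanged: done */
			base->locked = 0;	/* timer moved: drop, retry */
		}
		(*spins)++;			/* stands in for cpu_relax() */
	}
	return NULL;				/* only reachable in the model */
}
```

With a timer whose base is valid, the model returns on the first pass with
zero retries.  With base == NULL and nothing else ever setting it, every
pass skips the locking branch and the model runs out its retry budget,
which corresponds to the unbounded spin I'm seeing on cpu 1.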

I'm happy to keep debugging this but I'm not really clear on what could
be going wrong here.  Any ideas?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com