From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from aserp1040.oracle.com ([141.146.126.69]:21790 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752202AbcEPN3l (ORCPT );
	Mon, 16 May 2016 09:29:41 -0400
Subject: Re: [PATCH 3.18] timers: Use proper base migration in add_timer_on()
To: Konstantin Khlebnikov
References: <146338038104.19014.7494844489041339785.stgit@buzz>
Cc: Tejun Heo , Thomas Gleixner , stable@vger.kernel.org
From: Sasha Levin
Message-ID: <5739CB2B.6070401@oracle.com>
Date: Mon, 16 May 2016 09:29:15 -0400
MIME-Version: 1.0
In-Reply-To: <146338038104.19014.7494844489041339785.stgit@buzz>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: stable-owner@vger.kernel.org
List-ID:

On 05/16/2016 02:33 AM, Konstantin Khlebnikov wrote:
> From: Tejun Heo
>
> [ Upstream commit 22b886dd1018093920c4250dee2a9a3cb7cff7b8 ]
>
> Regardless of the previous CPU a timer was on, add_timer_on()
> currently simply sets timer->flags to the new CPU. As the caller must
> be seeing the timer as idle, this is locally fine, but the timer
> leaving the old base while unlocked can lead to race conditions as
> follows.
>
> Let's say timer was on cpu 0.
>
>   cpu 0                                 cpu 1
>   -----------------------------------------------------------------------------
>   del_timer(timer) succeeds
>                                         del_timer(timer)
>                                           lock_timer_base(timer) locks cpu_0_base
>   add_timer_on(timer, 1)
>     spin_lock(&cpu_1_base->lock)
>     timer->flags set to cpu_1_base
>     operates on @timer                    operates on @timer
>
> This triggered with mod_delayed_work_on() which contains
> "if (del_timer()) add_timer_on()" sequence eventually leading to the
> following oops.
>
>   BUG: unable to handle kernel NULL pointer dereference at (null)
>   IP: [] detach_if_pending+0x69/0x1a0
>   ...
>   Workqueue: wqthrash wqthrash_workfunc [wqthrash]
>   task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000
>   RIP: 0010:[] [] detach_if_pending+0x69/0x1a0
>   ...
>   Call Trace:
>    [] del_timer+0x44/0x60
>    [] try_to_grab_pending+0xb6/0x160
>    [] mod_delayed_work_on+0x33/0x80
>    [] wqthrash_workfunc+0x61/0x90 [wqthrash]
>    [] process_one_work+0x1e8/0x650
>    [] worker_thread+0x4e/0x450
>    [] kthread+0xef/0x110
>    [] ret_from_fork+0x3f/0x70
>
> Fix it by updating add_timer_on() to perform proper migration as
> __mod_timer() does.
>
> Reported-and-tested-by: Jeff Layton
> Signed-off-by: Tejun Heo
> Cc: Chris Worley
> Cc: bfields@fieldses.org
> Cc: Michael Skralivetsky
> Cc: Trond Myklebust
> Cc: Shaohua Li
> Cc: Jeff Layton
> Cc: kernel-team@fb.com
> Cc: stable@vger.kernel.org
> Link: http://lkml.kernel.org/r/20151029103113.2f893924@tlielax.poochiereds.net
> Link: http://lkml.kernel.org/r/20151104171533.GI5749@mtj.duckdns.org
> Signed-off-by: Thomas Gleixner
> Signed-off-by: Konstantin Khlebnikov ( backport for 3.18 )

Added to the queue, thanks!


Thanks,
Sasha
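
For readers following the race described in the quoted commit message, here is a rough user-space sketch of the "proper migration" idea: take the old base's lock first, mark the timer as migrating, and only then move it under the new base's lock, so a concurrent locker never works under a stale base. This is not the kernel code and not the patch itself. All names (model_base, model_timer, FLAG_MIGRATING, model_add_timer_on, model_del_timer) are hypothetical, the spinlock is a toy built on C11 atomic_flag, and the timer list is reduced to a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS        2
#define CPU_MASK       0xffu
#define FLAG_MIGRATING 0x100u   /* hypothetical "timer is between bases" marker */

struct model_base {
    atomic_flag lock;           /* toy spinlock standing in for the base lock */
    int nr_timers;              /* stand-in for the per-CPU timer list */
};

struct model_timer {
    _Atomic unsigned int flags; /* low bits: owning CPU, plus FLAG_MIGRATING */
    bool pending;               /* only touched with the owning base locked */
};

static struct model_base bases[NR_CPUS] = {
    { ATOMIC_FLAG_INIT, 0 }, { ATOMIC_FLAG_INIT, 0 }
};

static void base_lock(struct model_base *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ; /* spin */
}

static void base_unlock(struct model_base *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Lock whichever base currently owns the timer; retry if it moved meanwhile. */
static struct model_base *lock_timer_base(struct model_timer *t)
{
    for (;;) {
        unsigned int f = atomic_load(&t->flags);
        if (!(f & FLAG_MIGRATING)) {
            struct model_base *b = &bases[f & CPU_MASK];
            base_lock(b);
            if (atomic_load(&t->flags) == f)
                return b;       /* still owned by b, and we hold its lock */
            base_unlock(b);     /* owner changed under us, try again */
        }
    }
}

/* Migrating variant: never change ownership without holding the old lock. */
static void model_add_timer_on(struct model_timer *t, unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    struct model_base *new_base = &bases[cpu];
    struct model_base *base = lock_timer_base(t);

    if (base != new_base) {
        atomic_fetch_or(&t->flags, FLAG_MIGRATING); /* lockers now spin */
        base_unlock(base);
        base = new_base;
        base_lock(base);
        atomic_store(&t->flags, cpu);               /* publish the new owner */
    }
    t->pending = true;
    base->nr_timers++;
    base_unlock(base);
}

/* Returns true if the timer was pending and has been removed. */
static bool model_del_timer(struct model_timer *t)
{
    struct model_base *base = lock_timer_base(t);
    bool was_pending = t->pending;

    if (was_pending) {
        t->pending = false;
        base->nr_timers--;
    }
    base_unlock(base);
    return was_pending;
}
```

In this model, a second thread sitting in lock_timer_base() either gets the old base before the migration starts (and the migrating thread then waits for that lock), or it spins until the new owner is published. The buggy pattern from the commit message corresponds to storing the new CPU into flags without first taking the old base's lock.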