Message-ID: <52AFF20F.5070202@oracle.com>
Date: Tue, 17 Dec 2013 01:41:19 -0500
From: Sasha Levin
To: John Stultz, LKML
CC: Thomas Gleixner, Prarit Bhargava, Richard Cochran, Ingo Molnar, stable
Subject: Re: [RFC][PATCH 3/5] timekeeping: Avoid possible deadlock from clock_was_set_delayed
References: <1386789098-17391-1-git-send-email-john.stultz@linaro.org>
 <1386789098-17391-4-git-send-email-john.stultz@linaro.org>
 <52A9E5B2.8040702@oracle.com> <52AA014B.6000301@oracle.com>
 <52AA0798.1050709@linaro.org> <52AA08EB.1080703@oracle.com>
 <52AA0AD2.5030307@linaro.org> <52AFDDE6.2020600@linaro.org>
In-Reply-To: <52AFDDE6.2020600@linaro.org>

On 12/17/2013 12:15 AM, John Stultz wrote:
> On 12/12/2013 11:13 AM, John Stultz wrote:
>> On 12/12/2013 11:05 AM, Sasha Levin wrote:
>>> On 12/12/2013 01:59 PM, John Stultz wrote:
>>>> On 12/12/2013 10:32 AM, Sasha Levin wrote:
>>>>> On 12/12/2013 11:34 AM, Sasha Levin wrote:
>>>>>> On 12/11/2013 02:11 PM, John Stultz wrote:
>>>>>>> As part of normal operations, the hrtimer subsystem frequently calls
>>>>>>> into the timekeeping code, creating a locking order of
>>>>>>>   hrtimer locks -> timekeeping locks
>>>>>>>
>>>>>>> clock_was_set_delayed() was supposed to allow us to avoid deadlocks
>>>>>>> between the timekeeping and the hrtimer subsystem, so that we could
>>>>>>> notify the hrtimer subsystem the time had changed while holding
>>>>>>> the timekeeping locks. This was done by scheduling delayed work
>>>>>>> that would run later once we were out of the timekeeping code.
>>>>>>>
>>>>>>> But unfortunately the lock chains are complex enough that in
>>>>>>> scheduling delayed work, we end up eventually trying to grab
>>>>>>> an hrtimer lock.
>>>>>>>
>>>>>>> Sasha Levin noticed this in testing when the new seqlock lockdep
>>>>>>> enablement triggered the following (somewhat abbreviated) message:
>>>>>> [snip]
>>>>>>
>>>>>> This seems to work for me, I don't see the lockdep spew anymore.
>>>>>>
>>>>>> Tested-by: Sasha Levin
>>>>> I think I spoke too soon.
>>>>>
>>>>> It took way more time to reproduce than previously, but I got:
>>>>>
>>>>>
>>>>> -> #1 (&(&pool->lock)->rlock){-.-...}:
>>>>> [ 1195.578519] [] validate_chain+0x6c3/0x7b0
>>>>> [ 1195.578519] [] __lock_acquire+0x4ad/0x580
>>>>> [ 1195.578519] [] lock_acquire+0x182/0x1d0
>>>>> [ 1195.578519] [] _raw_spin_lock+0x40/0x80
>>>>> [ 1195.578519] [] __queue_work+0x14e/0x3f0
>>>>> [ 1195.578519] [] queue_work_on+0x98/0x120
>>>>> [ 1195.578519] [] clock_was_set_delayed+0x21/0x30
>>>>> [ 1195.578519] [] do_adjtimex+0x111/0x160
>>>>> [ 1195.578519] [] SYSC_adjtimex+0x43/0x80
>>>>> [ 1195.578519] [] SyS_adjtimex+0xe/0x10
>>>>> [ 1195.578519] [] tracesys+0xdd/0xe2
>>>>> [ 1195.578519]
>>>> Are you sure you have that patch applied?
>>>>
>>>> With it we shouldn't be calling clock_was_set_delayed() from
>>>> do_adjtimex().
>>> Hm, it seems that there's a conflict there that wasn't resolved
>>> properly. Does this patch depend on anything else that's not
>>> currently in -next?
>> Oh yes, sorry, I didn't cc you on the entire patch set. Apologies!
>>
>> You'll probably want to grab the two previous patches:
>> https://lkml.org/lkml/2013/12/11/479
>> https://lkml.org/lkml/2013/12/11/758
>
> Just wanted to follow up here. Did you happen to get a chance to try to
> reproduce w/ the three patch patchset?
>
> I'm hoping to submit them to Ingo tomorrow, and want to make sure I've
> got your tested-by.

Oh yeah, have been running it ever since, haven't seen the issue reproduce.


Thanks,
Sasha