From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933473AbcBDSeJ (ORCPT );
	Thu, 4 Feb 2016 13:34:09 -0500
Received: from mail-pf0-f169.google.com ([209.85.192.169]:33014 "EHLO
	mail-pf0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933144AbcBDSeI (ORCPT );
	Thu, 4 Feb 2016 13:34:08 -0500
Subject: Re: [PATCH] lockdep: fix stack trace caching logic
To: Dmitry Vyukov 
References: <1454593240-121647-1-git-send-email-dvyukov@google.com>
 <56B38E06.3040607@hurleysoftware.com>
Cc: Peter Zijlstra , Ingo Molnar , LKML ,
 Kostya Serebryany , Alexander Potapenko , Sasha Levin 
From: Peter Hurley 
Message-ID: <56B39997.6040806@hurleysoftware.com>
Date: Thu, 4 Feb 2016 10:33:59 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: 
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/04/2016 09:50 AM, Dmitry Vyukov wrote:
> On Thu, Feb 4, 2016 at 6:44 PM, Peter Hurley wrote:
>> On 02/04/2016 05:40 AM, Dmitry Vyukov wrote:
>>> check_prev_add() caches the saved stack trace in a static trace variable
>>> to avoid duplicate save_trace() calls in dependencies involving trylocks.
>>> But that caching logic contains a bug.
>>>
>>> We may not save the trace on the first iteration due to an early return
>>> from check_prev_add().
>>
>> This commit log should identify the role test instrumentation plays in
>> triggering this bug: is it a recursive read lock dependency injected
>> between existing lock dependencies? What test component triggered this?
>
> I don't think it has anything to do with instrumentation.
> check_prev_add() has explicitly coded early exit paths which don't
> save the stack trace.
> For example, if the first pair of locks is: try-read-lock vs read-lock,
> we will not save the stack trace; then the next pair is: write-lock vs
> read-lock, and we will use a bogus stack trace.

Ok, but that isn't the case with the triggering stack traces we were
getting. But I see now that there's an existing lock dependency that
check_prev_add() is skipping. IOW, the other early-out path is being
taken; lockdep has already seen:

 new dependency: (&tty->ldisc_sem){++++..} => (&buf->lock){+.+...}

So yeah, not related to test instrumentation then.

>>> Then on the second iteration, when we actually need the trace, we don't
>>> save it because we think that we've already saved it.
>>>
>>> Let check_prev_add() itself control when the stack is saved.
>>>
>>> There is another bug. The trace variable is protected by the graph lock.
>>> But we can temporarily release the graph lock during printing.
>>>
>>> Fix this by invalidating the cached stack trace when we release the
>>> graph lock.
>>>
>>> Signed-off-by: Dmitry Vyukov 
>>> ---
>>>  kernel/locking/lockdep.c | 16 ++++++++++------
>>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>>> index 60ace56..c7710e4 100644
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -1822,7 +1822,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
>>>   */
>>>  static int
>>>  check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>> -	       struct held_lock *next, int distance, int trylock_loop)
>>> +	       struct held_lock *next, int distance, int *stack_saved)
>>>  {
>>>  	struct lock_list *entry;
>>>  	int ret;
>>> @@ -1883,8 +1883,11 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>>  		}
>>>  	}
>>>
>>> -	if (!trylock_loop && !save_trace(&trace))
>>> -		return 0;
>>> +	if (!*stack_saved) {
>>> +		if (!save_trace(&trace))
>>> +			return 0;
>>> +		*stack_saved = 1;
>>> +	}
>>>
>>>  	/*
>>>  	 * Ok, all validations passed, add the new lock
>>> @@ -1907,6 +1910,8 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>>  	 * Debugging printouts:
>>>  	 */
>>>  	if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
>>> +		/* We drop graph lock, so another thread can overwrite trace. */
>>> +		*stack_saved = 0;
>>>  		graph_unlock();
>>>  		printk("\n new dependency: ");
>>>  		print_lock_name(hlock_class(prev));
>>> @@ -1929,7 +1934,7 @@ static int
>>>  check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>  {
>>>  	int depth = curr->lockdep_depth;
>>> -	int trylock_loop = 0;
>>> +	int stack_saved = 0;
>>>  	struct held_lock *hlock;
>>>
>>>  	/*
>>> @@ -1956,7 +1961,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>  		 */
>>>  		if (hlock->read != 2 && hlock->check) {
>>>  			if (!check_prev_add(curr, hlock, next,
>>> -					distance, trylock_loop))
>>> +					distance, &stack_saved))
>>>  				return 0;
>>>  			/*
>>>  			 * Stop after the first non-trylock entry,
>>> @@ -1979,7 +1984,6 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>  		if (curr->held_locks[depth].irq_context !=
>>>  		    curr->held_locks[depth-1].irq_context)
>>>  			break;
>>> -		trylock_loop = 1;
>>>  	}
>>>  	return 1;
>>>  out_bug:
>>