From: Peter Hurley <peter@hurleysoftware.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Jiri Slaby <jslaby@suse.com>,
One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
LKML <linux-kernel@vger.kernel.org>,
J Freyensee <james_p_freyensee@linux.intel.com>,
syzkaller <syzkaller@googlegroups.com>,
Kostya Serebryany <kcc@google.com>,
Alexander Potapenko <glider@google.com>,
Sasha Levin <sasha.levin@oracle.com>,
Eric Dumazet <edumazet@google.com>
Subject: Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
Date: Wed, 3 Feb 2016 11:09:04 -0800
Message-ID: <56B25050.9070003@hurleysoftware.com>
In-Reply-To: <CACT4Y+YBoLLOKqex01DXh4enoYEH3zk-rvj4BxBCWj-LpKor1Q@mail.gmail.com>
On 02/03/2016 09:32 AM, Dmitry Vyukov wrote:
> On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> Hi Dmitry,
>>
>> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>
>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>
>>>>> Yes.
>>>>>
>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>
>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>
>>>>> It's not.
>>>>
>>>> I've got a new report on commit
>>>> a200dcb34693084e56496960d855afdeaaf9578f (Jan 18).
>>>> Here is unprocessed version:
>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>> and here is processed one:
>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>
>>>> Peter, what exactly is wrong with the post-processed version?
>>>
>>> Yeah, ok, I assumed the problem with this report was post-processing
>>> because of the other report that had mixed-up info.
>>>
>>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>>> Plus, the #1 stacktrace is wrong as well.
>>>
>>>> I would be interested in fixing the processing script.
>>>
>>> Not that it's related (since the original, not-edited report has bogus
>>> stacktraces), but how are you doing debug symbol lookup?
>>>
>>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>>
>>> [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>>
>>>
>>>> As far as I see it contains the same stacks just with line numbers and
>>>> inlined frames.
>>>
>>> Agree, now that I see the original report.
>>>
>>>> I am using a significantly different compilation mode
>>>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>>>> figure out line numbers based on offsets.
>>>
>>> Weird. Maybe something to do with the compiler.
>>>
>>> Can you get me the dmesg output running the patch below?
>>
>> Wondering if this is still the priority it was not so long ago?
>> If not, that's fine and I'll drop this from my followup list.
>
>
> Yes, it is still the priority for me.
> I've tried to apply your debugging patch, but I noticed that it prints
> dependencies stacks as it discovers them.
Yeah, that's the point; I need to understand why lockdep doesn't
store the correct stack trace at dependency discovery.

Since the debugging patch prints the correct stack trace at the moment
each dependency is discovered, it will help debug the lockdep problem.

Hopefully, once the problem with the bad stacktraces is fixed, the
actual circular lock dependencies will be clear.
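
To make the terms concrete, here is a rough user-space sketch of what
"storing a stack trace at dependency discovery" means. It is not
lockdep's implementation; the names (edge, reachable, add_dependency,
dependency_trace, MAX_CLASSES) are hypothetical, lock classes are plain
integer indices assumed to be in range, and a "trace" is just a string
label. The idea it illustrates: the trace is saved once, when a
held -> acquired edge is first seen, and a later report prints whatever
was saved, so a wrong saved trace makes the whole report misleading.

```c
#include <stddef.h>

#define MAX_CLASSES 8

/* Hypothetical model: edge[a][b] holds the trace label recorded the first
 * time lock b was taken while lock a was held, or NULL if never seen. */
static const char *edge[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: is `to` reachable from `from` over recorded edges? */
static int reachable(int from, int to, int *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < MAX_CLASSES; next++) {
		if (edge[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return 1;
	}
	return 0;
}

/* Record the dependency "held -> acquired".
 * Returns 1 if the new edge would close a cycle (a potential deadlock),
 * 0 otherwise. The trace is stored only when the edge is first discovered;
 * later sightings of the same edge keep the original trace. */
int add_dependency(int held, int acquired, const char *trace)
{
	int seen[MAX_CLASSES] = { 0 };

	if (edge[held][acquired])
		return 0;	/* already known, keep the first trace */
	if (reachable(acquired, held, seen))
		return 1;	/* acquired already leads back to held */
	edge[held][acquired] = trace;
	return 0;
}

/* Return the trace saved for an edge, or NULL if the edge is unknown. */
const char *dependency_trace(int held, int acquired)
{
	return edge[held][acquired];
}
```

In this sketch, a report for a cycle would print dependency_trace() for
each edge on the path, which is why the trace captured at first
discovery has to be the right one.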
> But in my setup I don't have
> all output from machine start (there is just too many of it).
Boot with a larger log buffer, using the kernel parameter:
log_buf_len=1G
> And I don't have a localized reproducer for this.
I really just need the lockdep dependency stacks generated during boot,
and the ctrl+C in a terminal window to trigger one of the dependency
stacks.
> I will try again.
Ok.
> Do you want me to debug with your "tty: Fix lock inversion in
> N_TRACEROUTER" patch applied or not (I still see slightly different
> deadlock reports with it)?
Not applied.
I think that probably does fix at least one circular dependency, but
I want to figure out the bad stack trace problem first.
There's probably another circular dependency there, as indicated by
your other report.
Regards,
Peter Hurley
>>> Please first just get me the output shortly after boot (please do a
>>> ctrl+C in a terminal window to trigger the expected stacktrace
>>> corresponding to #3 stacktrace from your report).
>>>
>>> Then please run your test that generates the circular lockdep report.
>>>
>>> --- >% ---
>>> Subject: [PATCH] debug: dump stacktraces for tty lock dependencies
>>>
>>> ---
>>> kernel/locking/lockdep.c | 12 +++++++-----
>>> 1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>>> index 60ace56..b67cafb 100644
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
>>> * Debugging switches:
>>> */
>>>
>>> -#define VERBOSE 0
>>> +#define VERBOSE 1
>>> #define VERY_VERBOSE 0
>>>
>>> #if VERBOSE
>>> @@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
>>> */
>>> static int class_filter(struct lock_class *class)
>>> {
>>> -#if 0
>>> - /* Example */
>>> +#if 1
>>> if (class->name_version == 1 &&
>>> - !strcmp(class->name, "lockname"))
>>> + !strcmp(class->name, "&buf->lock"))
>>> return 1;
>>> if (class->name_version == 1 &&
>>> - !strcmp(class->name, "&struct->lockfield"))
>>> + !strcmp(class->name, "&port->buf.lock/1"))
>>> + return 1;
>>> + if (class->name_version == 1 &&
>>> + !strcmp(class->name, "routelock"))
>>> return 1;
>>> #endif
>>> /* Filter everything else. 1 would be to allow everything else */
>>>
>>
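
For anyone reading along without the kernel tree handy, here is a
minimal user-space sketch of what the filter in the debug patch above
does after the change. The struct and function names
(lock_class_sketch, class_filter_sketch, traced_names) are hypothetical
stand-ins, not the kernel's; only the name and name_version fields are
modeled, and the three lock class names are the ones from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's struct lock_class;
 * only the two fields the filter reads are modeled here. */
struct lock_class_sketch {
	const char *name;
	int name_version;
};

/* Lock class names enabled by the debug patch. */
static const char *const traced_names[] = {
	"&buf->lock",
	"&port->buf.lock/1",
	"routelock",
};

/* Return 1 if verbose lockdep output should be emitted for this class,
 * 0 to filter it out. Mirrors the patched filter: version 1 of one of
 * the three named classes passes, everything else is filtered. */
int class_filter_sketch(const struct lock_class_sketch *class)
{
	size_t i;

	if (class->name_version != 1)
		return 0;
	for (i = 0; i < sizeof(traced_names) / sizeof(traced_names[0]); i++) {
		if (!strcmp(class->name, traced_names[i]))
			return 1;
	}
	return 0;
}
```

With VERBOSE set to 1, only classes that pass this kind of filter get
their dependency stacks dumped, which keeps the log volume manageable.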