From: Guenter Roeck <linux@roeck-us.net>
To: Will Deacon <will.deacon@arm.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>
Subject: Re: Traceback in ww_mutex test (test_cycle_work) on arm64/x86_64
Date: Tue, 2 Oct 2018 14:53:05 -0700 [thread overview]
Message-ID: <20181002215305.GA9322@roeck-us.net> (raw)
In-Reply-To: <20180924105133.GA12461@arm.com>
Hi Will,
On Mon, Sep 24, 2018 at 11:51:33AM +0100, Will Deacon wrote:
> Hi Guenter,
>
> On Sun, Sep 23, 2018 at 12:57:06PM -0700, Guenter Roeck wrote:
> > when enabling CONFIG_WW_MUTEX_SELFTEST on arm64 or x86_64,
> > I get the following traceback.
> >
> > [ 3.111852] ------------[ cut here ]------------
> > [ 3.112100] DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
> > [ 3.112753] WARNING: CPU: 1 PID: 771 at kernel/locking/mutex.c:1211 __mutex_unlock_slowpath+0x1a8/0x2e0
> > [ 3.113238] Modules linked in:
> > [ 3.113774] CPU: 1 PID: 771 Comm: kworker/u16:8 Not tainted 4.19.0-rc5-dirty #1
> > [ 3.114025] Hardware name: linux,dummy-virt (DT)
> > [ 3.114587] Workqueue: test-ww_mutex test_cycle_work
> > [ 3.114950] pstate: 40000005 (nZcv daif -PAN -UAO)
> > [ 3.115144] pc : __mutex_unlock_slowpath+0x1a8/0x2e0
> > [ 3.115327] lr : __mutex_unlock_slowpath+0x1a8/0x2e0
> > [ 3.115500] sp : ffff00000b7cbc40
> > [ 3.115647] x29: ffff00000b7cbc40 x28: 0000000000000000
> > [ 3.115921] x27: ffff00000942f000 x26: ffff00000a204da0
> > [ 3.116155] x25: ffff00000a1c93d0 x24: ffff000009103cd8
> > [ 3.116376] x23: ffff00000a1c9000 x22: ffff00000942f000
> > [ 3.116596] x21: ffff00000b7cbca8 x20: ffff80001c05f8c8
> > [ 3.116817] x19: 0000000000000000 x18: ffffffffffffffff
> > [ 3.117036] x17: 0000000000000000 x16: 0000000000000000
> > [ 3.117256] x15: ffff00000942f808 x14: ffff00008a1c8bb7
> > [ 3.117476] x13: ffff00000a1c8bc5 x12: ffff00000944f000
> > [ 3.117695] x11: 0000000005f5e0ff x10: ffff0000094b3000
> > [ 3.117947] x9 : 0000000000000000 x8 : ffff00000942f808
> > [ 3.118172] x7 : ffff00000816153c x6 : 0000000000000000
> > [ 3.118392] x5 : 0000000000000000 x4 : ffff00000b7cc000
> > [ 3.118612] x3 : 6172e063a21fe200 x2 : ffff00000944fd80
> > [ 3.118830] x1 : 6172e063a21fe200 x0 : 0000000000000000
> > [ 3.119169] Call trace:
> > [ 3.119348] __mutex_unlock_slowpath+0x1a8/0x2e0
> > [ 3.119540] ww_mutex_unlock+0x48/0xa0
> > [ 3.119709] test_cycle_work+0x10c/0x220
> > [ 3.119864] process_one_work+0x29c/0x708
> > [ 3.120016] worker_thread+0x40/0x458
> > [ 3.120179] kthread+0x12c/0x130
> > [ 3.120317] ret_from_fork+0x10/0x18
>
> Fun: I can reproduce this all the way back to 4.11, when the selftests
> were merged!
>
Yes, I know. Sorry, I should have mentioned it.
> > Debugging shows that the traceback occurs in the following code
> > in test_cycle_work().
> >
> > + err = ww_mutex_lock(cycle->b_mutex, &ctx);
> > + if (err == -EDEADLK) {
> > # true
> > + ww_mutex_unlock(&cycle->a_mutex);
> > + ww_mutex_lock_slow(cycle->b_mutex, &ctx);
> > + err = ww_mutex_lock(&cycle->a_mutex, &ctx);
> > # returns with err == -EDEADLK
> > + }
> > +
> > + if (!err)
> > + ww_mutex_unlock(cycle->b_mutex);
> > + ww_mutex_unlock(&cycle->a_mutex);
> > # traceback seen here:
> > # unlocks a_mutex even though it was not
> > # acquired by this thread
> >
> > Details don't really matter as long as the number of CPUs is at least 8
> > (I have not seen the problem with 1, 2, 4, or 6 CPUs). My test system
> > has 8 CPU cores (times 2 for hyperthreading), so that may be related.
> >
> > The test case above is clearly wrong if both calls to ww_mutex_lock()
> > fail with -EDEADLK. Unfortunately I don't know the expected behavior
> > in this case, so I'll have to pass this on without a proposed fix.
>
> Yeah, I think the test code isn't robust in the face of
> CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which can spuriously return -EDEADLK
> from mutex_lock(). It looks like it's assuming that err will always be
> reset to 0 when it takes a_mutex the second time. Chris?
>
I just sent a patch to fix the kernel warning message. That doesn't fix
the test failure, but then I have no idea if this is a real test failure
or a bad test program.
Thanks,
Guenter
prev parent reply other threads:[~2018-10-02 21:53 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-23 19:57 Traceback in ww_mutex test (test_cycle_work) on arm64/x86_64 Guenter Roeck
2018-09-24 10:51 ` Will Deacon
2018-10-02 21:53 ` Guenter Roeck [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181002215305.GA9322@roeck-us.net \
--to=linux@roeck-us.net \
--cc=chris@chris-wilson.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.