Traceback in ww_mutex test (test_cycle_work) on arm64/x86_64

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Guenter Roeck <linux@roeck-us.net>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>
Subject: Traceback in ww_mutex test (test_cycle_work) on arm64/x86_64
Date: Sun, 23 Sep 2018 12:57:06 -0700	[thread overview]
Message-ID: <20180923195706.GA1538@roeck-us.net> (raw)

Hi,

when enabling CONFIG_WW_MUTEX_SELFTEST on arm64 or x86_64,
I get the following traceback.

[    3.111852] ------------[ cut here ]------------
[    3.112100] DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
[    3.112753] WARNING: CPU: 1 PID: 771 at kernel/locking/mutex.c:1211 __mutex_unlock_slowpath+0x1a8/0x2e0
[    3.113238] Modules linked in:
[    3.113774] CPU: 1 PID: 771 Comm: kworker/u16:8 Not tainted 4.19.0-rc5-dirty #1
[    3.114025] Hardware name: linux,dummy-virt (DT)
[    3.114587] Workqueue: test-ww_mutex test_cycle_work
[    3.114950] pstate: 40000005 (nZcv daif -PAN -UAO)
[    3.115144] pc : __mutex_unlock_slowpath+0x1a8/0x2e0
[    3.115327] lr : __mutex_unlock_slowpath+0x1a8/0x2e0
[    3.115500] sp : ffff00000b7cbc40
[    3.115647] x29: ffff00000b7cbc40 x28: 0000000000000000 
[    3.115921] x27: ffff00000942f000 x26: ffff00000a204da0 
[    3.116155] x25: ffff00000a1c93d0 x24: ffff000009103cd8 
[    3.116376] x23: ffff00000a1c9000 x22: ffff00000942f000 
[    3.116596] x21: ffff00000b7cbca8 x20: ffff80001c05f8c8 
[    3.116817] x19: 0000000000000000 x18: ffffffffffffffff 
[    3.117036] x17: 0000000000000000 x16: 0000000000000000 
[    3.117256] x15: ffff00000942f808 x14: ffff00008a1c8bb7 
[    3.117476] x13: ffff00000a1c8bc5 x12: ffff00000944f000 
[    3.117695] x11: 0000000005f5e0ff x10: ffff0000094b3000 
[    3.117947] x9 : 0000000000000000 x8 : ffff00000942f808 
[    3.118172] x7 : ffff00000816153c x6 : 0000000000000000 
[    3.118392] x5 : 0000000000000000 x4 : ffff00000b7cc000 
[    3.118612] x3 : 6172e063a21fe200 x2 : ffff00000944fd80 
[    3.118830] x1 : 6172e063a21fe200 x0 : 0000000000000000 
[    3.119169] Call trace:
[    3.119348]  __mutex_unlock_slowpath+0x1a8/0x2e0
[    3.119540]  ww_mutex_unlock+0x48/0xa0
[    3.119709]  test_cycle_work+0x10c/0x220
[    3.119864]  process_one_work+0x29c/0x708
[    3.120016]  worker_thread+0x40/0x458
[    3.120179]  kthread+0x12c/0x130
[    3.120317]  ret_from_fork+0x10/0x18

Debugging shows that the traceback occurs in the following code
in test_cycle_work().

+       err = ww_mutex_lock(cycle->b_mutex, &ctx);
+       if (err == -EDEADLK) {
				# true
+               ww_mutex_unlock(&cycle->a_mutex);
+               ww_mutex_lock_slow(cycle->b_mutex, &ctx);
+               err = ww_mutex_lock(&cycle->a_mutex, &ctx);
				# returns with err == -EDEADLK
+       }
+
+       if (!err)
+               ww_mutex_unlock(cycle->b_mutex);
+       ww_mutex_unlock(&cycle->a_mutex);
				# traceback seen here:
				# unlocks a_mutex even though it was not
				# acquired by this thread

The problem is quite easy to reproduce with the following qemu command.

qemu-system-aarch64 -M virt -cpu cortex-a57 -nographic -monitor none \
	-kernel arch/arm64/boot/Image -no-reboot -smp 8 -m 512
	-device virtio-blk-pci,drive=d0 \
	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
	-append 'console=ttyAMA0 root=/dev/vda rw'

or:

qemu-system-x86_64 \
	-kernel arch/x86/boot/bzImage \
	-M q35 \
	-cpu Skylake-Server \
	-no-reboot -smp 8 -m 1G \
	-usb -device usb-storage,drive=d0 \
	-drive file=rootfs.ext2,if=none,id=d0,format=raw \
	--append 'root=/dev/sda rw rootwait console=ttyS0 console=tty' \
	-nographic

Details don't really matter as long as the number of CPUs is at least 8
(I have not seen the problem with 1, 2, 4, or 6 CPUs). My test system
has 8 CPU cores (times 2 for hyperthreading), so that may be related.

The test case above is clearly wrong if both calls to ww_mutex_lock()
fail with -EDEADLK. Unfortunately I don't know the expected behavior
in this case, so I'll have to pass this on without a proposed fix.

Please let me know if there is anything I can do to help fixing
the problem.

Thanks,
Guenter

next             reply	other threads:[~2018-09-23 19:57 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-23 19:57 Guenter Roeck [this message]
2018-09-24 10:51 ` Traceback in ww_mutex test (test_cycle_work) on arm64/x86_64 Will Deacon
2018-10-02 21:53   ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180923195706.GA1538@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=chris@chris-wilson.co.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.