From: Amon Ott <a.ott@m-privacy.de>
To: ceph-devel@vger.kernel.org
Subject: Circular lock / deadlock in kernel client
Date: Wed, 30 Nov 2011 13:21:46 +0100 [thread overview]
Message-ID: <201111301321.47514.a.ott@m-privacy.de> (raw)
[-- Attachment #1: Type: text/plain, Size: 917 bytes --]
Hi!
With some kernel debug options for soft and hard lockup detection, I got some
fine traces. My kernel is a 3.1.4 to which I have ported from ceph-client
for-linus branch what is suitable for 3.1. If needed, I can make my exact
ceph code available.
Traces are attached. It seems that two depending locks can be acquired in
different order at different parts of the code, and thus lead to a deadlock.
Additionally, I am still trying to reproduce a partial lockup of single dirs
with this debugging. Those are likely to be related to mutex locking dirs
without unlocking properly.
Amon Ott
--
Dr. Amon Ott
m-privacy GmbH Tel: +49 30 24342334
Am Köllnischen Park 1 Fax: +49 30 24342336
10179 Berlin http://www.m-privacy.de
Amtsgericht Charlottenburg, HRB 84946
Geschäftsführer:
Dipl.-Kfm. Holger Maczkowsky,
Roman Maczkowsky
GnuPG-Key-ID: 0x2DD3A649
[-- Attachment #2: circ1.log --]
[-- Type: text/x-log, Size: 7178 bytes --]
Nov 30 10:40:21 tgpro1 kernel: =======================================================
Nov 30 10:40:21 tgpro1 kernel: [ INFO: possible circular locking dependency detected ]
Nov 30 10:40:21 tgpro1 kernel: 3.1.3-rsbac #1
Nov 30 10:40:21 tgpro1 kernel: -------------------------------------------------------
Nov 30 10:40:21 tgpro1 kernel: kworker/0:0/8787 is trying to acquire lock:
Nov 30 10:40:21 tgpro1 kernel: (&sb->s_type->i_lock_key#16){+.+...}, at: [<000f7176>] igrab+0x11/0x41
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: but task is already holding lock:
Nov 30 10:40:21 tgpro1 kernel: (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006e0de4>] iterate_session_caps+0x40/0x17d [ceph]
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: which lock already depends on the new lock.
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: the existing dependency chain (in reverse order) is:
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: -> #1 (&(&s->s_cap_lock)->rlock){+.+...}:
Nov 30 10:40:21 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:40:21 tgpro1 kernel: [<0052f167>] _raw_spin_lock+0x24/0x33
Nov 30 10:40:21 tgpro1 kernel: [<006dc0bc>] ceph_add_cap+0x302/0x551 [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006cfbfd>] fill_inode+0x5ef/0x72e [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006d0d6b>] ceph_fill_trace+0x663/0x6c5 [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006e4c6e>] dispatch+0xafe/0x10e7 [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006a444d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:40:21 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:40:21 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:40:21 tgpro1 kernel: [<00530b26>] kernel_thread_helper+0x6/0xd
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: -> #0 (&sb->s_type->i_lock_key#16){+.+...}:
Nov 30 10:40:21 tgpro1 kernel: [<0008fa3e>] __lock_acquire+0xe79/0x1425
Nov 30 10:40:21 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:40:21 tgpro1 kernel: [<0052f167>] _raw_spin_lock+0x24/0x33
Nov 30 10:40:21 tgpro1 kernel: [<000f7176>] igrab+0x11/0x41
Nov 30 10:40:21 tgpro1 kernel: [<006e0e14>] iterate_session_caps+0x70/0x17d [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006e3bca>] send_mds_reconnect+0x319/0x46a [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006e4035>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006ce00d>] extra_mon_dispatch+0x18/0x1f [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006a63dd>] dispatch+0x4f7/0x52a [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<006a444d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:40:21 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:40:21 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:40:21 tgpro1 kernel: [<00530b26>] kernel_thread_helper+0x6/0xd
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: other info that might help us debug this:
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: Possible unsafe locking scenario:
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: CPU0 CPU1
Nov 30 10:40:21 tgpro1 kernel: ---- ----
Nov 30 10:40:21 tgpro1 kernel: lock(&(&s->s_cap_lock)->rlock);
Nov 30 10:40:21 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 10:40:21 tgpro1 kernel: lock(&(&s->s_cap_lock)->rlock);
Nov 30 10:40:21 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: *** DEADLOCK ***
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: 5 locks held by kworker/0:0/8787:
Nov 30 10:40:21 tgpro1 kernel: #0: (ceph-msgr){.+.+.+}, at: [<0007a1ff>] process_one_work+0x1d7/0x391
Nov 30 10:40:21 tgpro1 kernel: #1: ((&(&con->work)->work)){+.+.+.}, at: [<0007a1ff>] process_one_work+0x1d7/0x391
Nov 30 10:40:21 tgpro1 kernel: #2: (&s->s_mutex){+.+.+.}, at: [<006e3954>] send_mds_reconnect+0xa3/0x46a [ceph]
Nov 30 10:40:21 tgpro1 kernel: #3: (&mdsc->snap_rwsem){+++++.}, at: [<006e3a4c>] send_mds_reconnect+0x19b/0x46a [ceph]
Nov 30 10:40:21 tgpro1 kernel: #4: (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006e0de4>] iterate_session_caps+0x40/0x17d [ceph]
Nov 30 10:40:21 tgpro1 kernel:
Nov 30 10:40:21 tgpro1 kernel: stack backtrace:
Nov 30 10:40:21 tgpro1 kernel: Pid: 8787, comm: kworker/0:0 Tainted: G W 3.1.3-rsbac #1
Nov 30 10:40:21 tgpro1 kernel: Call Trace:
Nov 30 10:40:21 tgpro1 kernel: [<0008e776>] print_circular_bug+0x21a/0x227
Nov 30 10:40:21 tgpro1 kernel: [<0008fa3e>] __lock_acquire+0xe79/0x1425
Nov 30 10:40:21 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:40:21 tgpro1 kernel: [<000f7176>] ? igrab+0x11/0x41
Nov 30 10:40:21 tgpro1 kernel: [<0052f167>] _raw_spin_lock+0x24/0x33
Nov 30 10:40:21 tgpro1 kernel: [<000f7176>] ? igrab+0x11/0x41
Nov 30 10:40:21 tgpro1 kernel: [<000f7176>] igrab+0x11/0x41
Nov 30 10:40:21 tgpro1 kernel: [<006e0e14>] iterate_session_caps+0x70/0x17d [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006e5488>] ? ceph_mdsc_submit_request+0x55/0x55 [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006a4d03>] ? ceph_pagelist_append+0xbc/0xf9 [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<006e3bca>] send_mds_reconnect+0x319/0x46a [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<0008daa1>] ? trace_hardirqs_on_caller+0x10b/0x13c
Nov 30 10:40:21 tgpro1 kernel: [<0008dadd>] ? trace_hardirqs_on+0xb/0xd
Nov 30 10:40:21 tgpro1 kernel: [<006e4035>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<000d7cd3>] ? __kmalloc+0x10d/0x137
Nov 30 10:40:21 tgpro1 kernel: [<00c40015>] ? 0xc40014
Nov 30 10:40:21 tgpro1 kernel: [<006ce00d>] extra_mon_dispatch+0x18/0x1f [ceph]
Nov 30 10:40:21 tgpro1 kernel: [<006a63dd>] dispatch+0x4f7/0x52a [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<0008dadd>] ? trace_hardirqs_on+0xb/0xd
Nov 30 10:40:21 tgpro1 kernel: [<006a444d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:40:21 tgpro1 kernel: [<0007a1ff>] ? process_one_work+0x1d7/0x391
Nov 30 10:40:21 tgpro1 kernel: [<006a2f56>] ? ceph_fault+0x262/0x262 [libceph]
Nov 30 10:40:21 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:40:21 tgpro1 kernel: [<00002a00>] ? __show_regs+0x6f/0x6f
Nov 30 10:40:21 tgpro1 kernel: [<00002a00>] ? __show_regs+0x6f/0x6f
Nov 30 10:40:21 tgpro1 kernel: [<0007a5c5>] ? rescuer_thread+0x20c/0x20c
Nov 30 10:40:21 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:40:21 tgpro1 kernel: [<0007d705>] ? __init_kthread_worker+0x42/0x42
Nov 30 10:40:21 tgpro1 kernel: [<00530b26>] kernel_thread_helper+0x6/0xd
Nov 30 10:40:26 tgpro1 kernel: ceph: mds0 reconnect success
Nov 30 10:40:40 tgpro1 kernel: ceph: mds0 recovery completed
[-- Attachment #3: circ2.log --]
[-- Type: text/x-log, Size: 7686 bytes --]
Nov 30 10:50:53 tgpro1 kernel: libceph: loaded (mon/osd proto 15/24, osdmap 5/6 5/6)
Nov 30 10:50:53 tgpro1 kernel: ceph: loaded (mds proto 32)
Nov 30 10:50:53 tgpro1 kernel: libceph: client6597 fsid 520649ba-50cc-2c58-7db3-4b30d3bb97d3
Nov 30 10:50:53 tgpro1 kernel: libceph: mon0 192.168.111.1:6789 session established
Nov 30 10:56:43 tgpro1 kernel: ceph: mds0 caps stale
Nov 30 10:57:03 tgpro1 kernel: ceph: mds0 caps stale
Nov 30 10:58:24 tgpro1 kernel: ceph: mds0 reconnect start
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: =======================================================
Nov 30 10:58:24 tgpro1 kernel: [ INFO: possible circular locking dependency detected ]
Nov 30 10:58:24 tgpro1 kernel: 3.1.4-rsbac #1
Nov 30 10:58:24 tgpro1 kernel: -------------------------------------------------------
Nov 30 10:58:24 tgpro1 kernel: kworker/0:0/4 is trying to acquire lock:
Nov 30 10:58:24 tgpro1 kernel: (&sb->s_type->i_lock_key#16){+.+...}, at: [<000f7166>] igrab+0x11/0x41
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: but task is already holding lock:
Nov 30 10:58:24 tgpro1 kernel: (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006b2de4>] iterate_session_caps+0x40/0x17d [ceph]
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: which lock already depends on the new lock.
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: the existing dependency chain (in reverse order) is:
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: -> #1 (&(&s->s_cap_lock)->rlock){+.+...}:
Nov 30 10:58:24 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:58:24 tgpro1 kernel: [<0052f147>] _raw_spin_lock+0x24/0x33
Nov 30 10:58:24 tgpro1 kernel: [<006ae0bc>] ceph_add_cap+0x302/0x551 [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006a1bfd>] fill_inode+0x5ef/0x72e [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006a2d6b>] ceph_fill_trace+0x663/0x6c5 [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006b6c6e>] dispatch+0xafe/0x10e7 [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<0067644d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:58:24 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:58:24 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:58:24 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:58:24 tgpro1 kernel: [<00530b06>] kernel_thread_helper+0x6/0xd
Nov 30 10:58:24 tgpro1 kernel:
Nov 30 10:58:24 tgpro1 kernel: -> #0 (&sb->s_type->i_lock_key#16){+.+...}:
Nov 30 10:58:24 tgpro1 kernel: [<0008fa3e>] __lock_acquire+0xe79/0x1425
Nov 30 10:58:24 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:58:24 tgpro1 kernel: [<0052f147>] _raw_spin_lock+0x24/0x33
Nov 30 10:58:24 tgpro1 kernel: [<000f7166>] igrab+0x11/0x41
Nov 30 10:58:24 tgpro1 kernel: [<006b2e14>] iterate_session_caps+0x70/0x17d [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006b5bca>] send_mds_reconnect+0x319/0x46a [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006b6035>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006a000d>] extra_mon_dispatch+0x18/0x1f [ceph]
Nov 30 10:58:24 tgpro1 kernel: [<006783dd>] dispatch+0x4f7/0x52a [libceph]
Nov 30 10:58:24 tgpro1 kernel: [<0067644d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:58:24 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:58:24 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:58:25 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:58:25 tgpro1 kernel: [<00530b06>] kernel_thread_helper+0x6/0xd
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: other info that might help us debug this:
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: Possible unsafe locking scenario:
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: CPU0 CPU1
Nov 30 10:58:25 tgpro1 kernel: ---- ----
Nov 30 10:58:25 tgpro1 kernel: lock(&(&s->s_cap_lock)->rlock);
Nov 30 10:58:25 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 10:58:25 tgpro1 kernel: lock(&(&s->s_cap_lock)->rlock);
Nov 30 10:58:25 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: *** DEADLOCK ***
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: 5 locks held by kworker/0:0/4:
Nov 30 10:58:25 tgpro1 kernel: #0: (ceph-msgr){.+.+.+}, at: [<0007a1ff>] process_one_work+0x1d7/0x391
Nov 30 10:58:25 tgpro1 kernel: #1: ((&(&con->work)->work)){+.+.+.}, at: [<0007a1ff>] process_one_work+0x1d7/0x391
Nov 30 10:58:25 tgpro1 kernel: #2: (&s->s_mutex){+.+.+.}, at: [<006b5954>] send_mds_reconnect+0xa3/0x46a [ceph]
Nov 30 10:58:25 tgpro1 kernel: #3: (&mdsc->snap_rwsem){+++++.}, at: [<006b5a4c>] send_mds_reconnect+0x19b/0x46a [ceph]
Nov 30 10:58:25 tgpro1 kernel: #4: (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006b2de4>] iterate_session_caps+0x40/0x17d [ceph]
Nov 30 10:58:25 tgpro1 kernel:
Nov 30 10:58:25 tgpro1 kernel: stack backtrace:
Nov 30 10:58:25 tgpro1 kernel: Pid: 4, comm: kworker/0:0 Tainted: G W 3.1.4-rsbac #1
Nov 30 10:58:25 tgpro1 kernel: Call Trace:
Nov 30 10:58:25 tgpro1 kernel: [<0008e776>] print_circular_bug+0x21a/0x227
Nov 30 10:58:25 tgpro1 kernel: [<0008fa3e>] __lock_acquire+0xe79/0x1425
Nov 30 10:58:25 tgpro1 kernel: [<0009038d>] lock_acquire+0x42/0x59
Nov 30 10:58:25 tgpro1 kernel: [<000f7166>] ? igrab+0x11/0x41
Nov 30 10:58:25 tgpro1 kernel: [<0052f147>] _raw_spin_lock+0x24/0x33
Nov 30 10:58:25 tgpro1 kernel: [<000f7166>] ? igrab+0x11/0x41
Nov 30 10:58:25 tgpro1 kernel: [<000f7166>] igrab+0x11/0x41
Nov 30 10:58:25 tgpro1 kernel: [<006b2e14>] iterate_session_caps+0x70/0x17d [ceph]
Nov 30 10:58:25 tgpro1 kernel: [<006b7488>] ? ceph_mdsc_submit_request+0x55/0x55 [ceph]
Nov 30 10:58:25 tgpro1 kernel: [<00676d03>] ? ceph_pagelist_append+0xbc/0xf9 [libceph]
Nov 30 10:58:25 tgpro1 kernel: [<006b5bca>] send_mds_reconnect+0x319/0x46a [ceph]
Nov 30 10:58:25 tgpro1 kernel: [<0008daa1>] ? trace_hardirqs_on_caller+0x10b/0x13c
Nov 30 10:58:25 tgpro1 kernel: [<0008dadd>] ? trace_hardirqs_on+0xb/0xd
Nov 30 10:58:25 tgpro1 kernel: [<006b6035>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Nov 30 10:58:25 tgpro1 kernel: [<000d7cd3>] ? __kmalloc+0x10d/0x137
Nov 30 10:58:25 tgpro1 kernel: [<00c40015>] ? 0xc40014
Nov 30 10:58:25 tgpro1 kernel: [<006a000d>] extra_mon_dispatch+0x18/0x1f [ceph]
Nov 30 10:58:25 tgpro1 kernel: [<006783dd>] dispatch+0x4f7/0x52a [libceph]
Nov 30 10:58:25 tgpro1 kernel: [<0008dadd>] ? trace_hardirqs_on+0xb/0xd
Nov 30 10:58:25 tgpro1 kernel: [<0067644d>] con_work+0x14f7/0x16c6 [libceph]
Nov 30 10:58:25 tgpro1 kernel: [<0007a24b>] process_one_work+0x223/0x391
Nov 30 10:58:25 tgpro1 kernel: [<0007a1ff>] ? process_one_work+0x1d7/0x391
Nov 30 10:58:25 tgpro1 kernel: [<00674f56>] ? ceph_fault+0x262/0x262 [libceph]
Nov 30 10:58:25 tgpro1 kernel: [<0007a747>] worker_thread+0x182/0x2d7
Nov 30 10:58:25 tgpro1 kernel: [<00002a00>] ? __show_regs+0x6f/0x6f
Nov 30 10:58:25 tgpro1 kernel: [<00002a00>] ? __show_regs+0x6f/0x6f
Nov 30 10:58:25 tgpro1 kernel: [<0007a5c5>] ? rescuer_thread+0x20c/0x20c
Nov 30 10:58:25 tgpro1 kernel: [<0007d767>] kthread+0x62/0x67
Nov 30 10:58:25 tgpro1 kernel: [<0007d705>] ? __init_kthread_worker+0x42/0x42
Nov 30 10:58:25 tgpro1 kernel: [<00530b06>] kernel_thread_helper+0x6/0xd
Nov 30 10:58:26 tgpro1 kernel: ceph: mds0 reconnect success
Nov 30 10:58:30 tgpro1 kernel: ceph: mds0 recovery completed
[-- Attachment #4: circ3.log --]
[-- Type: text/x-log, Size: 6696 bytes --]
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: =======================================================
Nov 30 13:02:05 tgpro1 kernel: [ INFO: possible circular locking dependency detected ]
Nov 30 13:02:05 tgpro1 kernel: 3.1.4-rsbac #1
Nov 30 13:02:05 tgpro1 kernel: -------------------------------------------------------
Nov 30 13:02:05 tgpro1 kernel: kworker/0:1/22099 is trying to acquire lock:
Nov 30 13:02:05 tgpro1 kernel: (&sb->s_type->i_lock_key#16){+.+...}, at: [<000ff721>] igrab+0x11/0x41
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: but task is already holding lock:
Nov 30 13:02:05 tgpro1 kernel: (&(&realm->inodes_with_caps_lock)->rlock){+.+...}, at: [<006ce9b4>] ceph_update_snap_trace+0x258/0x3bc [ceph]
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: which lock already depends on the new lock.
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: the existing dependency chain (in reverse order) is:
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: -> #1 (&(&realm->inodes_with_caps_lock)->rlock){+.+...}:
Nov 30 13:02:05 tgpro1 kernel: [<00091456>] lock_acquire+0x42/0x59
Nov 30 13:02:05 tgpro1 kernel: [<0053f187>] _raw_spin_lock+0x24/0x33
Nov 30 13:02:05 tgpro1 kernel: [<006cc0a3>] ceph_add_cap+0x37d/0x554 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006bfbf7>] fill_inode+0x5ee/0x72d [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006c0d61>] ceph_fill_trace+0x663/0x6c5 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006d4bd4>] dispatch+0xaff/0x10e9 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<00694459>] con_work+0x14f4/0x16c3 [libceph]
Nov 30 13:02:05 tgpro1 kernel: [<0007b331>] process_one_work+0x229/0x397
Nov 30 13:02:05 tgpro1 kernel: [<0007b80e>] worker_thread+0x182/0x2d8
Nov 30 13:02:05 tgpro1 kernel: [<0007e843>] kthread+0x62/0x67
Nov 30 13:02:05 tgpro1 kernel: [<00540c86>] kernel_thread_helper+0x6/0xd
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: -> #0 (&sb->s_type->i_lock_key#16){+.+...}:
Nov 30 13:02:05 tgpro1 kernel: [<00090b06>] __lock_acquire+0xe7b/0x1428
Nov 30 13:02:05 tgpro1 kernel: [<00091456>] lock_acquire+0x42/0x59
Nov 30 13:02:05 tgpro1 kernel: [<0053f187>] _raw_spin_lock+0x24/0x33
Nov 30 13:02:05 tgpro1 kernel: [<000ff721>] igrab+0x11/0x41
Nov 30 13:02:05 tgpro1 kernel: [<006ce9d3>] ceph_update_snap_trace+0x277/0x3bc [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006cee4e>] ceph_handle_snap+0x336/0x458 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006d4e7d>] dispatch+0xda8/0x10e9 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<00694459>] con_work+0x14f4/0x16c3 [libceph]
Nov 30 13:02:05 tgpro1 kernel: [<0007b331>] process_one_work+0x229/0x397
Nov 30 13:02:05 tgpro1 kernel: [<0007b80e>] worker_thread+0x182/0x2d8
Nov 30 13:02:05 tgpro1 kernel: [<0007e843>] kthread+0x62/0x67
Nov 30 13:02:05 tgpro1 kernel: [<00540c86>] kernel_thread_helper+0x6/0xd
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: other info that might help us debug this:
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: Possible unsafe locking scenario:
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: CPU0 CPU1
Nov 30 13:02:05 tgpro1 kernel: ---- ----
Nov 30 13:02:05 tgpro1 kernel: lock(&(&realm->inodes_with_caps_lock)->rlock);
Nov 30 13:02:05 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 13:02:05 tgpro1 kernel: lock(&(&realm->inodes_with_caps_lock)->rlock);
Nov 30 13:02:05 tgpro1 kernel: lock(&sb->s_type->i_lock_key);
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: *** DEADLOCK ***
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: 4 locks held by kworker/0:1/22099:
Nov 30 13:02:05 tgpro1 kernel: #0: (ceph-msgr){++++.+}, at: [<0007b2e5>] process_one_work+0x1dd/0x397
Nov 30 13:02:05 tgpro1 kernel: #1: ((&(&con->work)->work)){+.+.+.}, at: [<0007b2e5>] process_one_work+0x1dd/0x397
Nov 30 13:02:05 tgpro1 kernel: #2: (&mdsc->snap_rwsem){+++++.}, at: [<006cebe5>] ceph_handle_snap+0xcd/0x458 [ceph]
Nov 30 13:02:05 tgpro1 kernel: #3: (&(&realm->inodes_with_caps_lock)->rlock){+.+...}, at: [<006ce9b4>] ceph_update_snap_trace+0x258/0x3bc [ceph]
Nov 30 13:02:05 tgpro1 kernel:
Nov 30 13:02:05 tgpro1 kernel: stack backtrace:
Nov 30 13:02:05 tgpro1 kernel: Pid: 22099, comm: kworker/0:1 Tainted: G W 3.1.4-rsbac #1
Nov 30 13:02:05 tgpro1 kernel: Call Trace:
Nov 30 13:02:05 tgpro1 kernel: [<0008f840>] print_circular_bug+0x219/0x226
Nov 30 13:02:05 tgpro1 kernel: [<00090b06>] __lock_acquire+0xe7b/0x1428
Nov 30 13:02:05 tgpro1 kernel: [<00090b54>] ? __lock_acquire+0xec9/0x1428
Nov 30 13:02:05 tgpro1 kernel: [<00091456>] lock_acquire+0x42/0x59
Nov 30 13:02:05 tgpro1 kernel: [<000ff721>] ? igrab+0x11/0x41
Nov 30 13:02:05 tgpro1 kernel: [<0053f187>] _raw_spin_lock+0x24/0x33
Nov 30 13:02:05 tgpro1 kernel: [<000ff721>] ? igrab+0x11/0x41
Nov 30 13:02:05 tgpro1 kernel: [<000ff721>] igrab+0x11/0x41
Nov 30 13:02:05 tgpro1 kernel: [<006ce9d3>] ceph_update_snap_trace+0x277/0x3bc [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<006cee4e>] ceph_handle_snap+0x336/0x458 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<007f0312>] ? 0x7f0311
Nov 30 13:02:05 tgpro1 kernel: [<006d4e7d>] dispatch+0xda8/0x10e9 [ceph]
Nov 30 13:02:05 tgpro1 kernel: [<0048738c>] ? kernel_recvmsg+0x2a/0x34
Nov 30 13:02:05 tgpro1 kernel: [<0053dbb6>] ? __mutex_unlock_slowpath+0xdf/0xf8
Nov 30 13:02:05 tgpro1 kernel: [<0008eb6f>] ? trace_hardirqs_on_caller+0x10b/0x13c
Nov 30 13:02:05 tgpro1 kernel: [<0008ebab>] ? trace_hardirqs_on+0xb/0xd
Nov 30 13:02:05 tgpro1 kernel: [<00694459>] con_work+0x14f4/0x16c3 [libceph]
Nov 30 13:02:05 tgpro1 kernel: [<0000784e>] ? text_poke+0x79/0x95
Nov 30 13:02:05 tgpro1 kernel: [<0007b331>] process_one_work+0x229/0x397
Nov 30 13:02:05 tgpro1 kernel: [<0007b2e5>] ? process_one_work+0x1dd/0x397
Nov 30 13:02:05 tgpro1 kernel: [<00692f65>] ? ceph_fault+0x262/0x262 [libceph]
Nov 30 13:02:05 tgpro1 kernel: [<0007b80e>] worker_thread+0x182/0x2d8
Nov 30 13:02:05 tgpro1 kernel: [<00002b40>] ? setup_sigcontext+0x24/0x24
Nov 30 13:02:05 tgpro1 kernel: [<00002b40>] ? setup_sigcontext+0x24/0x24
Nov 30 13:02:05 tgpro1 kernel: [<0007b68c>] ? rescuer_thread+0x1ed/0x1ed
Nov 30 13:02:05 tgpro1 kernel: [<0007e843>] kthread+0x62/0x67
Nov 30 13:02:05 tgpro1 kernel: [<0007e7e1>] ? __init_kthread_worker+0x42/0x42
Nov 30 13:02:05 tgpro1 kernel: [<00540c86>] kernel_thread_helper+0x6/0xd
next reply other threads:[~2011-11-30 12:22 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-30 12:21 Amon Ott [this message]
2011-11-30 17:20 ` Circular lock / deadlock in kernel client Sage Weil
2011-11-30 17:55 ` Sage Weil
2011-12-01 8:16 ` Amon Ott
2011-12-01 16:12 ` Sage Weil
2011-12-01 16:24 ` Amon Ott
2011-12-06 13:21 ` Amon Ott
2011-12-07 15:20 ` Amon Ott
2011-12-07 16:33 ` Sage Weil
2011-12-08 9:58 ` Amon Ott
2011-12-08 13:39 ` Amon Ott
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201111301321.47514.a.ott@m-privacy.de \
--to=a.ott@m-privacy.de \
--cc=ceph-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.