From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sunil Mushran
Date: Thu, 13 May 2010 15:25:45 -0700
Subject: [Ocfs2-devel] Deadlock in DLM code still there
In-Reply-To: <20100513194320.GA28367@quack.suse.cz>
References: <20100513194320.GA28367@quack.suse.cz>
Message-ID: <4BEC7C69.4030208@oracle.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ocfs2-devel@oss.oracle.com

Yes. This is a tricky problem. I'll work on it as soon as I have
completed my current task.

On 05/13/2010 12:43 PM, Jan Kara wrote:
> Hi,
>
>   in http://www.mail-archive.com/ocfs2-devel@oss.oracle.com/msg03188.html
> (more than a year ago) I reported a lock inversion between dlm->ast_lock
> and res->spinlock. The deadlock seems to be still there in 2.6.34-rc7:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.34-rc7-xen #4
> -------------------------------------------------------
> dlm_thread/2001 is trying to acquire lock:
>  (&(&dlm->ast_lock)->rlock){+.+...}, at: [] dlm_queue_bast+0x55/0x1e0 [ocfs2_dlm]
>
> but task is already holding lock:
>  (&(&res->spinlock)->rlock){+.+...}, at: [] dlm_thread+0x7cd/0x17f0 [ocfs2_dlm]
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&(&res->spinlock)->rlock){+.+...}:
>        [] __lock_acquire+0x109f/0x1720
>        [] lock_acquire+0x69/0x90
>        [] _raw_spin_lock+0x2c/0x40
>        [] _atomic_dec_and_lock+0x78/0xa0
>        [] dlm_lockres_release_ast+0x29/0xb0 [ocfs2_dlm]
>        [] dlm_thread+0x10e1/0x17f0 [ocfs2_dlm]
>        [] kthread+0x8e/0xa0
>        [] kernel_thread_helper+0x4/0x10
>
> -> #0 (&(&dlm->ast_lock)->rlock){+.+...}:
>        [] __lock_acquire+0x14f8/0x1720
>        [] lock_acquire+0x69/0x90
>        [] _raw_spin_lock+0x2c/0x40
>        [] dlm_queue_bast+0x55/0x1e0 [ocfs2_dlm]
>        [] dlm_thread+0xbef/0x17f0 [ocfs2_dlm]
>        [] kthread+0x8e/0xa0
>        [] kernel_thread_helper+0x4/0x10
>
> other info that might help us debug this:
>
> 1 lock held by dlm_thread/2001:
>  #0:  (&(&res->spinlock)->rlock){+.+...}, at: [] dlm_thread+0x7cd/0x17f0 [ocfs2_dlm]
>
> stack backtrace:
> Pid: 2001, comm: dlm_thread Not tainted 2.6.34-rc7-xen #4
> Call Trace:
>  [] print_circular_bug+0xf0/0x100
>  [] __lock_acquire+0x14f8/0x1720
>  [] ? xen_force_evtchn_callback+0xd/0x10
>  [] lock_acquire+0x69/0x90
>  [] ? dlm_queue_bast+0x55/0x1e0 [ocfs2_dlm]
>  [] _raw_spin_lock+0x2c/0x40
>  [] ? dlm_queue_bast+0x55/0x1e0 [ocfs2_dlm]
>  [] dlm_queue_bast+0x55/0x1e0 [ocfs2_dlm]
>  [] dlm_thread+0xbef/0x17f0 [ocfs2_dlm]
>  [] ? trace_hardirqs_off+0xd/0x10
>  [] ? trace_hardirqs_on+0xd/0x10
>  [] ? _raw_spin_unlock_irq+0x32/0x40
>  [] ? autoremove_wake_function+0x0/0x40
>  [] ? dlm_thread+0x0/0x17f0 [ocfs2_dlm]
>  [] kthread+0x8e/0xa0
>  [] kernel_thread_helper+0x4/0x10
>  [] ? restore_args+0x0/0x30
>  [] ? kernel_thread_helper+0x0/0x10
>
>   I'm now hitting this problem regularly, so it keeps me from verifying
> whether there are other possible deadlocks in the ocfs2 quota code...
>
> 								Honza
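
The two lockdep chains above reduce to a plain AB-BA inversion: chain #1
takes res->spinlock (via _atomic_dec_and_lock inside
dlm_lockres_release_ast) while dlm->ast_lock is already held, and chain #0
takes dlm->ast_lock inside dlm_queue_bast while res->spinlock is held. As a
minimal user-space sketch of the same pattern (mutexes stand in for the two
spinlocks, the lock names mirror the kernel symbols, and the thread bodies
are illustrative rather than the actual ocfs2_dlm code):

/*
 * Illustrative reduction of the inversion reported above. The two
 * functions reproduce the lock ordering of lockdep chains #1 and #0;
 * everything apart from the lock names is hypothetical.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ast_lock = PTHREAD_MUTEX_INITIALIZER;     /* dlm->ast_lock */
static pthread_mutex_t res_spinlock = PTHREAD_MUTEX_INITIALIZER; /* res->spinlock */

/* Chain #1: res->spinlock is taken while dlm->ast_lock is held. */
static void *release_ast_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&ast_lock);
	pthread_mutex_lock(&res_spinlock);	/* order: ast_lock -> res->spinlock */
	pthread_mutex_unlock(&res_spinlock);
	pthread_mutex_unlock(&ast_lock);
	return NULL;
}

/* Chain #0: dlm->ast_lock is taken while res->spinlock is held,
 * i.e. the reverse order, which is what lockdep complains about. */
static void *queue_bast_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&res_spinlock);
	pthread_mutex_lock(&ast_lock);		/* order: res->spinlock -> ast_lock */
	pthread_mutex_unlock(&ast_lock);
	pthread_mutex_unlock(&res_spinlock);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	/* With unlucky timing each thread grabs its first lock and then
	 * blocks forever on the other's; lockdep flags the conflicting
	 * orders even on runs where the timing happens to be lucky. */
	pthread_create(&t1, NULL, release_ast_path, NULL);
	pthread_create(&t0, NULL, queue_bast_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t0, NULL);
	printf("no deadlock this run\n");
	return 0;
}

One conventional way out of an inversion like this is to settle on a single
order (say, always take dlm->ast_lock before res->spinlock) and rework
whichever path currently acquires the two locks the other way around.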