From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kara
Subject: Re: Deadlocks due to per-process plugging
Date: Fri, 13 Jul 2012 16:46:22 +0200
Message-ID: <20120713144622.GB28715@quack.suse.cz>
References: <20120711133735.GA8122@quack.suse.cz> <20120711201601.GB9779@quack.suse.cz> <20120713123318.GB20361@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jan Kara, Jeff Moyer, LKML, linux-fsdevel@vger.kernel.org, Tejun Heo, Jens Axboe, mgalbraith@suse.com
To: Thomas Gleixner
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Fri 13-07-12 16:25:05, Thomas Gleixner wrote:
> On Fri, 13 Jul 2012, Jan Kara wrote:
> > On Thu 12-07-12 16:15:29, Thomas Gleixner wrote:
> > > >   Ah, I didn't know this. Thanks for the hint. So in the kdump I have I can
> > > > see requests queued in tsk->plug despite the process is sleeping in
> > > > TASK_UNINTERRUPTIBLE state. So the only way how unplug could have been
> > > > omitted is if tsk_is_pi_blocked() was true. Rummaging through the dump...
> > > > indeed task has pi_blocked_on = 0xffff8802717d79c8. The dump is from an -rt
> > > > kernel (I just didn't originally thought that makes any difference) so
> > > > actually any mutex is rtmutex and thus tsk_is_pi_blocked() is true whenever
> > > > we are sleeping on a mutex. So this seems like a bug in rtmutex code.
> > >
> > > Well, the reason why this check is there is that the task which is
> > > blocked on a lock can hold another lock which might cause a deadlock
> > > in the flush path.
> >   OK. Let me understand the details. Block layer needs just queue_lock for
> > unplug to succeed. That is a spinlock but in RT kernel, even a process
> > holding a spinlock can be preempted if I remember correctly. So that
> > condition is there effectively to not unplug when a task is being scheduled
> > away while holding queue_lock? Did I get it right?
>
> blk_flush_plug_list() is not only queue_lock. There can be other locks
> taken in the callbacks, elevator ...
  Yeah, right.

> > > > Thomas, you seemed to have added that condition... Any idea how to avoid
> > > > the deadlock?
> > >
> > > Good question. We could do the flush when the blocked task does not
> > > hold a lock itself. Might be worth a try.
> >   Yeah, that should work for avoiding the deadlock as well.
>
> Though we don't have a lock held count except when lockdep is enabled,
> which you probably don't want to do when running a production system.
  Agreed :).

> But we only care about stuff being scheduled out while blocked on a
> "sleeping spinlock" - i.e. spinlock, rwlock.
>
> So the patch below should allow the unplug to take place when blocked
> on mutexes etc.
  Thanks for the patch! Mike will give it some testing.

								Honza
-- 
Jan Kara
SUSE Labs, CR