linux-ext4.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Ted Ts'o <tytso@mit.edu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: What am I doing wrong?  submit_bio() suddenly stops working...
Date: Thu, 21 Oct 2010 19:46:15 +0200
Message-ID: <4CC07C67.9010502@kernel.dk>
In-Reply-To: <20101021165525.GB3127@thunk.org>

On 2010-10-21 18:55, Ted Ts'o wrote:
> On Thu, Oct 21, 2010 at 08:59:46AM +0200, Jens Axboe wrote:
>>
>> I don't see anything immediately wrong with your approach. I suspect
>> we'll need to see sysrq-t traces of the relevant processes to make a
>> more educated guess!
> 
> I've uploaded a trace output that includes the sysrq-t trace, but I
> don't think it shows anything interesting.  We're not hanging on any
> kind of lock as near as I can tell.  It looks like
> __generic_make_request() is calling q->make_request_fn(), and this is
> returning without actually doing anything.
> 
> http://userweb.kernel.org/~tytso/ext4-bio-patches/kvm-console-2
> 
> In this trace, I added a patch to prove that __generic_make_request()
> is calling __make_request (I wasn't sure what q->make_request_fn was
> indirecting to, so I added a brute force lookup to make sure I
> understood what was going on), but at one point, it just starts
> queuing the request, and it enters cfq, but the request never gets
> dispatched out.  Maybe this is a failure of the plugging/unplugging
> mechanisms?
> 
> I guess I can start putting in more brute-force printk's inside
> __make_request and inside the cfq scheduler to try to understand what
> is going on, but I'm really guessing at this point.
> 
> If you have any suggestions about more elegant ways of figuring out
> what is happening, please do let me know....

I will take a look at the traces.

By the sound of things, if I were you I'd turn on the memory and slab
debugging options to catch use-before-init and use-after-free bugs.
Mysterious hangs in the I/O subsystem are usually caused by such bugs.
I'd also enable the regular debugging aids, just to see whether they
turn up anything of interest.

-- 
Jens Axboe

