From: Jan Kara <jack@suse.cz>
To: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: Jan Kara <jack@suse.cz>, linux-ext4@vger.kernel.org
Subject: Re: [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269
Date: Wed, 13 Mar 2013 12:04:26 +0100
Message-ID: <20130313110426.GD29730@quack.suse.cz>
In-Reply-To: <20130313105233.GB12012@gmail.com>

On Wed 13-03-13 18:52:33, Zheng Liu wrote:
> On Wed, Mar 13, 2013 at 10:15:11AM +0100, Jan Kara wrote:
> [snip]
> > > > I post the sysrq-w output here.  But IMHO it is not very useful.  So I
> > > > also post the sysrq-t output.
> > >   Heh, curious. Thanks for the data. So worker thinks there's nothing to do
> > > but inode has elevated i_ioend_count... Maybe we leaked ioend somewhere.
> > > I'll check the code when I have time.
> >   Ah, I think I see what's going on.
> > a) Code in ext4_ext_direct_IO() is racy wrt iocb->private handling (that
> >    can get cleared concurrently from ext4_end_io_dio()).
> 
> Thanks for tracing this problem.  But I am still confused: iocb is
> allocated on the stack in do_sync_write(), and from slab in
> ioctx_alloc().  Do you mean that the iocb in ext4_ext_direct_IO() and
> ext4_end_io_dio() is the same one?
  Yes, it is.

> Then this iocb could be changed concurrently, and we are blocked for more
> than 120s.  I must be missing something.
  Well, the hang results from the direct IO code forgetting to call
ext4_free_io_end() in some (likely error recovery) path. So
inode->i_ioend_count remains elevated and we never finish waiting in
ext4_evict_inode(). How that forgotten ext4_free_io_end() really happens
isn't 100% clear to me, but I strongly suspect something goes wrong with
concurrent iocb modification...
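
  To make the failure mode above concrete, here is a minimal userspace
sketch of the suspected leak. It is not the ext4 code: every model_* name,
the struct layouts and the leak_on_error switch are hypothetical stand-ins
invented for illustration. It only shows how one error path that skips the
"free" step leaves the per-inode counter elevated, so that a waiter like
the one in ext4_evict_inode() would never see it reach zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_inode {
	int ioend_count;		/* models inode->i_ioend_count */
};

struct model_io_end {
	struct model_inode *inode;
	bool in_use;
};

/* Models setting up an io_end: takes a reference on the inode's counter. */
static void model_init_io_end(struct model_io_end *io,
			      struct model_inode *inode)
{
	io->inode = inode;
	io->in_use = true;
	inode->ioend_count++;
}

/* Models ext4_free_io_end(): drops the reference taken at setup. */
static void model_free_io_end(struct model_io_end *io)
{
	assert(io->in_use && io->inode->ioend_count > 0);
	io->inode->ioend_count--;
	io->in_use = false;
}

/*
 * Models one direct IO write. When 'fail' is set the write takes the
 * error path; 'leak_on_error' (an assumption of this sketch) selects a
 * buggy error path that returns without freeing the io_end.
 */
static int model_dio_write(struct model_inode *inode, bool fail,
			   bool leak_on_error)
{
	struct model_io_end io;

	model_init_io_end(&io, inode);
	if (fail) {
		if (!leak_on_error)
			model_free_io_end(&io);
		return -1;		/* buggy variant leaks the count */
	}
	model_free_io_end(&io);
	return 0;
}

/* Models the wait at eviction time: it cannot finish while count != 0. */
static bool model_evict_would_block(const struct model_inode *inode)
{
	return inode->ioend_count != 0;
}
```

  Calling model_dio_write() with fail and leak_on_error both set leaves
ioend_count at 1, and model_evict_would_block() then stays true forever,
which matches the "blocked for more than 120s" symptom. The sketch is
single-threaded on purpose; it does not model the iocb->private race
itself, only its suspected consequence.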

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

Thread overview: 4+ messages
2013-03-07 12:40 [BUG][dioread_nolock] blocked for more than 120s when we run xfstests #269 Zheng Liu
2013-03-07 15:11 ` Jan Kara
     [not found]   ` <20130308135222.GA2768@gmail.com>
     [not found]     ` <20130311163041.GL29799@quack.suse.cz>
     [not found]       ` <20130313091511.GB29730@quack.suse.cz>
2013-03-13 10:52         ` Zheng Liu
2013-03-13 11:04           ` Jan Kara [this message]
