From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: Mingming Cao <cmm@us.ibm.com>,
ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Delayed allocation and journal locking order inversion.
Date: Thu, 29 May 2008 13:20:56 +0530 [thread overview]
Message-ID: <20080529075056.GA24919@skywalker> (raw)
In-Reply-To: <20080528100833.GC8289@duck.suse.cz>
On Wed, May 28, 2008 at 12:08:33PM +0200, Jan Kara wrote:
> Hi Aneesh,
> The question here is, who is holding the lock from the page we wait
> for here. The two processes you show below don't seem to hold it. I'll
> check the full log ... searching ... I see!
> The problem is in generic_write_end()! It calls mark_inode_dirty() under
> page lock. That can possibly start a new transaction (which happened in
> your case) and that violates lock ordering (mark_inode_dirty() got stuck
> waiting for journal commit which is stuck waiting for other user to do
> journal_stop which waits for the page lock). Actually, there is no real
> need to call mark_inode_dirty() from under page lock - we just need to
> update i_size there. Something like the patch attached (untested)?
>
The patch works. Peter Zijlstra have patches to add lockdep annotation
to lock_page. I guess we will have to test the lock inversion patch with
the lockdep annotation to catch the deadlock scenarios like above.
http://programming.kicks-ass.net/kernel-patches/lockdep-page_lock/
Regarding delalloc I still have issues. The writepage can get called
with buffer_head marked delay and dirty as show below. This will result
in block allocation under lock_page.
RIP: 0010:[<ffffffff8030cc64>] [<ffffffff8030cc64>] ext4_da_writepage+0x26/0xad
Call Trace:
[<ffffffff8026999f>] shrink_page_list+0x31e/0x588
[<ffffffff80269d35>] shrink_inactive_list+0x12c/0x40d
[<ffffffff805ce6a2>] ? _spin_unlock_irqrestore+0x3f/0x68
[<ffffffff8024e83a>] ? trace_hardirqs_on+0xf1/0x115
[<ffffffff805ce6af>] ? _spin_unlock_irqrestore+0x4c/0x68
[<ffffffff803975b7>] ? __up_read+0x8c/0x94
[<ffffffff8026a0f3>] shrink_zone+0xdd/0x103
[<ffffffff8026ad69>] kswapd+0x34b/0x53e
[<ffffffff80268e4d>] ? isolate_pages_global+0x0/0x34
[<ffffffff80243ff4>] ? autoremove_wake_function+0x0/0x36
[<ffffffff805ce6af>] ? _spin_unlock_irqrestore+0x4c/0x68
[<ffffffff8026aa1e>] ? kswapd+0x0/0x53e
[<ffffffff80243d1b>] kthread+0x44/0x6b
[<ffffffff8020c1f8>] child_rip+0xa/0x12
[<ffffffff8020b8e3>] ? restore_args+0x0/0x30
[<ffffffff80243fcf>] ? kthreadd+0x16b/0x190
[<ffffffff80243cd7>] ? kthread+0x0/0x6b
[<ffffffff8020c1ee>] ? child_rip+0x0/0x12
-aneesh
next prev parent reply other threads:[~2008-05-29 7:51 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-28 9:16 Delayed allocation and journal locking order inversion Aneesh Kumar K.V
2008-05-28 10:08 ` Jan Kara
2008-05-29 7:50 ` Aneesh Kumar K.V [this message]
2008-05-29 12:27 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080529075056.GA24919@skywalker \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=cmm@us.ibm.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.