From: Ted Ts'o <tytso@mit.edu>
To: Tao Ma <tm@tao.ma>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: Check io list state and avoid an unnecessary mutex_lock in ext4_end_io_work.
Date: Sat, 29 Oct 2011 16:57:40 -0400 [thread overview]
Message-ID: <20111029205740.GB16825@thunk.org> (raw)
In-Reply-To: <1315989182-6170-1-git-send-email-tm@tao.ma>
On Wed, Sep 14, 2011 at 04:33:02PM +0800, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
>
> When we finish the end io work in ext4_flush_completed_IO, we take
> the io work away from the list, but don't free it. Then in the workqueue,
> we can check the list state and then avoid the extra work if it is also
> done. It is good, but we check list state in ext4_end_io_nolock with i_mutex held
> instead of the spin_lock in other places. This is wrong.
Hi Tao,
Thanks for finding and pointing out this bug in ext4_end_io_nolock().
Unfortunately your proposed fix doesn't take into account that there
are other callers of ext4_end_io_nolock() besides ext4_end_io_work(),
and so it's not sufficient to move the test from former function to
the latter.
Attached please find the patch which I am planning to check into the
ext4 tree to address this bug which you have pointed out.
Regards,
- Ted
commit c0e36d8410bfad4db4edefeb4175f85a5d216c8d
Author: Theodore Ts'o <tytso@mit.edu>
Date: Sat Oct 29 16:57:19 2011 -0400
ext4: use spinlock before checking io->list in ext4_io_end_nolock()
In ext4_end_io_nolock(), io->list is checked to see if it is empty
without taking the ei->i_completed_io_lock spinlock. This violates
the locking protocol, and can cause very hard to debug failures.
Also optimize ext4_end_io_work() so that if ext4_end_io_nolock() is
not going to do any work, don't try getting the i_mutex and possibly
requeuing the end_io request if the trylock doesn't succeed in grabbing
the mutex.
Thanks to Tao Ma <boyu.mt@taobao.com> for spotting the error and
providing an initial fix to address the problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 92f38ee..0af5607 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -90,21 +90,27 @@ void ext4_free_io_end(ext4_io_end_t *io)
*/
int ext4_end_io_nolock(ext4_io_end_t *io)
{
- struct inode *inode = io->inode;
- loff_t offset = io->offset;
- ssize_t size = io->size;
- wait_queue_head_t *wq;
- int ret = 0;
+ unsigned long flags;
+ struct inode *inode = io->inode;
+ loff_t offset = io->offset;
+ ssize_t size = io->size;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ wait_queue_head_t *wq;
+ int ret;
ext4_debug("ext4_end_io_nolock: io 0x%p from inode %lu,list->next 0x%p,"
"list->prev 0x%p\n",
io, inode->i_ino, io->list.next, io->list.prev);
- if (list_empty(&io->list))
- return ret;
+ spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+ if (list_empty(&io->list)) {
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+ return 0;
+ }
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
if (!(io->flag & EXT4_IO_END_UNWRITTEN))
- return ret;
+ return 0;
ret = ext4_convert_unwritten_extents(inode, offset, size);
if (ret < 0) {
@@ -142,6 +148,16 @@ static void ext4_end_io_work(struct work_struct *work)
unsigned long flags;
int ret;
+ if (!(io->flag & EXT4_IO_END_UNWRITTEN))
+ goto free_io_end;
+
+ spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+ if (list_empty(&io->list)) {
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+ goto free_io_end;
+ }
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+
if (!mutex_trylock(&inode->i_mutex)) {
/*
* Requeue the work instead of waiting so that the work
@@ -170,6 +186,7 @@ static void ext4_end_io_work(struct work_struct *work)
list_del_init(&io->list);
spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
mutex_unlock(&inode->i_mutex);
+free_io_end:
ext4_free_io_end(io);
}
next prev parent reply other threads:[~2011-10-29 20:57 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-14 8:33 [PATCH] ext4: Check io list state and avoid an unnecessary mutex_lock in ext4_end_io_work Tao Ma
2011-10-29 20:57 ` Ted Ts'o [this message]
2011-10-30 7:50 ` Tao Ma
2011-10-31 15:02 ` Ted Ts'o
2011-10-31 15:02 ` [PATCH 1/3] ext4: Use correct locking for ext4_end_io_nolock() Theodore Ts'o
2011-10-31 15:02 ` [PATCH 2/3] ext4: remove unnecessary call to waitqueue_active() Theodore Ts'o
2011-10-31 15:02 ` [PATCH 3/3] ext4: optimize locking for end_io extent conversion Theodore Ts'o
2011-11-01 2:51 ` Tao Ma
2011-11-01 10:30 ` Theodore Tso
2011-10-31 15:28 ` [PATCH 1/3] ext4: Use correct locking for ext4_end_io_nolock() Tao Ma
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111029205740.GB16825@thunk.org \
--to=tytso@mit.edu \
--cc=linux-ext4@vger.kernel.org \
--cc=tm@tao.ma \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.