* [PATCH 0/4] Get rid of iput() from flusher thread @ 2012-03-09 9:02 Jan Kara 2012-03-09 9:02 ` [PATCH 1/4] fs: Remove bogus wait in write_inode_now() Jan Kara ` (4 more replies) 0 siblings, 5 replies; 15+ messages in thread From: Jan Kara @ 2012-03-09 9:02 UTC (permalink / raw) To: Wu Fengguang; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton Hi, this patch set changes writeback_sb_inodes() to avoid iput() which might be problematic (see patch 4 which tries to summarize our email discussions) for some filesystems. Patches 1 and 2 are trivial mostly unrelated fixes (Fengguang, can you can take these and merge them right away please?). Patch 3 is a preparatory code reshuffle and patch 4 removes the __iget() / iput() from flusher thread. As a side note, your patches to offload writeback from kswapd to flusher thread then won't need iget/iput either if we pass page references as we talked so that should resolve most of the concerns. What do you think guys? Honza -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/4] fs: Remove bogus wait in write_inode_now() 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara @ 2012-03-09 9:02 ` Jan Kara 2012-03-19 6:57 ` Christoph Hellwig 2012-03-09 9:02 ` [PATCH 2/4] writeback: Remove outdated comment Jan Kara ` (3 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Jan Kara @ 2012-03-09 9:02 UTC (permalink / raw) To: Wu Fengguang; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton, Jan Kara inode_sync_wait() in write_inode_now() is just bogus. That function waits for I_SYNC bit to be cleared but writeback_single_inode() clears the bit on return so the wait is effectivelly a nop unless someone else submits the inode for writeback again. All the waiting write_inode_now() needs is achieved by using WB_SYNC_ALL writeback mode. Signed-off-by: Jan Kara <jack@suse.cz> --- fs/fs-writeback.c | 2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 5b4a936..f60297b 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1364,8 +1364,6 @@ int write_inode_now(struct inode *inode, int sync) ret = writeback_single_inode(inode, wb, &wbc); spin_unlock(&inode->i_lock); spin_unlock(&wb->list_lock); - if (sync) - inode_sync_wait(inode); return ret; } EXPORT_SYMBOL(write_inode_now); -- 1.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] fs: Remove bogus wait in write_inode_now() 2012-03-09 9:02 ` [PATCH 1/4] fs: Remove bogus wait in write_inode_now() Jan Kara @ 2012-03-19 6:57 ` Christoph Hellwig 0 siblings, 0 replies; 15+ messages in thread From: Christoph Hellwig @ 2012-03-19 6:57 UTC (permalink / raw) To: Jan Kara; +Cc: Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:25AM +0100, Jan Kara wrote: > inode_sync_wait() in write_inode_now() is just bogus. That function waits for > I_SYNC bit to be cleared but writeback_single_inode() clears the bit on return > so the wait is effectivelly a nop unless someone else submits the inode for > writeback again. All the waiting write_inode_now() needs is achieved by using > WB_SYNC_ALL writeback mode. > > Signed-off-by: Jan Kara <jack@suse.cz> Loks good - I have the same in my patchkit to kill write_inode_now (which I really need to get out soon). Signed-off-by: Christoph Hellwig <hch@lst.de> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/4] writeback: Remove outdated comment 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara 2012-03-09 9:02 ` [PATCH 1/4] fs: Remove bogus wait in write_inode_now() Jan Kara @ 2012-03-09 9:02 ` Jan Kara 2012-03-19 6:58 ` Christoph Hellwig 2012-03-09 9:02 ` [PATCH 3/4] writeback: Refactor writeback_single_inode() Jan Kara ` (2 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Jan Kara @ 2012-03-09 9:02 UTC (permalink / raw) To: Wu Fengguang; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton, Jan Kara The comment is hopelessly outdated and misplaced. We no longer have 'bdi' part of writeback work, the comment about blockdev super is outdated, comment about throttling as well. Information about list handling is in more detail at queue_io(). So just move the bit about older_than_this to close to move_expired_inodes() and remove the rest. Signed-off-by: Jan Kara <jack@suse.cz> --- fs/fs-writeback.c | 20 ++------------------ 1 files changed, 2 insertions(+), 18 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index f60297b..be84e28 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -256,7 +256,8 @@ static bool inode_dirtied_after(struct inode *inode, unsigned long t) } /* - * Move expired dirty inodes from @delaying_queue to @dispatch_queue. + * Move expired (dirtied after work->older_than_this) dirty inodes from + * @delaying_queue to @dispatch_queue. */ static int move_expired_inodes(struct list_head *delaying_queue, struct list_head *dispatch_queue, @@ -1148,23 +1149,6 @@ out_unlock_inode: } EXPORT_SYMBOL(__mark_inode_dirty); -/* - * Write out a superblock's list of dirty inodes. A wait will be performed - * upon no inodes, all inodes or the final one, depending upon sync_mode. - * - * If older_than_this is non-NULL, then only write out inodes which - * had their first dirtying at a time earlier than *older_than_this. - * - * If `bdi' is non-zero then we're being asked to writeback a specific queue. - * This function assumes that the blockdev superblock's inodes are backed by - * a variety of queues, so all inodes are searched. For other superblocks, - * assume that all inodes are backed by the same queue. - * - * The inodes to be written are parked on bdi->b_io. They are moved back onto - * bdi->b_dirty as they are selected for writing. This way, none can be missed - * on the writer throttling path, and we get decent balancing between many - * throttled threads: we don't want them all piling up on inode_sync_wait. - */ static void wait_sb_inodes(struct super_block *sb) { struct inode *inode, *old_inode = NULL; -- 1.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] writeback: Remove outdated comment 2012-03-09 9:02 ` [PATCH 2/4] writeback: Remove outdated comment Jan Kara @ 2012-03-19 6:58 ` Christoph Hellwig 0 siblings, 0 replies; 15+ messages in thread From: Christoph Hellwig @ 2012-03-19 6:58 UTC (permalink / raw) To: Jan Kara; +Cc: Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:26AM +0100, Jan Kara wrote: > The comment is hopelessly outdated and misplaced. We no longer have 'bdi' > part of writeback work, the comment about blockdev super is outdated, > comment about throttling as well. Information about list handling is in > more detail at queue_io(). So just move the bit about older_than_this to > close to move_expired_inodes() and remove the rest. > > Signed-off-by: Jan Kara <jack@suse.cz> Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/4] writeback: Refactor writeback_single_inode() 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara 2012-03-09 9:02 ` [PATCH 1/4] fs: Remove bogus wait in write_inode_now() Jan Kara 2012-03-09 9:02 ` [PATCH 2/4] writeback: Remove outdated comment Jan Kara @ 2012-03-09 9:02 ` Jan Kara 2012-03-19 5:07 ` Fengguang Wu 2012-03-19 7:13 ` Christoph Hellwig 2012-03-09 9:02 ` [PATCH 4/4] writeback: Avoid iput() from flusher thread Jan Kara 2012-03-19 5:16 ` [PATCH 0/4] Get rid of " Fengguang Wu 4 siblings, 2 replies; 15+ messages in thread From: Jan Kara @ 2012-03-09 9:02 UTC (permalink / raw) To: Wu Fengguang; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton, Jan Kara The code in writeback_single_inode() is relatively complex. The list requeing logic makes sense only for flusher thread but not really for sync_inode() or write_inode_now() callers. Also when we want to get rid of inode references held by flusher thread, we will need a special I_SYNC handling there. So separate part of writeback_single_inode() which does the real writeback work into __writeback_single_inode(). Make writeback_single_inode() do only stuff necessary for callers writing only one inode, and move the special list handling into writeback_sb_inodes() and a helper function inode_wb_requeue(). Signed-off-by: Jan Kara <jack@suse.cz> --- fs/fs-writeback.c | 264 +++++++++++++++++++++----------------- include/trace/events/writeback.h | 36 ++++- 2 files changed, 174 insertions(+), 126 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index be84e28..1e8bf44 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -231,11 +231,7 @@ static void requeue_io(struct inode *inode, struct bdi_writeback *wb) static void inode_sync_complete(struct inode *inode) { - /* - * Prevent speculative execution through - * spin_unlock(&wb->list_lock); - */ - + inode->i_state &= ~I_SYNC; smp_mb(); wake_up_bit(&inode->i_state, __I_SYNC); } @@ -331,8 +327,7 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) /* * Wait for writeback on an inode to complete. */ -static void inode_wait_for_writeback(struct inode *inode, - struct bdi_writeback *wb) +static void inode_wait_for_writeback(struct inode *inode) { DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); wait_queue_head_t *wqh; @@ -340,70 +335,34 @@ static void inode_wait_for_writeback(struct inode *inode, wqh = bit_waitqueue(&inode->i_state, __I_SYNC); while (inode->i_state & I_SYNC) { spin_unlock(&inode->i_lock); - spin_unlock(&wb->list_lock); __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); - spin_lock(&wb->list_lock); spin_lock(&inode->i_lock); } } /* - * Write out an inode's dirty pages. Called under wb->list_lock and - * inode->i_lock. Either the caller has an active reference on the inode or - * the inode has I_WILL_FREE set. - * - * If `wait' is set, wait on the writeout. + * Do real work connected with writing out inode and its dirty pages. * - * The whole writeout design is quite complex and fragile. We want to avoid - * starvation of particular inodes when others are being redirtied, prevent - * livelocks, etc. + * The function must be called with i_lock held and drops it. + * I_SYNC flag of the inode must be clear on entry and the function returns + * with I_SYNC set. Caller must call inode_sync_complete() when it is done + * with postprocessing of the inode. */ static int -writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, - struct writeback_control *wbc) +__writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, + struct writeback_control *wbc) { struct address_space *mapping = inode->i_mapping; long nr_to_write = wbc->nr_to_write; unsigned dirty; int ret; - assert_spin_locked(&wb->list_lock); - assert_spin_locked(&inode->i_lock); - - if (!atomic_read(&inode->i_count)) - WARN_ON(!(inode->i_state & (I_WILL_FREE|I_FREEING))); - else - WARN_ON(inode->i_state & I_WILL_FREE); - - if (inode->i_state & I_SYNC) { - /* - * If this inode is locked for writeback and we are not doing - * writeback-for-data-integrity, move it to b_more_io so that - * writeback can proceed with the other inodes on s_io. - * - * We'll have another go at writing back this inode when we - * completed a full scan of b_io. - */ - if (wbc->sync_mode != WB_SYNC_ALL) { - requeue_io(inode, wb); - trace_writeback_single_inode_requeue(inode, wbc, - nr_to_write); - return 0; - } - - /* - * It's a data-integrity sync. We must wait. - */ - inode_wait_for_writeback(inode, wb); - } - BUG_ON(inode->i_state & I_SYNC); - + assert_spin_locked(&inode->i_lock); /* Set I_SYNC, reset I_DIRTY_PAGES */ inode->i_state |= I_SYNC; inode->i_state &= ~I_DIRTY_PAGES; spin_unlock(&inode->i_lock); - spin_unlock(&wb->list_lock); ret = do_writepages(mapping, wbc); @@ -424,6 +383,9 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, * write_inode() */ spin_lock(&inode->i_lock); + /* Didn't write out all pages or some became dirty? */ + if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY)) + inode->i_state |= I_DIRTY_PAGES; dirty = inode->i_state & I_DIRTY; inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_DATASYNC); spin_unlock(&inode->i_lock); @@ -434,59 +396,56 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, ret = err; } - spin_lock(&wb->list_lock); + trace_writeback_single_inode(inode, wbc, nr_to_write); + return ret; +} + +/* + * Write out an inode's dirty pages. Either the caller has an active reference + * on the inode or the inode has I_WILL_FREE set. + * + * This function is designed to be called for writing back one inode which + * we go e.g. from filesystem. Flusher thread uses __writeback_single_inode() + * and does more profound writeback list handling in writeback_sb_inodes(). + */ +static int +writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, + struct writeback_control *wbc) +{ + int ret; + spin_lock(&inode->i_lock); - inode->i_state &= ~I_SYNC; - if (!(inode->i_state & I_FREEING)) { - /* - * Sync livelock prevention. Each inode is tagged and synced in - * one shot. If still dirty, it will be redirty_tail()'ed below. - * Update the dirty time to prevent enqueue and sync it again. - */ - if ((inode->i_state & I_DIRTY) && - (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)) - inode->dirtied_when = jiffies; + if (!atomic_read(&inode->i_count)) + WARN_ON(!(inode->i_state & (I_WILL_FREE|I_FREEING))); + else + WARN_ON(inode->i_state & I_WILL_FREE); - if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { - /* - * We didn't write back all the pages. nfs_writepages() - * sometimes bales out without doing anything. - */ - inode->i_state |= I_DIRTY_PAGES; - if (wbc->nr_to_write <= 0) { - /* - * slice used up: queue for next turn - */ - requeue_io(inode, wb); - } else { - /* - * Writeback blocked by something other than - * congestion. Delay the inode for some time to - * avoid spinning on the CPU (100% iowait) - * retrying writeback of the dirty page/inode - * that cannot be performed immediately. - */ - redirty_tail(inode, wb); - } - } else if (inode->i_state & I_DIRTY) { - /* - * Filesystems can dirty the inode during writeback - * operations, such as delayed allocation during - * submission or metadata updates after data IO - * completion. - */ - redirty_tail(inode, wb); - } else { - /* - * The inode is clean. At this point we either have - * a reference to the inode or it's on it's way out. - * No need to add it back to the LRU. - */ - list_del_init(&inode->i_wb_list); + if (inode->i_state & I_SYNC) { + if (wbc->sync_mode != WB_SYNC_ALL) { + spin_unlock(&inode->i_lock); + return 0; } + /* + * It's a data-integrity sync. We must wait. + */ + inode_wait_for_writeback(inode); } + + ret = __writeback_single_inode(inode, wb, wbc); + + spin_lock(&wb->list_lock); + spin_lock(&inode->i_lock); + if (inode->i_state & I_FREEING) + goto out_unlock; + if (inode->i_state & I_DIRTY) + redirty_tail(inode, wb); + else + list_del_init(&inode->i_wb_list); +out_unlock: inode_sync_complete(inode); - trace_writeback_single_inode(inode, wbc, nr_to_write); + spin_unlock(&inode->i_lock); + spin_unlock(&wb->list_lock); + return ret; } @@ -521,6 +480,54 @@ static long writeback_chunk_size(struct backing_dev_info *bdi, return pages; } +static void inode_wb_requeue(struct inode *inode, struct bdi_writeback *wb, + struct writeback_control *wbc) +{ + if (wbc->pages_skipped) { + /* + * Writeback is not making progress due to locked buffers. + * Skip this inode for now. + */ + redirty_tail(inode, wb); + return; + } + if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY)) { + /* + * We didn't write back all the pages. nfs_writepages() + * sometimes bales out without doing anything. + */ + if (wbc->nr_to_write <= 0) { + /* + * slice used up: queue for next turn + */ + requeue_io(inode, wb); + } else { + /* + * Writeback blocked by something other than + * congestion. Delay the inode for some time to + * avoid spinning on the CPU (100% iowait) + * retrying writeback of the dirty page/inode + * that cannot be performed immediately. + */ + redirty_tail(inode, wb); + } + return; + } + if (inode->i_state & I_DIRTY) { + /* + * Filesystems can dirty the inode during writeback operations, + * such as delayed allocation during submission or metadata + * updates after data IO completion. + */ + redirty_tail(inode, wb); + return; + } + /* + * The inode is clean. + */ + list_del_init(&inode->i_wb_list); +} + /* * Write a portion of b_io inodes which belong to @sb. * @@ -580,24 +587,51 @@ static long writeback_sb_inodes(struct super_block *sb, redirty_tail(inode, wb); continue; } + if (inode->i_state & I_SYNC && work->sync_mode != WB_SYNC_ALL) { + /* + * If this inode is locked for writeback and we are not + * doing writeback-for-data-integrity, move it to + * b_more_io so that writeback can proceed with the + * other inodes on s_io. + * + * We'll have another go at writing back this inode + * when we completed a full scan of b_io. + */ + spin_unlock(&inode->i_lock); + requeue_io(inode, wb); + trace_writeback_sb_inodes_requeue(inode); + continue; + } + spin_unlock(&wb->list_lock); + __iget(inode); + inode_wait_for_writeback(inode); write_chunk = writeback_chunk_size(wb->bdi, work); wbc.nr_to_write = write_chunk; wbc.pages_skipped = 0; - writeback_single_inode(inode, wb, &wbc); + __writeback_single_inode(inode, wb, &wbc); work->nr_pages -= write_chunk - wbc.nr_to_write; wrote += write_chunk - wbc.nr_to_write; + spin_lock(&wb->list_lock); + spin_lock(&inode->i_lock); if (!(inode->i_state & I_DIRTY)) wrote++; - if (wbc.pages_skipped) { - /* - * writeback is not making progress due to locked - * buffers. Skip this inode for now. - */ - redirty_tail(inode, wb); - } + if (inode->i_state & I_FREEING) + goto continue_unlock; + /* + * Sync livelock prevention. Each inode is tagged and synced in + * one shot. If still dirty, it will be redirty_tail()'ed in + * inode_wb_requeue(). We update the dirty time to prevent + * queueing and syncing it again. + */ + if ((inode->i_state & I_DIRTY) && + (wbc.sync_mode == WB_SYNC_ALL || wbc.tagged_writepages)) + inode->dirtied_when = jiffies; + inode_wb_requeue(inode, wb, &wbc); +continue_unlock: + inode_sync_complete(inode); spin_unlock(&inode->i_lock); spin_unlock(&wb->list_lock); iput(inode); @@ -796,8 +830,10 @@ static long wb_writeback(struct bdi_writeback *wb, trace_writeback_wait(wb->bdi, work); inode = wb_inode(wb->b_more_io.prev); spin_lock(&inode->i_lock); - inode_wait_for_writeback(inode, wb); + spin_unlock(&wb->list_lock); + inode_wait_for_writeback(inode); spin_unlock(&inode->i_lock); + spin_lock(&wb->list_lock); } } spin_unlock(&wb->list_lock); @@ -1343,11 +1379,7 @@ int write_inode_now(struct inode *inode, int sync) wbc.nr_to_write = 0; might_sleep(); - spin_lock(&wb->list_lock); - spin_lock(&inode->i_lock); ret = writeback_single_inode(inode, wb, &wbc); - spin_unlock(&inode->i_lock); - spin_unlock(&wb->list_lock); return ret; } EXPORT_SYMBOL(write_inode_now); @@ -1366,14 +1398,8 @@ EXPORT_SYMBOL(write_inode_now); int sync_inode(struct inode *inode, struct writeback_control *wbc) { struct bdi_writeback *wb = &inode_to_bdi(inode)->wb; - int ret; - spin_lock(&wb->list_lock); - spin_lock(&inode->i_lock); - ret = writeback_single_inode(inode, wb, wbc); - spin_unlock(&inode->i_lock); - spin_unlock(&wb->list_lock); - return ret; + return writeback_single_inode(inode, wb, wbc); } EXPORT_SYMBOL(sync_inode); diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index 5973410..ec0a2b8 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -373,6 +373,35 @@ TRACE_EVENT(balance_dirty_pages, ) ); +TRACE_EVENT(writeback_sb_inodes_requeue, + + TP_PROTO(struct inode *inode), + TP_ARGS(inode), + + TP_STRUCT__entry( + __array(char, name, 32) + __field(unsigned long, ino) + __field(unsigned long, state) + __field(unsigned long, dirtied_when) + ), + + TP_fast_assign( + strncpy(__entry->name, + dev_name(inode_to_bdi(inode)->dev), 32); + __entry->ino = inode->i_ino; + __entry->state = inode->i_state; + __entry->dirtied_when = inode->dirtied_when; + ), + + TP_printk("bdi %s: ino=%lu state=%s dirtied_when=%lu age=%lu", + __entry->name, + __entry->ino, + show_inode_state(__entry->state), + __entry->dirtied_when, + (jiffies - __entry->dirtied_when) / HZ + ) +); + DECLARE_EVENT_CLASS(writeback_congest_waited_template, TP_PROTO(unsigned int usec_timeout, unsigned int usec_delayed), @@ -451,13 +480,6 @@ DECLARE_EVENT_CLASS(writeback_single_inode_template, ) ); -DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode_requeue, - TP_PROTO(struct inode *inode, - struct writeback_control *wbc, - unsigned long nr_to_write), - TP_ARGS(inode, wbc, nr_to_write) -); - DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode, TP_PROTO(struct inode *inode, struct writeback_control *wbc, -- 1.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] writeback: Refactor writeback_single_inode() 2012-03-09 9:02 ` [PATCH 3/4] writeback: Refactor writeback_single_inode() Jan Kara @ 2012-03-19 5:07 ` Fengguang Wu 2012-03-19 7:13 ` Christoph Hellwig 1 sibling, 0 replies; 15+ messages in thread From: Fengguang Wu @ 2012-03-19 5:07 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:27AM +0100, Jan Kara wrote: > The code in writeback_single_inode() is relatively complex. The list > requeing logic makes sense only for flusher thread but not really for > sync_inode() or write_inode_now() callers. Also when we want to get > rid of inode references held by flusher thread, we will need a special > I_SYNC handling there. > > So separate part of writeback_single_inode() which does the real writeback work > into __writeback_single_inode(). Make writeback_single_inode() do only stuff > necessary for callers writing only one inode, and move the special list > handling into writeback_sb_inodes() and a helper function inode_wb_requeue(). > > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/fs-writeback.c | 264 +++++++++++++++++++++----------------- > include/trace/events/writeback.h | 36 ++++- > 2 files changed, 174 insertions(+), 126 deletions(-) > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > + > + ret = __writeback_single_inode(inode, wb, wbc); > + > + spin_lock(&wb->list_lock); > + spin_lock(&inode->i_lock); > + if (inode->i_state & I_FREEING) > + goto out_unlock; > + if (inode->i_state & I_DIRTY) > + redirty_tail(inode, wb); > + else > + list_del_init(&inode->i_wb_list); It seems that the above redirty_tail() and hence I_FREEING check can be eliminated? writeback_single_inode() does not need to deal with wb list requeue now, but only need to care about dequeue. The patch looks fine otherwise. > +out_unlock: > inode_sync_complete(inode); > - trace_writeback_single_inode(inode, wbc, nr_to_write); > + spin_unlock(&inode->i_lock); > + spin_unlock(&wb->list_lock); > + > return ret; > } > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] writeback: Refactor writeback_single_inode() 2012-03-09 9:02 ` [PATCH 3/4] writeback: Refactor writeback_single_inode() Jan Kara 2012-03-19 5:07 ` Fengguang Wu @ 2012-03-19 7:13 ` Christoph Hellwig 2012-03-19 16:16 ` Jan Kara 1 sibling, 1 reply; 15+ messages in thread From: Christoph Hellwig @ 2012-03-19 7:13 UTC (permalink / raw) To: Jan Kara; +Cc: Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:27AM +0100, Jan Kara wrote: > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/fs-writeback.c | 264 +++++++++++++++++++++----------------- > include/trace/events/writeback.h | 36 ++++- > 2 files changed, 174 insertions(+), 126 deletions(-) Can you split this into a more gradual patch series? This a a huge change of lots of little bits in a very sensitive area. > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > index be84e28..1e8bf44 100644 > --- a/fs/fs-writeback.c > +++ b/fs/fs-writeback.c > @@ -231,11 +231,7 @@ static void requeue_io(struct inode *inode, struct bdi_writeback *wb) > > static void inode_sync_complete(struct inode *inode) > { > - /* > - * Prevent speculative execution through > - * spin_unlock(&wb->list_lock); > - */ > - > + inode->i_state &= ~I_SYNC; > smp_mb(); > wake_up_bit(&inode->i_state, __I_SYNC); E.g. Moving the I_SYNC clearing later should be a small patch of it's own with a changelog describing why it is safe. > -static void inode_wait_for_writeback(struct inode *inode, > - struct bdi_writeback *wb) > +static void inode_wait_for_writeback(struct inode *inode) > { > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > wait_queue_head_t *wqh; > @@ -340,70 +335,34 @@ static void inode_wait_for_writeback(struct inode *inode, > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > while (inode->i_state & I_SYNC) { > spin_unlock(&inode->i_lock); > - spin_unlock(&wb->list_lock); > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > - spin_lock(&wb->list_lock); > spin_lock(&inode->i_lock); > } > } Ditto for why calling inode_wait_for_writeback without the list_lock is fine now. > > /* > + * Do real work connected with writing out inode and its dirty pages. * Write out an inode and its dirty pages, but do not update the writeback list linkage, which is left to the caller. > + * The function must be called with i_lock held and drops it. Can we avoid these assymetric calling conventions if possible? If not pleae add least add the sparse locking context annotations. > + * I_SYNC flag of the inode must be clear on entry and the function returns > + * with I_SYNC set. Caller must call inode_sync_complete() when it is done > + * with postprocessing of the inode. Ewww.. > > ret = do_writepages(mapping, wbc); > > @@ -424,6 +383,9 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, > * write_inode() > */ > spin_lock(&inode->i_lock); > + /* Didn't write out all pages or some became dirty? */ > + if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY)) > + inode->i_state |= I_DIRTY_PAGES; Where did this hunk come from? > + if (inode->i_state & I_FREEING) > + goto out_unlock; > + if (inode->i_state & I_DIRTY) > + redirty_tail(inode, wb); > + else > + list_del_init(&inode->i_wb_list); These lines should be factored into a small helper shared with the writeback thread code, which would also avoid the out_unlock goto. > @@ -580,24 +587,51 @@ static long writeback_sb_inodes(struct super_block *sb, > redirty_tail(inode, wb); > continue; > } > + if (inode->i_state & I_SYNC && work->sync_mode != WB_SYNC_ALL) { Please add braces around the inode->i_state & I_SYNC. > + if (inode->i_state & I_FREEING) > + goto continue_unlock; > + /* > + * Sync livelock prevention. Each inode is tagged and synced in > + * one shot. If still dirty, it will be redirty_tail()'ed in > + * inode_wb_requeue(). We update the dirty time to prevent > + * queueing and syncing it again. > + */ > + if ((inode->i_state & I_DIRTY) && > + (wbc.sync_mode == WB_SYNC_ALL || wbc.tagged_writepages)) > + inode->dirtied_when = jiffies; > + inode_wb_requeue(inode, wb, &wbc); > +continue_unlock: I'd rather have the non-freeing code indentented one more level than the goto magic here. What's the problem with moving the dirtied_when update into inode_wb_requeue, which would make the whole thing a lot more readable? (Also factoring out inode_wb_requeue would be another good split patch) > + inode_sync_complete(inode); > spin_unlock(&inode->i_lock); > spin_unlock(&wb->list_lock); > iput(inode); > @@ -796,8 +830,10 @@ static long wb_writeback(struct bdi_writeback *wb, > trace_writeback_wait(wb->bdi, work); > inode = wb_inode(wb->b_more_io.prev); > spin_lock(&inode->i_lock); > + spin_unlock(&wb->list_lock); > + inode_wait_for_writeback(inode); > spin_unlock(&inode->i_lock); > + spin_lock(&wb->list_lock); > } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/4] writeback: Refactor writeback_single_inode() 2012-03-19 7:13 ` Christoph Hellwig @ 2012-03-19 16:16 ` Jan Kara 0 siblings, 0 replies; 15+ messages in thread From: Jan Kara @ 2012-03-19 16:16 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Mon 19-03-12 03:13:58, Christoph Hellwig wrote: > On Fri, Mar 09, 2012 at 10:02:27AM +0100, Jan Kara wrote: > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/fs-writeback.c | 264 +++++++++++++++++++++----------------- > > include/trace/events/writeback.h | 36 ++++- > > 2 files changed, 174 insertions(+), 126 deletions(-) > > Can you split this into a more gradual patch series? This a a huge > change of lots of little bits in a very sensitive area. OK, I'll try. > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > index be84e28..1e8bf44 100644 > > --- a/fs/fs-writeback.c > > +++ b/fs/fs-writeback.c > > @@ -231,11 +231,7 @@ static void requeue_io(struct inode *inode, struct bdi_writeback *wb) > > > > static void inode_sync_complete(struct inode *inode) > > { > > - /* > > - * Prevent speculative execution through > > - * spin_unlock(&wb->list_lock); > > - */ > > - > > + inode->i_state &= ~I_SYNC; > > smp_mb(); > > wake_up_bit(&inode->i_state, __I_SYNC); > > E.g. Moving the I_SYNC clearing later should be a small patch of it's > own with a changelog describing why it is safe. OK. > > -static void inode_wait_for_writeback(struct inode *inode, > > - struct bdi_writeback *wb) > > +static void inode_wait_for_writeback(struct inode *inode) > > { > > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > > wait_queue_head_t *wqh; > > @@ -340,70 +335,34 @@ static void inode_wait_for_writeback(struct inode *inode, > > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > > while (inode->i_state & I_SYNC) { > > spin_unlock(&inode->i_lock); > > - spin_unlock(&wb->list_lock); > > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > > - spin_lock(&wb->list_lock); > > spin_lock(&inode->i_lock); > > } > > } > > Ditto for why calling inode_wait_for_writeback without the list_lock > is fine now. OK. > > > > /* > > + * Do real work connected with writing out inode and its dirty pages. > > * Write out an inode and its dirty pages, but do not update the > writeback list linkage, which is left to the caller. > > > + * The function must be called with i_lock held and drops it. > > Can we avoid these assymetric calling conventions if possible? If not > pleae add least add the sparse locking context annotations. Returning from __writeback_single_inode() with i_lock would be impractical because callers need to take wb->list_lock which ranks above it. Entering __writeback_single_inode() without i_lock might be possible (see comment below) but I'm not sure it's worth it. > > + * I_SYNC flag of the inode must be clear on entry and the function returns > > + * with I_SYNC set. Caller must call inode_sync_complete() when it is done > > + * with postprocessing of the inode. > > Ewww.. Yeah, it is awkward. Sadly I don't have much better solution. Flusher thread cannot really afford normal waiting for I_SYNC because it doesn't hold inode reference and thus inode can go away once we drop i_lock. Because of that we have to enter without I_SYNC set. We could possibly set I_SYNC already in the caller. That would at least make the function less asymmetric. Also that would solve the problem with inode pinning so we could make the function completely symmetric (entering without i_lock). But still both callers would do unnecessary unlock of i_lock just for __writeback_single_inode() to take it again (to clear I_DIRTY_PAGES). And we cannot clear I_SYNC before we have moved the inode to proper writeback list which is done in the caller. > > ret = do_writepages(mapping, wbc); > > > > @@ -424,6 +383,9 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, > > * write_inode() > > */ > > spin_lock(&inode->i_lock); > > + /* Didn't write out all pages or some became dirty? */ > > + if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY)) > > + inode->i_state |= I_DIRTY_PAGES; > > Where did this hunk come from? writeback_single_inode() had: if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { /* * We didn't write back all the pages. nfs_writepages() * sometimes bales out without doing anything. */ inode->i_state |= I_DIRTY_PAGES; ... > > + if (inode->i_state & I_FREEING) > > + goto out_unlock; > > > + if (inode->i_state & I_DIRTY) > > + redirty_tail(inode, wb); > > + else > > + list_del_init(&inode->i_wb_list); > > These lines should be factored into a small helper shared with the > writeback thread code, which would also avoid the out_unlock goto. Yup. Will do. > > @@ -580,24 +587,51 @@ static long writeback_sb_inodes(struct super_block *sb, > > redirty_tail(inode, wb); > > continue; > > } > > + if (inode->i_state & I_SYNC && work->sync_mode != WB_SYNC_ALL) { > > Please add braces around the inode->i_state & I_SYNC. OK. > > + if (inode->i_state & I_FREEING) > > + goto continue_unlock; > > + /* > > + * Sync livelock prevention. Each inode is tagged and synced in > > + * one shot. If still dirty, it will be redirty_tail()'ed in > > + * inode_wb_requeue(). We update the dirty time to prevent > > + * queueing and syncing it again. > > + */ > > + if ((inode->i_state & I_DIRTY) && > > + (wbc.sync_mode == WB_SYNC_ALL || wbc.tagged_writepages)) > > + inode->dirtied_when = jiffies; > > + inode_wb_requeue(inode, wb, &wbc); > > +continue_unlock: > > I'd rather have the non-freeing code indentented one more level than the > goto magic here. What's the problem with moving the dirtied_when update > into inode_wb_requeue, which would make the whole thing a lot more > readable? Not sure why I put it here. Will move it inside inode_wb_requeue(). > (Also factoring out inode_wb_requeue would be another good split patch) > > > + inode_sync_complete(inode); > > spin_unlock(&inode->i_lock); > > spin_unlock(&wb->list_lock); > > iput(inode); > > @@ -796,8 +830,10 @@ static long wb_writeback(struct bdi_writeback *wb, > > trace_writeback_wait(wb->bdi, work); > > inode = wb_inode(wb->b_more_io.prev); > > spin_lock(&inode->i_lock); > > + spin_unlock(&wb->list_lock); > > + inode_wait_for_writeback(inode); > > spin_unlock(&inode->i_lock); > > + spin_lock(&wb->list_lock); > > } Honza -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 4/4] writeback: Avoid iput() from flusher thread 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara ` (2 preceding siblings ...) 2012-03-09 9:02 ` [PATCH 3/4] writeback: Refactor writeback_single_inode() Jan Kara @ 2012-03-09 9:02 ` Jan Kara 2012-03-19 3:42 ` Fengguang Wu 2012-03-19 8:55 ` Christoph Hellwig 2012-03-19 5:16 ` [PATCH 0/4] Get rid of " Fengguang Wu 4 siblings, 2 replies; 15+ messages in thread From: Jan Kara @ 2012-03-09 9:02 UTC (permalink / raw) To: Wu Fengguang; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton, Jan Kara Doing iput() from flusher thread (writeback_sb_inodes()) can create problems because iput() can do a lot of work - for example truncate the inode if it's the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to allocate blocks during truncate (due to their COW nature) and in some cases they thus need to flush dirty data from truncate to reduce uncertainty in the amount of free space. This effectively creates a deadlock. We get rid of iput() in flusher thread by using the fact that I_SYNC inode flag effectively pins the inode in memory. So if we take care to either hold i_lock or have I_SYNC set, we can get away without taking inode reference in writeback_sb_inodes(). As a side effect, we also fix possible use-after-free in wb_writeback() because inode_wait_for_writeback() call could try to reacquire i_lock on the inode that was already free. Signed-off-by: Jan Kara <jack@suse.cz> --- fs/fs-writeback.c | 38 ++++++++++++++++++++++++-------------- fs/inode.c | 11 ++++++++++- include/linux/fs.h | 7 ++++--- include/linux/writeback.h | 7 +------ 4 files changed, 39 insertions(+), 24 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 1e8bf44..f9f9b61 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) } /* - * Wait for writeback on an inode to complete. + * Wait for writeback on an inode to complete. Called with i_lock held. + * Return 1 if we dropped i_lock and waited, 0 is returned otherwise. */ -static void inode_wait_for_writeback(struct inode *inode) +int __must_check inode_wait_for_writeback(struct inode *inode) { DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); wait_queue_head_t *wqh; wqh = bit_waitqueue(&inode->i_state, __I_SYNC); - while (inode->i_state & I_SYNC) { + if (inode->i_state & I_SYNC) { spin_unlock(&inode->i_lock); __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); - spin_lock(&inode->i_lock); + return 1; } + return 0; } /* @@ -426,9 +428,12 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, return 0; } /* - * It's a data-integrity sync. We must wait. + * It's a data-integrity sync. We must wait. Since callers hold + * inode reference or inode has I_WILL_FREE set, it cannot go + * away under us. */ - inode_wait_for_writeback(inode); + while (inode_wait_for_writeback(inode)) + spin_lock(&inode->i_lock); } ret = __writeback_single_inode(inode, wb, wbc); @@ -604,12 +609,20 @@ static long writeback_sb_inodes(struct super_block *sb, } spin_unlock(&wb->list_lock); - __iget(inode); - inode_wait_for_writeback(inode); + /* Did we drop i_lock to wait for I_SYNC? */ + if (inode_wait_for_writeback(inode)) { + /* Inode may be gone, start again */ + spin_lock(&wb->list_lock); + continue; + } write_chunk = writeback_chunk_size(wb->bdi, work); wbc.nr_to_write = write_chunk; wbc.pages_skipped = 0; + /* + * We use I_SYNC to pin the inode in memory. While it is set + * end_writeback() will wait so the inode cannot be freed. + */ __writeback_single_inode(inode, wb, &wbc); work->nr_pages -= write_chunk - wbc.nr_to_write; @@ -633,10 +646,7 @@ static long writeback_sb_inodes(struct super_block *sb, continue_unlock: inode_sync_complete(inode); spin_unlock(&inode->i_lock); - spin_unlock(&wb->list_lock); - iput(inode); - cond_resched(); - spin_lock(&wb->list_lock); + cond_resched_lock(&wb->list_lock); /* * bail out to wb_writeback() often enough to check * background threshold and other termination conditions. @@ -831,8 +841,8 @@ static long wb_writeback(struct bdi_writeback *wb, inode = wb_inode(wb->b_more_io.prev); spin_lock(&inode->i_lock); spin_unlock(&wb->list_lock); - inode_wait_for_writeback(inode); - spin_unlock(&inode->i_lock); + if (!inode_wait_for_writeback(inode)) + spin_unlock(&inode->i_lock); spin_lock(&wb->list_lock); } } diff --git a/fs/inode.c b/fs/inode.c index d3ebdbe..b64e2fe 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -510,7 +510,16 @@ void end_writeback(struct inode *inode) BUG_ON(!list_empty(&inode->i_data.private_list)); BUG_ON(!(inode->i_state & I_FREEING)); BUG_ON(inode->i_state & I_CLEAR); - inode_sync_wait(inode); + /* + * Wait for flusher thread to be done with the inode. Since the inode + * has I_FREEING set, flusher thread won't start new work on the inode. + * We just have to wait for running writeback to finish. We must use + * i_lock here because flusher thread might be working with the inode + * without I_SYNC set but under i_lock. + */ + spin_lock(&inode->i_lock); + if (!inode_wait_for_writeback(inode)) + spin_unlock(&inode->i_lock); /* don't need i_lock here, no concurrent mods to i_state */ inode->i_state = I_FREEING | I_CLEAR; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 69cd5bb..e1f0f5a 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1742,9 +1742,10 @@ struct super_operations { * anew. Other functions will just ignore such inodes, * if appropriate. I_NEW is used for waiting. * - * I_SYNC Synchonized write of dirty inode data. The bits is - * set during data writeback, and cleared with a wakeup - * on the bit address once it is done. + * I_SYNC Writeback of inode is running. The bits is set during + * data writeback, and cleared with a wakeup on the bit + * address once it is done. The bit is also used to pin + * the inode in memory for flusher thread. * * I_REFERENCED Marks the inode as recently references on the LRU list. * diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 995b8bf..3a34dc0 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -94,6 +94,7 @@ long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages, enum wb_reason reason); long wb_do_writeback(struct bdi_writeback *wb, int force_wait); void wakeup_flusher_threads(long nr_pages, enum wb_reason reason); +int __must_check inode_wait_for_writeback(struct inode *inode); /* writeback.h requires fs.h; it, too, is not included from here. */ static inline void wait_on_inode(struct inode *inode) @@ -101,12 +102,6 @@ static inline void wait_on_inode(struct inode *inode) might_sleep(); wait_on_bit(&inode->i_state, __I_NEW, inode_wait, TASK_UNINTERRUPTIBLE); } -static inline void inode_sync_wait(struct inode *inode) -{ - might_sleep(); - wait_on_bit(&inode->i_state, __I_SYNC, inode_wait, - TASK_UNINTERRUPTIBLE); -} /* -- 1.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] writeback: Avoid iput() from flusher thread 2012-03-09 9:02 ` [PATCH 4/4] writeback: Avoid iput() from flusher thread Jan Kara @ 2012-03-19 3:42 ` Fengguang Wu 2012-03-19 8:55 ` Christoph Hellwig 1 sibling, 0 replies; 15+ messages in thread From: Fengguang Wu @ 2012-03-19 3:42 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:28AM +0100, Jan Kara wrote: > Doing iput() from flusher thread (writeback_sb_inodes()) can create problems > because iput() can do a lot of work - for example truncate the inode if it's > the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to > allocate blocks during truncate (due to their COW nature) and in some cases > they thus need to flush dirty data from truncate to reduce uncertainty in the > amount of free space. This effectively creates a deadlock. > > We get rid of iput() in flusher thread by using the fact that I_SYNC inode > flag effectively pins the inode in memory. So if we take care to either hold > i_lock or have I_SYNC set, we can get away without taking inode reference > in writeback_sb_inodes(). I created this graph while trying to understand/prove the new locking rules, and it looks perfectly fine. flusher: I_FREEING check requeue/dequeue . . b_io.prev . writepages write_inode . . . . . . . . . . . I_SYNC . . ============================== i_lock . ============ === ====== list_lock ============= . . ============== . . . . check/wait I_SYNC I_DIRTY handling evict/end_writeback: check I_SYNC set I_FREEING . wait I_SYNC destroy_inode . dequeue . . . I_SYNC . . . ********************* . i_lock ===== . ==== *==== . list_lock ====== It may be worthwhile to note that evict() needs to grab i_lock *after* waiting for I_SYNC, so that the flusher won't be accessing possibly freed inode when unlocking i_lock. > As a side effect, we also fix possible use-after-free in wb_writeback() because > inode_wait_for_writeback() call could try to reacquire i_lock on the inode that > was already free. Good catch! I think you are referring to this code: /* * Nothing written. Wait for some inode to * become available for writeback. Otherwise * we'll just busyloop. */ if (!list_empty(&wb->b_more_io)) { trace_writeback_wait(wb->bdi, work); inode = wb_inode(wb->b_more_io.prev); spin_lock(&inode->i_lock); inode_wait_for_writeback(inode, wb); spin_unlock(&inode->i_lock); } > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/fs-writeback.c | 38 ++++++++++++++++++++++++-------------- > fs/inode.c | 11 ++++++++++- > include/linux/fs.h | 7 ++++--- > include/linux/writeback.h | 7 +------ > 4 files changed, 39 insertions(+), 24 deletions(-) > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > index 1e8bf44..f9f9b61 100644 > --- a/fs/fs-writeback.c > +++ b/fs/fs-writeback.c > @@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) > } > > /* > - * Wait for writeback on an inode to complete. > + * Wait for writeback on an inode to complete. Called with i_lock held. > + * Return 1 if we dropped i_lock and waited, 0 is returned otherwise. > */ > -static void inode_wait_for_writeback(struct inode *inode) > +int __must_check inode_wait_for_writeback(struct inode *inode) > { > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > wait_queue_head_t *wqh; > > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > - while (inode->i_state & I_SYNC) { > + if (inode->i_state & I_SYNC) { > spin_unlock(&inode->i_lock); > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > - spin_lock(&inode->i_lock); > + return 1; > } > + return 0; > } > > /* > @@ -426,9 +428,12 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, > return 0; > } > /* > - * It's a data-integrity sync. We must wait. > + * It's a data-integrity sync. We must wait. Since callers hold > + * inode reference or inode has I_WILL_FREE set, it cannot go > + * away under us. > */ > - inode_wait_for_writeback(inode); > + while (inode_wait_for_writeback(inode)) > + spin_lock(&inode->i_lock); > } > > ret = __writeback_single_inode(inode, wb, wbc); > @@ -604,12 +609,20 @@ static long writeback_sb_inodes(struct super_block *sb, > } > spin_unlock(&wb->list_lock); > > - __iget(inode); > - inode_wait_for_writeback(inode); > + /* Did we drop i_lock to wait for I_SYNC? */ > + if (inode_wait_for_writeback(inode)) { > + /* Inode may be gone, start again */ > + spin_lock(&wb->list_lock); > + continue; > + } > write_chunk = writeback_chunk_size(wb->bdi, work); > wbc.nr_to_write = write_chunk; > wbc.pages_skipped = 0; > > + /* > + * We use I_SYNC to pin the inode in memory. While it is set > + * end_writeback() will wait so the inode cannot be freed. > + */ > __writeback_single_inode(inode, wb, &wbc); > > work->nr_pages -= write_chunk - wbc.nr_to_write; > @@ -633,10 +646,7 @@ static long writeback_sb_inodes(struct super_block *sb, > continue_unlock: > inode_sync_complete(inode); > spin_unlock(&inode->i_lock); > - spin_unlock(&wb->list_lock); > - iput(inode); > - cond_resched(); > - spin_lock(&wb->list_lock); > + cond_resched_lock(&wb->list_lock); > /* > * bail out to wb_writeback() often enough to check > * background threshold and other termination conditions. > @@ -831,8 +841,8 @@ static long wb_writeback(struct bdi_writeback *wb, > inode = wb_inode(wb->b_more_io.prev); > spin_lock(&inode->i_lock); > spin_unlock(&wb->list_lock); > - inode_wait_for_writeback(inode); > - spin_unlock(&inode->i_lock); > + if (!inode_wait_for_writeback(inode)) > + spin_unlock(&inode->i_lock); > spin_lock(&wb->list_lock); > } > } > diff --git a/fs/inode.c b/fs/inode.c > index d3ebdbe..b64e2fe 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -510,7 +510,16 @@ void end_writeback(struct inode *inode) > BUG_ON(!list_empty(&inode->i_data.private_list)); > BUG_ON(!(inode->i_state & I_FREEING)); > BUG_ON(inode->i_state & I_CLEAR); > - inode_sync_wait(inode); > + /* > + * Wait for flusher thread to be done with the inode. Since the inode > + * has I_FREEING set, flusher thread won't start new work on the inode. > + * We just have to wait for running writeback to finish. We must use > + * i_lock here because flusher thread might be working with the inode > + * without I_SYNC set but under i_lock. > + */ > + spin_lock(&inode->i_lock); > + if (!inode_wait_for_writeback(inode)) > + spin_unlock(&inode->i_lock); > /* don't need i_lock here, no concurrent mods to i_state */ > inode->i_state = I_FREEING | I_CLEAR; > } > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 69cd5bb..e1f0f5a 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1742,9 +1742,10 @@ struct super_operations { > * anew. Other functions will just ignore such inodes, > * if appropriate. I_NEW is used for waiting. > * > - * I_SYNC Synchonized write of dirty inode data. The bits is > - * set during data writeback, and cleared with a wakeup > - * on the bit address once it is done. > + * I_SYNC Writeback of inode is running. The bits is set during s/bits/bit/ > + * data writeback, and cleared with a wakeup on the bit > + * address once it is done. The bit is also used to pin > + * the inode in memory for flusher thread. > * > * I_REFERENCED Marks the inode as recently references on the LRU list. > * > diff --git a/include/linux/writeback.h b/include/linux/writeback.h > index 995b8bf..3a34dc0 100644 > --- a/include/linux/writeback.h > +++ b/include/linux/writeback.h > @@ -94,6 +94,7 @@ long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages, > enum wb_reason reason); > long wb_do_writeback(struct bdi_writeback *wb, int force_wait); > void wakeup_flusher_threads(long nr_pages, enum wb_reason reason); > +int __must_check inode_wait_for_writeback(struct inode *inode); > > /* writeback.h requires fs.h; it, too, is not included from here. */ > static inline void wait_on_inode(struct inode *inode) > @@ -101,12 +102,6 @@ static inline void wait_on_inode(struct inode *inode) > might_sleep(); > wait_on_bit(&inode->i_state, __I_NEW, inode_wait, TASK_UNINTERRUPTIBLE); > } > -static inline void inode_sync_wait(struct inode *inode) > -{ > - might_sleep(); > - wait_on_bit(&inode->i_state, __I_SYNC, inode_wait, > - TASK_UNINTERRUPTIBLE); > -} > > > /* > -- > 1.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] writeback: Avoid iput() from flusher thread 2012-03-09 9:02 ` [PATCH 4/4] writeback: Avoid iput() from flusher thread Jan Kara 2012-03-19 3:42 ` Fengguang Wu @ 2012-03-19 8:55 ` Christoph Hellwig 2012-03-19 10:46 ` Jan Kara 1 sibling, 1 reply; 15+ messages in thread From: Christoph Hellwig @ 2012-03-19 8:55 UTC (permalink / raw) To: Jan Kara; +Cc: Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:28AM +0100, Jan Kara wrote: > Doing iput() from flusher thread (writeback_sb_inodes()) can create problems > because iput() can do a lot of work - for example truncate the inode if it's > the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to > allocate blocks during truncate (due to their COW nature) and in some cases > they thus need to flush dirty data from truncate to reduce uncertainty in the > amount of free space. This effectively creates a deadlock. > > We get rid of iput() in flusher thread by using the fact that I_SYNC inode > flag effectively pins the inode in memory. So if we take care to either hold > i_lock or have I_SYNC set, we can get away without taking inode reference > in writeback_sb_inodes(). > > As a side effect, we also fix possible use-after-free in wb_writeback() because > inode_wait_for_writeback() call could try to reacquire i_lock on the inode that > was already free. > > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/fs-writeback.c | 38 ++++++++++++++++++++++++-------------- > fs/inode.c | 11 ++++++++++- > include/linux/fs.h | 7 ++++--- > include/linux/writeback.h | 7 +------ > 4 files changed, 39 insertions(+), 24 deletions(-) > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > index 1e8bf44..f9f9b61 100644 > --- a/fs/fs-writeback.c > +++ b/fs/fs-writeback.c > @@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) > } > > /* > - * Wait for writeback on an inode to complete. > + * Wait for writeback on an inode to complete. Called with i_lock held. > + * Return 1 if we dropped i_lock and waited, 0 is returned otherwise. > */ > -static void inode_wait_for_writeback(struct inode *inode) > +int __must_check inode_wait_for_writeback(struct inode *inode) > { > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > wait_queue_head_t *wqh; > > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > + if (inode->i_state & I_SYNC) { > spin_unlock(&inode->i_lock); > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > + return 1; > } > + return 0; This is a horribly ugl primitive. I'd rather add a void inode_wait_for_writeback(struct inode *inode) { DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); wait_queue_head_t *wqh = bit_waitqueue(&inode->i_state, __I_SYNC); __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); } and opencode all the locking ad I_SYNC checking logic in the callers. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] writeback: Avoid iput() from flusher thread 2012-03-19 8:55 ` Christoph Hellwig @ 2012-03-19 10:46 ` Jan Kara 2012-03-19 11:17 ` Fengguang Wu 0 siblings, 1 reply; 15+ messages in thread From: Jan Kara @ 2012-03-19 10:46 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, Wu Fengguang, linux-fsdevel, LKML, linux-mm, Andrew Morton On Mon 19-03-12 04:55:15, Christoph Hellwig wrote: > On Fri, Mar 09, 2012 at 10:02:28AM +0100, Jan Kara wrote: > > Doing iput() from flusher thread (writeback_sb_inodes()) can create problems > > because iput() can do a lot of work - for example truncate the inode if it's > > the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to > > allocate blocks during truncate (due to their COW nature) and in some cases > > they thus need to flush dirty data from truncate to reduce uncertainty in the > > amount of free space. This effectively creates a deadlock. > > > > We get rid of iput() in flusher thread by using the fact that I_SYNC inode > > flag effectively pins the inode in memory. So if we take care to either hold > > i_lock or have I_SYNC set, we can get away without taking inode reference > > in writeback_sb_inodes(). > > > > As a side effect, we also fix possible use-after-free in wb_writeback() because > > inode_wait_for_writeback() call could try to reacquire i_lock on the inode that > > was already free. > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/fs-writeback.c | 38 ++++++++++++++++++++++++-------------- > > fs/inode.c | 11 ++++++++++- > > include/linux/fs.h | 7 ++++--- > > include/linux/writeback.h | 7 +------ > > 4 files changed, 39 insertions(+), 24 deletions(-) > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > index 1e8bf44..f9f9b61 100644 > > --- a/fs/fs-writeback.c > > +++ b/fs/fs-writeback.c > > @@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) > > } > > > > /* > > - * Wait for writeback on an inode to complete. > > + * Wait for writeback on an inode to complete. Called with i_lock held. > > + * Return 1 if we dropped i_lock and waited, 0 is returned otherwise. > > */ > > -static void inode_wait_for_writeback(struct inode *inode) > > +int __must_check inode_wait_for_writeback(struct inode *inode) > > { > > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > > wait_queue_head_t *wqh; > > > > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > > + if (inode->i_state & I_SYNC) { > > spin_unlock(&inode->i_lock); > > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > > + return 1; > > } > > + return 0; > > This is a horribly ugl primitive. > > I'd rather add a > > void inode_wait_for_writeback(struct inode *inode) > { > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > wait_queue_head_t *wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > } > > and opencode all the locking ad I_SYNC checking logic in the callers. I agree the primitive is ugly. And actually it is buggy the way I wrote it. It should have been: __wait_on_bit(wqh, &wq, isync_wait, TASK_UNINTERRUPTIBLE); where isync_wait is: int isync_wait(void *word) { struct inode *inode = container_of(word, struct inode, i_state); spin_unlock(&inode->i_lock); schedule(); return 1; } The problem is i_lock pins the inode for us in some cases. So once we drop i_lock, inode can go away so we cannot test the bit anymore. But there are just two places where we really need this. So maybe I can just opencode it there and for others use normal obvious variant. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] writeback: Avoid iput() from flusher thread 2012-03-19 10:46 ` Jan Kara @ 2012-03-19 11:17 ` Fengguang Wu 0 siblings, 0 replies; 15+ messages in thread From: Fengguang Wu @ 2012-03-19 11:17 UTC (permalink / raw) To: Jan Kara; +Cc: Christoph Hellwig, linux-fsdevel, LKML, linux-mm, Andrew Morton On Mon, Mar 19, 2012 at 11:46:59AM +0100, Jan Kara wrote: > On Mon 19-03-12 04:55:15, Christoph Hellwig wrote: > > On Fri, Mar 09, 2012 at 10:02:28AM +0100, Jan Kara wrote: > > > Doing iput() from flusher thread (writeback_sb_inodes()) can create problems > > > because iput() can do a lot of work - for example truncate the inode if it's > > > the last iput on unlinked file. Some filesystems (e.g. ubifs) may need to > > > allocate blocks during truncate (due to their COW nature) and in some cases > > > they thus need to flush dirty data from truncate to reduce uncertainty in the > > > amount of free space. This effectively creates a deadlock. > > > > > > We get rid of iput() in flusher thread by using the fact that I_SYNC inode > > > flag effectively pins the inode in memory. So if we take care to either hold > > > i_lock or have I_SYNC set, we can get away without taking inode reference > > > in writeback_sb_inodes(). > > > > > > As a side effect, we also fix possible use-after-free in wb_writeback() because > > > inode_wait_for_writeback() call could try to reacquire i_lock on the inode that > > > was already free. > > > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > --- > > > fs/fs-writeback.c | 38 ++++++++++++++++++++++++-------------- > > > fs/inode.c | 11 ++++++++++- > > > include/linux/fs.h | 7 ++++--- > > > include/linux/writeback.h | 7 +------ > > > 4 files changed, 39 insertions(+), 24 deletions(-) > > > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > > index 1e8bf44..f9f9b61 100644 > > > --- a/fs/fs-writeback.c > > > +++ b/fs/fs-writeback.c > > > @@ -325,19 +325,21 @@ static int write_inode(struct inode *inode, struct writeback_control *wbc) > > > } > > > > > > /* > > > - * Wait for writeback on an inode to complete. > > > + * Wait for writeback on an inode to complete. Called with i_lock held. > > > + * Return 1 if we dropped i_lock and waited, 0 is returned otherwise. > > > */ > > > -static void inode_wait_for_writeback(struct inode *inode) > > > +int __must_check inode_wait_for_writeback(struct inode *inode) > > > { > > > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > > > wait_queue_head_t *wqh; > > > > > > wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > > > + if (inode->i_state & I_SYNC) { > > > spin_unlock(&inode->i_lock); > > > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > > > + return 1; > > > } > > > + return 0; > > > > This is a horribly ugl primitive. > > > > I'd rather add a > > > > void inode_wait_for_writeback(struct inode *inode) > > { > > DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC); > > wait_queue_head_t *wqh = bit_waitqueue(&inode->i_state, __I_SYNC); > > > > __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); > > } > > > > and opencode all the locking ad I_SYNC checking logic in the callers. > I agree the primitive is ugly. And actually it is buggy the way I wrote > it. It should have been: > __wait_on_bit(wqh, &wq, isync_wait, TASK_UNINTERRUPTIBLE); > > where isync_wait is: > > int isync_wait(void *word) > { > struct inode *inode = container_of(word, struct inode, i_state); > > spin_unlock(&inode->i_lock); > schedule(); > return 1; > } > > The problem is i_lock pins the inode for us in some cases. So once we > drop i_lock, inode can go away so we cannot test the bit anymore. Good point, it may not be valid to test &inode->i_state any more... Given that __wait_on_bit() is do { prepare_to_wait(wq, &q->wait, mode); if (test_bit(q->key.bit_nr, q->key.flags)) ret = (*action)(q->key.flags); } while (test_bit(q->key.bit_nr, q->key.flags) && !ret); The isync_wait() will do good for the first test_bit, however still cannot avoid invalid access for the second test_bit. The fix could be - } while (test_bit(q->key.bit_nr, q->key.flags) && !ret); + } while (!ret && test_bit(q->key.bit_nr, q->key.flags)); > But there are just two places where we really need this. So maybe I can > just opencode it there and for others use normal obvious variant. OK. Thanks, Fengguang -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] Get rid of iput() from flusher thread 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara ` (3 preceding siblings ...) 2012-03-09 9:02 ` [PATCH 4/4] writeback: Avoid iput() from flusher thread Jan Kara @ 2012-03-19 5:16 ` Fengguang Wu 4 siblings, 0 replies; 15+ messages in thread From: Fengguang Wu @ 2012-03-19 5:16 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, LKML, linux-mm, Andrew Morton On Fri, Mar 09, 2012 at 10:02:24AM +0100, Jan Kara wrote: > > Hi, > > this patch set changes writeback_sb_inodes() to avoid iput() which might > be problematic (see patch 4 which tries to summarize our email discussions) > for some filesystems. > > Patches 1 and 2 are trivial mostly unrelated fixes (Fengguang, can you can > take these and merge them right away please?). Sure, they have been tested in linux-next for some days. > Patch 3 is a preparatory code > reshuffle and patch 4 removes the __iget() / iput() from flusher thread. These are pretty sane changes towards the right direction (in despite of the complexities involved). > As a side note, your patches to offload writeback from kswapd to flusher > thread then won't need iget/iput either if we pass page references as we talked > so that should resolve most of the concerns. > > What do you think guys? I appreciate the work a lot. Thank you for removing the major problem with the patches on pageout writeback work :-) Regards, Fengguang -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-03-19 16:16 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-09 9:02 [PATCH 0/4] Get rid of iput() from flusher thread Jan Kara 2012-03-09 9:02 ` [PATCH 1/4] fs: Remove bogus wait in write_inode_now() Jan Kara 2012-03-19 6:57 ` Christoph Hellwig 2012-03-09 9:02 ` [PATCH 2/4] writeback: Remove outdated comment Jan Kara 2012-03-19 6:58 ` Christoph Hellwig 2012-03-09 9:02 ` [PATCH 3/4] writeback: Refactor writeback_single_inode() Jan Kara 2012-03-19 5:07 ` Fengguang Wu 2012-03-19 7:13 ` Christoph Hellwig 2012-03-19 16:16 ` Jan Kara 2012-03-09 9:02 ` [PATCH 4/4] writeback: Avoid iput() from flusher thread Jan Kara 2012-03-19 3:42 ` Fengguang Wu 2012-03-19 8:55 ` Christoph Hellwig 2012-03-19 10:46 ` Jan Kara 2012-03-19 11:17 ` Fengguang Wu 2012-03-19 5:16 ` [PATCH 0/4] Get rid of " Fengguang Wu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).