linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>,
	Linux Filesystem Development List <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Linux btrfs Developers List <linux-btrfs@vger.kernel.org>,
	XFS Developers <xfs@oss.sgi.com>
Subject: Re: [PATCH-v4 2/7] vfs: add support for a lazytime mount option
Date: Fri, 28 Nov 2014 17:24:45 +0100	[thread overview]
Message-ID: <20141128162445.GB27902@quack.suse.cz> (raw)
In-Reply-To: <20141127230016.GH14091@thunk.org>

On Thu 27-11-14 18:00:16, Ted Tso wrote:
> On Thu, Nov 27, 2014 at 02:14:21PM +0100, Jan Kara wrote:
> > * change queue_io() to also call
> > 	moved += move_expired_inodes(&wb->b_dirty_time, &wb->b_io, time + 24hours)
> >   For this you need to tweak move_expired_inodes() to take pointer to
> >   timestamp instead of pointer to work but that's trivial. Also you want
> >   probably leave time ->older_than_this value (i.e. without +24 hours) if
> >   we are doing WB_SYNC_ALL writeback. With this you can remove
> >   flush_sb_dirty_time() completely.
> 
> Well.... it's not quite enough.  The problem is that for ext3 and
> ext4, the actual work of writing the inode happens in dirty_inode(),
> not in write_inode().  Which means we need to do something like this.
  Right, I didn't realize this problem.

> I'm not entirely sure whether or not this is too ugly to live;
> personally, I think my hack of handling this in update_time() might be
> preferable....
  Actually handling the copying of timestamps in __writeback_single_inode()
would look fine to me. You mention in your next email, calling
mark_inode_dirty_sync() from flusher may be problematic - why? How is this
any different from calling mark_inode_dirty_sync() from
flush_sb_dirty_time()?

I will note that for a while I thought copying the full inode to on-disk
buffer may be problematic because inode may be in an intermediate state of
some transactional change. But that isn't an issue - if there's any
transactional change in progress, it has a handle open and until the
change is node, thus the buffer with the partial change cannot go to the
journal (transaction cannot commit) until mark_inode_dirty_sync() copies
the final state of the inode.

Another solution may be to convey the information that copying of timestamps
is necessary to ->write_inode method. We could do that via a
flag bit in writeback_control. Each filesystem can then copy timestamps
when this bit is set. But calling mark_inode_dirty_sync() from
__writeback_single_inode() looks simpler to me.

								Honza

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index b93c529..95a42b3 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -253,7 +253,7 @@ static bool inode_dirtied_after(struct inode *inode, unsigned long t)
>   */
>  static int move_expired_inodes(struct list_head *delaying_queue,
>  			       struct list_head *dispatch_queue,
> -			       struct wb_writeback_work *work)
> +			       unsigned long *older_than_this)
>  {
>  	LIST_HEAD(tmp);
>  	struct list_head *pos, *node;
> @@ -264,8 +264,8 @@ static int move_expired_inodes(struct list_head *delaying_queue,
>  
>  	while (!list_empty(delaying_queue)) {
>  		inode = wb_inode(delaying_queue->prev);
> -		if (work->older_than_this &&
> -		    inode_dirtied_after(inode, *work->older_than_this))
> +		if (older_than_this &&
> +		    inode_dirtied_after(inode, *older_than_this))
>  			break;
>  		list_move(&inode->i_wb_list, &tmp);
>  		moved++;
> @@ -309,9 +309,14 @@ out:
>  static void queue_io(struct bdi_writeback *wb, struct wb_writeback_work *work)
>  {
>  	int moved;
> +	unsigned long one_day_later = jiffies + (HZ * 86400);
> +
>  	assert_spin_locked(&wb->list_lock);
>  	list_splice_init(&wb->b_more_io, &wb->b_io);
> -	moved = move_expired_inodes(&wb->b_dirty, &wb->b_io, work);
> +	moved = move_expired_inodes(&wb->b_dirty, &wb->b_io,
> +				    work->older_than_this);
> +	moved += move_expired_inodes(&wb->b_dirty_time, &wb->b_io,
> +				     &one_day_later);
>  	trace_writeback_queue_io(wb, work, moved);
>  }
>  
> @@ -637,6 +642,17 @@ static long writeback_sb_inodes(struct super_block *sb,
>  		}
>  
>  		/*
> +		 * If the inode is marked dirty time but is not dirty,
> +		 * then at last for ext3 and ext4 we need to call
> +		 * mark_inode_dirty_sync in order to get the inode
> +		 * timestamp transferred to the on disk inode, since
> +		 * write_inode is a no-op for those file systems.  HACK HACK HACK
> +		 */
> +		if ((inode->i_state & I_DIRTY_TIME) &&
> +		    ((inode->i_state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) == 0))
> +			mark_inode_dirty_sync(inode);
> +
> +		/*
>  		 * Don't bother with new inodes or inodes being freed, first
>  		 * kind does not need periodic writeout yet, and for the latter
>  		 * kind writeout is handled by the freer.
> @@ -1233,9 +1249,10 @@ void inode_requeue_dirtytime(struct inode *inode)
>  	spin_lock(&bdi->wb.list_lock);
>  	spin_lock(&inode->i_lock);
>  	if ((inode->i_state & I_DIRTY_WB) == 0) {
> -		if (inode->i_state & I_DIRTY_TIME)
> +		if (inode->i_state & I_DIRTY_TIME) {
> +			inode->dirtied_when = jiffies;
>  			list_move(&inode->i_wb_list, &bdi->wb.b_dirty_time);
> -		else
> +		} else
>  			list_del_init(&inode->i_wb_list);
>  	}
>  	spin_unlock(&inode->i_lock);
> 
> Comments?
> 
> 						- Ted
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  parent reply	other threads:[~2014-11-28 16:24 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-26 10:23 [PATCH-v4 0/7] add support for a lazytime mount option Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time() Theodore Ts'o
2014-11-26 19:23   ` Christoph Hellwig
2014-11-27 12:34     ` Jan Kara
2014-11-27 15:25       ` Christoph Hellwig
2014-11-27 14:41     ` Theodore Ts'o
2014-11-27 15:28       ` Christoph Hellwig
2014-11-27 15:33       ` Theodore Ts'o
2014-11-27 16:49         ` Christoph Hellwig
2014-11-27 20:27           ` Theodore Ts'o
2014-12-01  9:28             ` Christoph Hellwig
2014-12-01 15:04               ` Theodore Ts'o
2014-12-01 17:18                 ` David Sterba
2014-12-02  9:20                 ` Christoph Hellwig
2014-12-02 15:09                   ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 2/7] vfs: add support for a lazytime mount option Theodore Ts'o
2014-11-27 13:14   ` Jan Kara
2014-11-27 20:19     ` Theodore Ts'o
2014-11-28 12:41       ` Jan Kara
2014-11-27 23:00     ` Theodore Ts'o
2014-11-28  5:36       ` Theodore Ts'o
2014-11-28 16:24       ` Jan Kara [this message]
2014-11-26 10:23 ` [PATCH-v4 3/7] vfs: don't let the dirty time inodes get more than a day stale Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 4/7] vfs: add lazytime tracepoints for better debugging Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 5/7] vfs: add find_active_inode_nowait() function Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 6/7] ext4: add support for a lazytime mount option Theodore Ts'o
2014-11-26 19:24   ` Christoph Hellwig
2014-11-26 22:48   ` Dave Chinner
2014-11-26 23:10     ` Andreas Dilger
2014-11-26 23:35       ` Dave Chinner
2014-11-27 13:27         ` Jan Kara
2014-11-27 13:32           ` Jan Kara
2014-11-27 15:25             ` Theodore Ts'o
2014-11-27 15:41               ` Jan Kara
2014-11-27 20:13                 ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 7/7] btrfs: add an is_readonly() so btrfs can use common code for update_time() Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141128162445.GB27902@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).