From: Jan Kara <jack@suse.cz>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
linux-fsdevel@vger.kernel.org,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
linux-btrfs@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 2/4] vfs: add support for a lazytime mount option
Date: Tue, 25 Nov 2014 21:18:10 +0100 [thread overview]
Message-ID: <20141125201810.GA18592@quack.suse.cz> (raw)
In-Reply-To: <20141125175716.GC11648@thunk.org>
On Tue 25-11-14 12:57:16, Ted Tso wrote:
> On Tue, Nov 25, 2014 at 06:19:27PM +0100, Jan Kara wrote:
> > Actually, I'd also prefer to do the writing from iput_final(). My main
> > reason is that shrinker starts behaving very differently when you put
> > inodes with I_DIRTY_TIME to the LRU. See inode_lru_isolate() and in
> > particular:
> > /*
> > * Referenced or dirty inodes are still in use. Give them another
> > * pass
> > * through the LRU as we canot reclaim them now.
> > */
> > if (atomic_read(&inode->i_count) ||
> > (inode->i_state & ~I_REFERENCED)) {
> > list_del_init(&inode->i_lru);
> > spin_unlock(&inode->i_lock);
> > this_cpu_dec(nr_unused);
> > return LRU_REMOVED;
> > }
>
> I must be missing something; how would the shirnker behave
> differently? I_DIRTY_TIME shouldn't have any effect on the shrinker;
> note that I_DIRTY_TIME is *not* part of I_DIRTY, and this was quite
> deliberate, because I didn't want I_DIRTY_TIME to have any affect on
> any of the other parts of the writeback or inode management parts.
Sure, but the test tests whether the inode has *any other* bit than
I_REFERENCED set. So I_DIRTY_TIME will trigger the test and we just remove
the inode from lru list. You could exclude I_DIRTY_TIME from this test to
avoid this problem but then the shrinker latency would get much larger
because it will suddently do IO in evict(). So I still think doing the
write in iput_final() is the best solution.
> > Regarding your concern that we'd write the inode when file is closed -
> > that's not true. We'll write the inode only after corresponding dentry is
> > evicted and thus drops inode reference. That doesn't seem too bad to me.
>
> True, fair enough. It's not quite so lazy, but it should be close
> enough.
>
> I'm still not seeing the benefit in waiting until the last possible
> minute to write out the timestamps; evict() can block as it is if
> there are any writeback that needs to be completed, and if the
> writeback happens to pages subject to delalloc, the timestamp update
> could happen for free at that point.
Yeah, doing IO from evict is OK in princible but the change in shrinker
success rate / latency worries me... It would certainly need careful
testing under memory pressure & IO load with lots of outstanding timestamp
updates and see how shrinker behaves (change in CPU consumption, numbers of
evicted inodes, etc.).
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>,
xfs@oss.sgi.com, linux-fsdevel@vger.kernel.org,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/4] vfs: add support for a lazytime mount option
Date: Tue, 25 Nov 2014 21:18:10 +0100 [thread overview]
Message-ID: <20141125201810.GA18592@quack.suse.cz> (raw)
In-Reply-To: <20141125175716.GC11648@thunk.org>
On Tue 25-11-14 12:57:16, Ted Tso wrote:
> On Tue, Nov 25, 2014 at 06:19:27PM +0100, Jan Kara wrote:
> > Actually, I'd also prefer to do the writing from iput_final(). My main
> > reason is that shrinker starts behaving very differently when you put
> > inodes with I_DIRTY_TIME to the LRU. See inode_lru_isolate() and in
> > particular:
> > /*
> > * Referenced or dirty inodes are still in use. Give them another
> > * pass
> > * through the LRU as we canot reclaim them now.
> > */
> > if (atomic_read(&inode->i_count) ||
> > (inode->i_state & ~I_REFERENCED)) {
> > list_del_init(&inode->i_lru);
> > spin_unlock(&inode->i_lock);
> > this_cpu_dec(nr_unused);
> > return LRU_REMOVED;
> > }
>
> I must be missing something; how would the shirnker behave
> differently? I_DIRTY_TIME shouldn't have any effect on the shrinker;
> note that I_DIRTY_TIME is *not* part of I_DIRTY, and this was quite
> deliberate, because I didn't want I_DIRTY_TIME to have any affect on
> any of the other parts of the writeback or inode management parts.
Sure, but the test tests whether the inode has *any other* bit than
I_REFERENCED set. So I_DIRTY_TIME will trigger the test and we just remove
the inode from lru list. You could exclude I_DIRTY_TIME from this test to
avoid this problem but then the shrinker latency would get much larger
because it will suddently do IO in evict(). So I still think doing the
write in iput_final() is the best solution.
> > Regarding your concern that we'd write the inode when file is closed -
> > that's not true. We'll write the inode only after corresponding dentry is
> > evicted and thus drops inode reference. That doesn't seem too bad to me.
>
> True, fair enough. It's not quite so lazy, but it should be close
> enough.
>
> I'm still not seeing the benefit in waiting until the last possible
> minute to write out the timestamps; evict() can block as it is if
> there are any writeback that needs to be completed, and if the
> writeback happens to pages subject to delalloc, the timestamp update
> could happen for free at that point.
Yeah, doing IO from evict is OK in princible but the change in shrinker
success rate / latency worries me... It would certainly need careful
testing under memory pressure & IO load with lots of outstanding timestamp
updates and see how shrinker behaves (change in CPU consumption, numbers of
evicted inodes, etc.).
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-11-25 20:18 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-21 19:59 [PATCH 0/4] add support for a lazytime mount option Theodore Ts'o
2014-11-21 19:59 ` Theodore Ts'o
2014-11-21 19:59 ` [PATCH 1/4] fs: split update_time() into update_time() and write_time() Theodore Ts'o
2014-11-21 19:59 ` Theodore Ts'o
2014-11-21 20:08 ` Chris Mason
2014-11-21 20:08 ` Chris Mason
2014-11-21 21:42 ` Theodore Ts'o
2014-11-21 21:42 ` Theodore Ts'o
2014-11-24 16:38 ` David Sterba
2014-11-24 16:38 ` David Sterba
2014-11-24 17:22 ` Theodore Ts'o
2014-11-24 17:22 ` Theodore Ts'o
2014-11-24 18:09 ` David Sterba
2014-11-24 18:09 ` David Sterba
2014-11-24 15:21 ` Christoph Hellwig
2014-11-24 15:21 ` Christoph Hellwig
2014-11-24 15:56 ` Theodore Ts'o
2014-11-24 15:56 ` Theodore Ts'o
2014-11-24 17:34 ` David Sterba
2014-11-24 17:34 ` David Sterba
2014-11-25 15:51 ` David Sterba
2014-11-25 15:51 ` David Sterba
2014-11-25 17:01 ` Christoph Hellwig
2014-11-21 19:59 ` [PATCH 2/4] vfs: add support for a lazytime mount option Theodore Ts'o
2014-11-21 19:59 ` Theodore Ts'o
2014-11-25 1:52 ` Dave Chinner
2014-11-25 1:52 ` Dave Chinner
2014-11-25 4:33 ` Theodore Ts'o
2014-11-25 4:33 ` Theodore Ts'o
2014-11-25 15:32 ` Boaz Harrosh
2014-11-25 15:32 ` Boaz Harrosh
2014-11-25 17:19 ` Jan Kara
2014-11-25 17:19 ` Jan Kara
2014-11-25 17:57 ` Theodore Ts'o
2014-11-25 17:57 ` Theodore Ts'o
2014-11-25 20:18 ` Jan Kara [this message]
2014-11-25 20:18 ` Jan Kara
2014-11-25 17:30 ` Jan Kara
2014-11-25 17:30 ` Jan Kara
2014-11-25 19:26 ` Theodore Ts'o
2014-11-25 19:26 ` Theodore Ts'o
2014-11-26 0:24 ` Dave Chinner
2014-11-26 0:24 ` Dave Chinner
2014-11-21 19:59 ` [PATCH 3/4] vfs: don't let the dirty time inodes get more than a day stale Theodore Ts'o
2014-11-21 19:59 ` Theodore Ts'o
2014-11-21 20:19 ` Andreas Dilger
2014-11-21 20:19 ` Andreas Dilger
2014-11-21 21:36 ` Theodore Ts'o
2014-11-21 21:36 ` Theodore Ts'o
2014-11-21 23:09 ` Andreas Dilger
2014-11-25 1:53 ` Dave Chinner
2014-11-25 1:53 ` Dave Chinner
2014-11-25 4:45 ` Theodore Ts'o
2014-11-25 4:45 ` Theodore Ts'o
2014-11-25 23:48 ` Dave Chinner
2014-11-25 23:48 ` Dave Chinner
2014-11-26 10:20 ` Theodore Ts'o
2014-11-26 10:20 ` Theodore Ts'o
2014-11-26 22:39 ` Dave Chinner
2014-11-26 22:39 ` Dave Chinner
2014-11-25 17:31 ` Jan Kara
2014-11-25 17:31 ` Jan Kara
2014-11-21 19:59 ` [PATCH 4/4] ext4: add support for a lazytime mount option Theodore Ts'o
2014-11-21 19:59 ` Theodore Ts'o
2014-11-25 17:34 ` Jan Kara
2014-11-25 17:34 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141125201810.GA18592@quack.suse.cz \
--to=jack@suse.cz \
--cc=david@fromorbit.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.