* [PATCH] f2fs: check bdi->dirty_exceeded when trying to skip data writes
@ 2014-06-28 11:58 Jaegeuk Kim
2014-07-02 5:54 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Jaegeuk Kim @ 2014-06-28 11:58 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim
If we don't check the current backing device status, balance_dirty_pages can
fall into an infinite pausing routine.
This can occur when a lot of directories, each containing files, have only a
small number of dirty dentry pages.
Reported-by: Brian Chadwick <brianchad@westnet.com.au>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
fs/f2fs/node.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 56907c6..a90f51d 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -43,6 +43,8 @@ bool available_free_memory(struct f2fs_sb_info *sbi, int type)
 		mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >> 12;
 		res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
 	} else if (type == DIRTY_DENTS) {
+		if (sbi->sb->s_bdi->dirty_exceeded)
+			return false;
 		mem_size = get_pages(sbi, F2FS_DIRTY_DENTS);
 		res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 1);
 	}
--
1.8.5.2 (Apple Git-48)
* Re: [PATCH] f2fs: check bdi->dirty_exceeded when trying to skip data writes
2014-06-28 11:58 [PATCH] f2fs: check bdi->dirty_exceeded when trying to skip data writes Jaegeuk Kim
@ 2014-07-02 5:54 ` Andrew Morton
2014-07-02 9:31 ` Jaegeuk Kim
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2014-07-02 5:54 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel
On Sat, 28 Jun 2014 20:58:38 +0900 Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> If we don't check the current backing device status, balance_dirty_pages can
> fall into an infinite pausing routine.
>
> This can occur when a lot of directories, each containing files, have only a
> small number of dirty dentry pages.
>
> ...
>
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -43,6 +43,8 @@ bool available_free_memory(struct f2fs_sb_info *sbi, int type)
> mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >> 12;
> res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
> } else if (type == DIRTY_DENTS) {
> + if (sbi->sb->s_bdi->dirty_exceeded)
> + return false;
> mem_size = get_pages(sbi, F2FS_DIRTY_DENTS);
> res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 1);
> }
err, filesystems should not be playing around with this.
Perhaps VFS changes are needed. Please tell us much much more about
what is going on here.
* Re: [PATCH] f2fs: check bdi->dirty_exceeded when trying to skip data writes
2014-07-02 5:54 ` Andrew Morton
@ 2014-07-02 9:31 ` Jaegeuk Kim
2014-07-02 19:12 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Jaegeuk Kim @ 2014-07-02 9:31 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel
On Tue, Jul 01, 2014 at 10:54:20PM -0700, Andrew Morton wrote:
> On Sat, 28 Jun 2014 20:58:38 +0900 Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>
> > If we don't check the current backing device status, balance_dirty_pages can
> > fall into an infinite pausing routine.
> >
> > This can occur when a lot of directories, each containing files, have only a
> > small number of dirty dentry pages.
> >
> > ...
> >
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -43,6 +43,8 @@ bool available_free_memory(struct f2fs_sb_info *sbi, int type)
> > mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >> 12;
> > res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
> > } else if (type == DIRTY_DENTS) {
> > + if (sbi->sb->s_bdi->dirty_exceeded)
> > + return false;
> > mem_size = get_pages(sbi, F2FS_DIRTY_DENTS);
> > res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 1);
> > }
>
> err, filesystems should not be playing around with this.
>
> Perhaps VFS changes are needed. Please tell us much much more about
> what is going on here.
f2fs has a feature that throttles I/O so that bios can be merged at the fs
level as much as possible, by bypassing writepages in some cases.
One of these cases involves dentry pages.
If a directory has only a small number of dirty dentry pages and there is enough
free memory, f2fs skips writepages.
The code in f2fs_write_data_pages is:
	if (S_ISDIR(inode->i_mode) && wbc->sync_mode == WB_SYNC_NONE &&
			get_dirty_dents(inode) < nr_pages_to_skip(sbi, DATA) &&
			available_free_memory(sbi, DIRTY_DENTS))
		goto skip_write;
However, if a great many directories are created in a very short time and each
of them has only a small number of dirty pages, this affects
balance_dirty_pages.
In that case, balance_dirty_pages waits for the number of dirty pages to
decrease, but f2fs keeps skipping the flush of those dirty pages.
So, this patch adds a condition to avoid that behavior by checking the bdi's
dirty_exceeded.
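To make the stall concrete, here is a rough user-space model of the skip
decision described above. It is only a sketch, not the kernel code: the
model_* types and functions, the check_bdi switch, and the field named
skip_threshold are all hypothetical stand-ins for the f2fs state
(get_pages(sbi, F2FS_DIRTY_DENTS), nr_pages_to_skip(sbi, DATA),
bdi->dirty_exceeded), and any numbers fed to it are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these names do not exist in the kernel. */
struct model_bdi {
	bool dirty_exceeded;		/* models bdi->dirty_exceeded */
};

struct model_sb {
	struct model_bdi bdi;
	unsigned long totalram;		/* total RAM, in pages */
	unsigned int ram_thresh;	/* percent, models nm_i->ram_thresh */
	unsigned long dirty_dents;	/* models get_pages(sbi, F2FS_DIRTY_DENTS) */
	unsigned long skip_threshold;	/* models nr_pages_to_skip(sbi, DATA) */
};

/*
 * Models the DIRTY_DENTS branch of available_free_memory().
 * check_bdi selects whether the check added by the patch is applied.
 */
static bool model_free_memory_for_dents(const struct model_sb *sb,
					bool check_bdi)
{
	assert(sb->ram_thresh <= 100);

	if (check_bdi && sb->bdi.dirty_exceeded)
		return false;
	return sb->dirty_dents <
		((sb->totalram * sb->ram_thresh / 100) >> 1);
}

/* Models the skip_write decision for one directory under WB_SYNC_NONE. */
static bool model_skip_dir_write(const struct model_sb *sb,
				 unsigned long dir_dirty, bool check_bdi)
{
	return dir_dirty < sb->skip_threshold &&
	       model_free_memory_for_dents(sb, check_bdi);
}

/*
 * One background flusher pass over ndirs directories.
 * Returns the number of dirty dentry pages actually written.
 */
static unsigned long model_flush_pass(struct model_sb *sb,
				      unsigned long *dir_dirty, size_t ndirs,
				      bool check_bdi)
{
	unsigned long written = 0;

	for (size_t i = 0; i < ndirs; i++) {
		if (dir_dirty[i] == 0 ||
		    model_skip_dir_write(sb, dir_dirty[i], check_bdi))
			continue;	/* nothing to do, or write skipped */
		written += dir_dirty[i];
		sb->dirty_dents -= dir_dirty[i];
		dir_dirty[i] = 0;
	}
	return written;
}
```

With dirty_exceeded set and every directory below skip_threshold, a pass with
check_bdi == false writes nothing, so the dirty count that the throttled task
is waiting on never drops. The same pass with check_bdi == true writes every
directory's pages.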
So, is there any recommendation instead of this kind of workaround?
IMHO, how about setting wbc->sync_mode to WB_SYNC_ALL when this case is
detected?
Thanks,
--
Jaegeuk Kim
* Re: [PATCH] f2fs: check bdi->dirty_exceeded when trying to skip data writes
2014-07-02 9:31 ` Jaegeuk Kim
@ 2014-07-02 19:12 ` Andrew Morton
0 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2014-07-02 19:12 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: linux-fsdevel, linux-kernel, linux-f2fs-devel
On Wed, 2 Jul 2014 02:31:26 -0700 Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> > > --- a/fs/f2fs/node.c
> > > +++ b/fs/f2fs/node.c
> > > @@ -43,6 +43,8 @@ bool available_free_memory(struct f2fs_sb_info *sbi, int type)
> > > mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >> 12;
> > > res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
> > > } else if (type == DIRTY_DENTS) {
> > > + if (sbi->sb->s_bdi->dirty_exceeded)
> > > + return false;
> > > mem_size = get_pages(sbi, F2FS_DIRTY_DENTS);
> > > res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 1);
> > > }
> >
> > err, filesystems should not be playing around with this.
> >
> > Perhaps VFS changes are needed. Please tell us much much more about
> > what is going on here.
>
> f2fs has a feature that throttles I/O so that bios can be merged at the fs
> level as much as possible, by bypassing writepages in some cases.
OK, I just looked at fs/f2fs/data.c. AFAICT it has basically
bypassed/reimplemented/worked around the VFS.
That may be good or it may be bad. Maybe it indicates shortcomings in
the VFS, maybe it doesn't. Presumably there were good reasons for this
design but I am unable to determine what they were, because the code is
undocumented. No description of what it is trying to achieve or how or
why. It's just a great blob of C statements.
This is all rather a shame, because perhaps there were opportunities
here to improve the core VFS.
Oh well, I think I'll pretend I never saw it. Good luck!