From: Dave Chinner <david@fromorbit.com>
To: Daniel Phillips <daniel@phunq.net>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: Re: [RFC][PATCH 1/2] Add a super operation for writeback
Date: Mon, 2 Jun 2014 13:15:26 +1000 [thread overview]
Message-ID: <20140602031526.GS14410@dastard> (raw)
In-Reply-To: <538B9DEE.20800@phunq.net>
On Sun, Jun 01, 2014 at 02:41:02PM -0700, Daniel Phillips wrote:
> ---
> From: Daniel Phillips <daniel@tux3.org>
> Subject: [PATCH] Add a super operation for writeback
>
> Add a "writeback" super operation to be called in the
> form:
>
> progress = sb->s_op->writeback(sb, &wbc, &pages);
>
> The filesystem is expected to flush some inodes to disk
> and return progress of at least 1, or if no inodes are
> flushed, return progress of zero. The filesystem should
> try to flush at least the number of pages specified in
> *pages, or if that is not possible, return approximately
> the number of pages not flushed into *pages.
>
> Within the ->writeback callback, the filesystem should
> call inode_writeback_done(inode) for each inode flushed
> (and therefore set clean) or inode_writeback_touch(inode)
> for any inode that will be retained dirty in cache.
>
> Signed-off-by: Daniel Phillips <daniel@tux3.org>
> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> ---
>
> fs/fs-writeback.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++---
> include/linux/fs.h | 4 +++
> 2 files changed, 60 insertions(+), 3 deletions(-)
>
> diff -puN fs/fs-writeback.c~core-writeback fs/fs-writeback.c
> --- linux-tux3/fs/fs-writeback.c~core-writeback 2014-05-31 06:43:19.416031712 +0900
> +++ linux-tux3-hirofumi/fs/fs-writeback.c 2014-05-31 06:44:46.087904373 +0900
> @@ -192,6 +192,35 @@ void inode_wb_list_del(struct inode *ino
> }
>
> /*
> + * Remove inode from writeback list if clean.
> + */
> +void inode_writeback_done(struct inode *inode)
> +{
> + struct backing_dev_info *bdi = inode_to_bdi(inode);
> +
> + spin_lock(&bdi->wb.list_lock);
> + spin_lock(&inode->i_lock);
> + if (!(inode->i_state & I_DIRTY))
> + list_del_init(&inode->i_wb_list);
> + spin_unlock(&inode->i_lock);
> + spin_unlock(&bdi->wb.list_lock);
> +}
> +EXPORT_SYMBOL_GPL(inode_writeback_done);
> +
> +/*
> + * Add inode to writeback dirty list with current time.
> + */
> +void inode_writeback_touch(struct inode *inode)
> +{
> + struct backing_dev_info *bdi = inode->i_sb->s_bdi;
> + spin_lock(&bdi->wb.list_lock);
> + inode->dirtied_when = jiffies;
> + list_move(&inode->i_wb_list, &bdi->wb.b_dirty);
> + spin_unlock(&bdi->wb.list_lock);
> +}
> +EXPORT_SYMBOL_GPL(inode_writeback_touch);
You should be able to use redirty_tail() for this....
Hmmmm - this is using the wb dirty lists and locks, but you
don't pass the wb structure to the writeback callout? That seem
wrong to me - why would you bother manipulating these lists if you
aren't using them to track dirty inodes in the first place?
> +
> +/*
> * Redirty an inode: set its when-it-was dirtied timestamp and move it to the
> * furthest end of its superblock's dirty-inode list.
> *
> @@ -593,9 +622,9 @@ static long writeback_chunk_size(struct
> *
> * Return the number of pages and/or inodes written.
> */
> -static long writeback_sb_inodes(struct super_block *sb,
> - struct bdi_writeback *wb,
> - struct wb_writeback_work *work)
> +static long __writeback_sb_inodes(struct super_block *sb,
> + struct bdi_writeback *wb,
> + struct wb_writeback_work *work)
> {
> struct writeback_control wbc = {
> .sync_mode = work->sync_mode,
> @@ -710,6 +739,30 @@ static long writeback_sb_inodes(struct s
> return wrote;
> }
>
> +static long writeback_sb_inodes(struct super_block *sb,
> + struct bdi_writeback *wb,
> + struct wb_writeback_work *work)
> +{
> + if (sb->s_op->writeback) {
> + struct writeback_control wbc = {
> + .sync_mode = work->sync_mode,
> + .tagged_writepages = work->tagged_writepages,
> + .for_kupdate = work->for_kupdate,
> + .for_background = work->for_background,
> + .for_sync = work->for_sync,
> + .range_cyclic = work->range_cyclic,
> + };
> + long ret;
> +
> + spin_unlock(&wb->list_lock);
> + ret = sb->s_op->writeback(sb, &wbc, &work->nr_pages);
> + spin_lock(&wb->list_lock);
> + return ret;
> + }
> +
> + return __writeback_sb_inodes(sb, wb, work);
> +}
The first thing that __writeback_sb_inodes() does is create a struct
writeback_control from the wb_writeback_work. That should be done
here and passed to __writeback_sb_inodes(), which should be renamed
"generic_writeback_sb_inodes()". Also, all the fields in the wbc
need to be initialised correctly (i.e including range_start/end).
Further, a writeback implementation will need to use the generic bdi
list and lock structures and so we need to pass the bdi_writeback.
Similarly, if we are going to pass nr_pages, we might as well pass
the entire work structure.
Finally, I don't like the way the wb->list_lock is treated
differently by this code. I suspect that we need to rationalise the
layering of the wb->list_lock as it is currently not clear what it
protects and what (nested) layers of the writeback code actually
require it.
What I'd like to see is this work:
struct super_ops ... = {
....
.writeback = generic_writeback_sb_inodes,
....
And that means writeback_sb_inodes() would become:
static long writeback_sb_inodes(struct super_block *sb,
struct bdi_writeback *wb,
struct wb_writeback_work *work)
{
struct writeback_control wbc = {
.sync_mode = work->sync_mode,
.tagged_writepages = work->tagged_writepages,
.for_kupdate = work->for_kupdate,
.for_background = work->for_background,
.for_sync = work->for_sync,
.range_cyclic = work->range_cyclic,
.range_start = 0,
.range_end = LLONG_MAX,
};
if (sb->s_op->writeback)
return sb->s_op->writeback(sb, wb, work, &wbc);
return generic_writeback_sb_inodes(sb, wb, work, &wbc);
}
And the higher/lower layers deal with wb->list_lock appropriately.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2014-06-02 3:15 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-01 21:41 [RFC][PATCH 1/2] Add a super operation for writeback Daniel Phillips
2014-06-01 21:42 ` [RFC][PATCH 2/2] tux3: Use writeback hook to remove duplicated core code Daniel Phillips
2014-06-02 3:30 ` Dave Chinner
2014-06-02 20:07 ` Daniel Phillips
2014-06-02 3:15 ` Dave Chinner [this message]
2014-06-02 20:02 ` [RFC][PATCH 1/2] Add a super operation for writeback Daniel Phillips
2014-06-03 3:33 ` Dave Chinner
2014-06-03 7:01 ` Daniel Phillips
2014-06-03 7:26 ` Daniel Phillips
2014-06-03 7:47 ` OGAWA Hirofumi
2014-06-03 8:12 ` Dave Chinner
2014-06-03 8:57 ` OGAWA Hirofumi
2014-06-03 7:52 ` Dave Chinner
2014-06-03 14:05 ` Jan Kara
2014-06-03 14:14 ` Christoph Hellwig
2014-06-03 14:25 ` Theodore Ts'o
2014-06-03 15:21 ` Jan Kara
2014-06-03 22:37 ` Daniel Phillips
2014-06-04 20:16 ` Jan Kara
2014-06-02 8:30 ` Christian Stroetmann
2014-06-03 3:39 ` Dave Chinner
2014-06-03 5:30 ` Christian Stroetmann
2014-06-03 14:57 ` Theodore Ts'o
2014-06-03 16:30 ` Christian Stroetmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140602031526.GS14410@dastard \
--to=david@fromorbit.com \
--cc=akpm@linux-foundation.org \
--cc=daniel@phunq.net \
--cc=hirofumi@mail.parknet.co.jp \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.