From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [RFC PATCH 00/14] Per-sb tracking of dirty inodes
Date: Tue, 5 Aug 2014 12:31:51 +0200
Message-ID: <20140805103151.GA22276@quack.suse.cz>
In-Reply-To: <20140805052217.GD20518@dastard>

On Tue 05-08-14 15:22:17, Dave Chinner wrote:
> On Fri, Aug 01, 2014 at 12:00:39AM +0200, Jan Kara wrote:
> > Hello,
> >
> > here is my attempt to implement per superblock tracking of dirty inodes.
> > I have two motivations for this:
> > 1) I've tried to get rid of overwriting of inode's dirty time stamp during
> > writeback and filtering of dirty inodes by superblock makes this
> > significantly harder. For similar reasons also improving scalability
> > of inode dirty tracking is more complicated than it has to be.
> > 2) Filesystems like Tux3 (but to some extent also XFS) would like to
> > influence order in which inodes are written back. Currently this isn't
> > possible. Tracking dirty inodes per superblock makes it easy to later
> > implement filesystem callback for writing back inodes and also possibly
> > allow filesystems to implement their own dirty tracking if they desire.
> >
> > The patches pass xfstests run and also some sync livelock avoidance tests
> > I have with 4 filesystems on 2 disks so they should be reasonably sound.
> > Before I go and base more work on this I'd like to hear some feedback about
> > whether people find this sane and workable.
> >
> > After this patch set it is trivial to provide a per-sb callback for writeback
> > (at level of writeback_inodes()). It is also fairly easy to allow filesystem to
> > completely override dirty tracking (only needs some restructuring of
> > mark_inode_dirty()). I can write these as a proof-of-concept patches for Tux3
> > guys once the general approach in this patch set is acked. Or if there are
> > some in-tree users (XFS?, btrfs?) I can include them in the patch set.
> >
> > Any comments welcome!
>
> My initial performance tests haven't shown any regressions, but
> those same tests show that we still need to add plugging to
> writeback_inodes(). Patch with numbers below. I haven't done any
> sanity testing yet - I'll do that over the next few days...

Thanks for the tests! I was concentrating on the no-regression part first,
with possible performance improvements to be added on top of that. I have
added your plugging patch to the series. Thanks for that.

> FWIW, the patch set doesn't solve the sync lock contention problems -
> populate all of memory with millions of inodes on a mounted
> filesystem, then run xfs/297 on a different filesystem. The system
> will trigger major contention in sync_inodes_sb() and
> inode_sb_list_add() on the inode_sb_list_lock because xfs/297 will
> cause lots of concurrent sync() calls to occur. The system will
> perform really badly on anything filesystem related while this
> contention occurs. Normally xfs/297 runs in 36s on the machine I
> just ran this test on, with the extra cached inodes it's been
> running for 15 minutes burning 8-9 CPU cores and there's no end in
> sight....

Yes, I didn't mean to address this yet. When I last looked into this
problem, the redirty_tail() logic made handling of dirty & under-writeback
inodes really difficult (I didn't want to add another list_head to
struct inode for a completely separate under-writeback list). So I deferred
this until redirty_tail() gets sorted out. But maybe I should revisit this
on top of the per-sb dirty tracking, unless you beat me to it ;).
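
To illustrate the list_head remark above: an object with a single intrusive
list link can sit on only one list at a time, so a separate under-writeback
list would either need a second link in the object or would pull the object
off the dirty list. The following is a minimal userspace sketch of that
constraint only. It is not kernel code, and every name in it (list_node,
toy_sb, toy_inode, toy_mark_dirty, toy_start_writeback) is hypothetical and
made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Remove n from whatever list it is on and leave it self-linked. */
static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Unlink n from its current list and append it to the tail of head. */
static void list_move_to_tail(struct list_node *n, struct list_node *head)
{
	list_unlink(n);
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical per-superblock state with two separate lists. */
struct toy_sb {
	struct list_node dirty;
	struct list_node writeback;
};

/* Hypothetical inode with exactly one list link. */
struct toy_inode {
	struct list_node link;
};

static void toy_sb_init(struct toy_sb *sb)
{
	list_init(&sb->dirty);
	list_init(&sb->writeback);
}

static void toy_inode_init(struct toy_inode *inode)
{
	list_init(&inode->link);
}

static void toy_mark_dirty(struct toy_sb *sb, struct toy_inode *inode)
{
	list_move_to_tail(&inode->link, &sb->dirty);
}

/* With one link, entering the writeback list means leaving the dirty list. */
static void toy_start_writeback(struct toy_sb *sb, struct toy_inode *inode)
{
	list_move_to_tail(&inode->link, &sb->writeback);
}
```

In this sketch, an inode that is redirtied while on the writeback list has
to be moved back with toy_mark_dirty(), at which point it is no longer on
the writeback list. Keeping it on both would take a second list_node in
struct toy_inode.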
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR