linux-fsdevel.vger.kernel.org archive mirror
From: Boaz Harrosh <bharrosh@panasas.com>
To: Jan Kara <jack@suse.cz>
Cc: lsf-pc@lists.linuxfoundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [LSF/MM TOPIC] Writeback - current state and future
Date: Sun, 06 Feb 2011 12:43:20 +0200
Message-ID: <4D4E7B48.9020500@panasas.com>
In-Reply-To: <20110204164222.GG4104@quack.suse.cz>

On 02/04/2011 06:42 PM, Jan Kara wrote:
>   Hi,
> 
>   I'd like to have one session about writeback. The content would highly
> depend on the current state of things but on a general level, I'd like to
> quickly sum up what went into the kernel (or is mostly ready to go) since
> last LSF (handling of background writeback, livelock avoidance), what is
> being worked on - IO-less balance_dirty_pages() (if it won't be in the
> mostly done section), what other things need to be improved (kswapd
> writeout, writeback_inodes_sb_if_idle() mess, come to my mind now)
> 
> 								Honza

Ha, I most certainly want to participate in this talk. I wanted to
suggest it myself.

Topics that I would like to raise on the matter.

[IO-less balance_dirty_pages]
As said, I'd really like it if Wu or Jan could explain more about the math
and IO patterns that went into this tremendous work, and how it should
affect us fs maintainers in terms of advantages and disadvantages. If
digging too deeply into this is not interesting for everybody, perhaps
a side meeting with fewer people is also possible.

[Aligned write-back]
I have just finished RAID5/6 support in my filesystem and will be sending
a patch that tries very aggressively to align IO on stripe boundaries.
I did not take the btrfs way of copying and pasting the write_cache_pages()
function to better fit the bill. Instead I used wbc->nr_to_write to trim IO
down to stripe alignment. Together with some internal structure games, I now
have a much better situation than with the untouched code. By better I mean
that with simple linear dd IO on a file, I see on the order of 90% aligned
IOs, as opposed to 20% before that patch. The only remaining issue, which I
have not fully investigated yet, is this: I do not want any residue left
over outside the writepages() call, so that I do not need to sync and lock
with flush or keep a "flushing" flag in my writeout path. So what I still
get is that sometimes the writeback is able to catch up with dd and I get
short writes at the remainder, which makes the end of this call and the
start of the next call unaligned.
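
To make the trimming idea concrete, here is a minimal userspace sketch of
the arithmetic, not the actual patch. The function name trim_to_stripe()
and its parameters are made up for illustration; the only assumption taken
from the text above is that the page budget (as in wbc->nr_to_write) gets
shortened so the IO ends on a stripe boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): shorten a writeback page
 * budget so that the IO ends on a stripe boundary.
 *
 * start_page   - index of the first page to be written
 * nr_to_write  - page budget handed to the writeout path
 * stripe_pages - stripe size expressed in pages (assumed > 0)
 */
static long trim_to_stripe(uint64_t start_page, long nr_to_write,
                           uint32_t stripe_pages)
{
    assert(stripe_pages > 0);
    assert(nr_to_write >= 0);

    uint64_t end = start_page + (uint64_t)nr_to_write;
    uint64_t aligned_end = end - (end % stripe_pages);

    /*
     * No stripe boundary falls inside the budget: leave it untouched
     * so that writeback still makes forward progress.
     */
    if (aligned_end <= start_page)
        return nr_to_write;

    return (long)(aligned_end - start_page);
}
```

With a 64-page stripe, a budget of 1000 pages starting at page 0 is cut to
960 pages, while a budget that is smaller than the distance to the next
boundary is passed through unchanged. That second case is exactly the
short-write remainder described above.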

I envision a simple BDI member, just like ra_pages for readahead, that would
better govern the writeback chunking (and is accounted for in the fairness).
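
A rough sketch of what such a member could look like, purely as an
illustration. struct bdi_sketch is a simplified stand-in, not the kernel's
backing_dev_info; ra_pages mirrors the existing readahead field, while
wb_chunk_pages and wb_chunk_budget() are hypothetical names that do not
exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a backing device's tunables (illustration only). */
struct bdi_sketch {
    unsigned long ra_pages;        /* readahead window, in pages */
    unsigned long wb_chunk_pages;  /* HYPOTHETICAL: preferred writeback chunk */
};

/*
 * Round a writeback budget down to a whole number of preferred chunks.
 * The returned value is what a caller would both write and charge when
 * accounting for fairness. A zero hint or a budget smaller than one
 * chunk is passed through unchanged.
 */
static unsigned long wb_chunk_budget(const struct bdi_sketch *bdi,
                                     unsigned long budget)
{
    assert(bdi != NULL);

    if (bdi->wb_chunk_pages == 0 || budget < bdi->wb_chunk_pages)
        return budget;

    return budget - (budget % bdi->wb_chunk_pages);
}
```

A filesystem would set the hint once, for example to its stripe size in
pages, and generic writeback would then hand out budgets in multiples of it
without each filesystem having to reimplement the trimming.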

[Smarter/more cache eviction patterns]
I love it when I do a simple dd test in a UML (300 MB of RAM) and halfway
through I get these fat WARN_ONs of the iscsi tcp writeback failing to
allocate network buffers. And I did lower the writeback ratio a lot, because
the default of 20% has not worked for a long time, since around 2.6.35 or
2.6.36. UML is not the only affected system; any low-memory, embedded-like
but 64-bit system would be. Now the IO does complete eventually, but the
performance is down to 20%.

Now for a dd- or cp-like work pattern I would like the pages to be freed much
more aggressively, like right after IO completion, because I most certainly
will not use them again. On the other side, git for example will write a big
sequential file and then immediately turn around and read it, so cache
presence is a win. But I think we can still come up with good patterns that
take into account the number of file handles opened on an inode, and some
hot-inode history. (Some of this history we already have with the security
plugins.)
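
As a strawman for discussion, the kind of decision described here could be
sketched as below. Everything in it is hypothetical: struct inode_hint, its
fields, and choose_policy() are invented for illustration and only encode
the two inputs mentioned above, the number of open handles and some
read-after-write history.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cache_policy {
    CACHE_KEEP,          /* leave pages cached after writeback */
    CACHE_DROP_AFTER_IO  /* free pages right after IO completion */
};

/* HYPOTHETICAL per-inode hints; none of these are real kernel fields. */
struct inode_hint {
    unsigned int open_handles;  /* handles currently open on the inode */
    unsigned int recent_reads;  /* reads observed shortly after writes */
    bool sequential_write;      /* current writer is streaming linearly */
};

static enum cache_policy choose_policy(const struct inode_hint *h)
{
    assert(h != NULL);

    /* Random or mixed IO: no evidence that dropping pages helps. */
    if (!h->sequential_write)
        return CACHE_KEEP;

    /* git-like pattern: the data was read back soon after being written. */
    if (h->recent_reads > 0)
        return CACHE_KEEP;

    /* Somebody besides the writer has the file open and may read it. */
    if (h->open_handles > 1)
        return CACHE_KEEP;

    /* dd/cp-like pattern: one streaming writer, nobody reading back. */
    return CACHE_DROP_AFTER_IO;
}
```

The ordering of the checks is a design choice: any sign of reuse wins over
the streaming-write signal, so a wrong guess costs some memory rather than
a re-read from disk.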

And there are other topics that I had, but can't remember right now.

Thanks
Boaz


Thread overview: 13+ messages
2011-02-04 16:42 [LSF/MM TOPIC] Writeback - current state and future Jan Kara
2011-02-04 18:06 ` Curt Wohlgemuth
2011-02-05  7:55   ` Tao Ma
2011-02-06 10:43 ` Boaz Harrosh [this message]
2011-02-06 15:13   ` Sorin Faibish
2011-02-06 16:24     ` Boaz Harrosh
2011-02-11 14:47     ` Jan Kara
2011-02-11 16:22       ` sfaibish
2011-02-26 21:03         ` Sorin Faibish
2011-02-26 21:07           ` [Lsf-pc] " James Bottomley
2011-02-26 23:21             ` Sorin Faibish
2011-02-26 23:48               ` James Bottomley
2011-02-27  1:50                 ` Trond Myklebust
