From: Jan Kara <jack@suse.cz>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>, Jan Kara <jack@suse.cz>,
Mike Snitzer <snitzer@redhat.com>,
linux-scsi@vger.kernel.org, neilb@suse.de, dm-devel@redhat.com,
linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
"Darrick J. Wong" <djwong@us.ibm.com>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] a few storage topics
Date: Mon, 23 Jan 2012 17:30:24 +0100 [thread overview]
Message-ID: <20120123163024.GD28526@quack.suse.cz> (raw)
In-Reply-To: <4F1BF39A.6090709@panasas.com>
On Sun 22-01-12 13:31:38, Boaz Harrosh wrote:
> On 01/19/2012 11:39 PM, Andrea Arcangeli wrote:
> > On Thu, Jan 19, 2012 at 09:52:11PM +0100, Jan Kara wrote:
> >> anything. So what will be cheaper depends on how often are redirtied pages
> >> under IO. This is rather rare because pages aren't flushed all that often.
> >> So the effect of stable pages in not observable on throughput. But you can
> >> certainly see it on max latency...
> >
> > I see your point. A problem with migrate though is that the page must
> > be pinned by the I/O layer to prevent migration to free the page under
> > I/O, or how else it could be safe to read from a freed page? And if
> > the page is pinned migration won't work at all. See page_freeze_refs
> > in migrate_page_move_mapping. So the pinning issue would need to be
> > handled somehow. It's needed for example when there's an O_DIRECT
> > read, and the I/O is going to the page, if the page is migrated in
> > that case, we'd lose a part of the I/O. Differentiating how many page
> > pins are ok to be ignored by migration won't be trivial but probably
> > possible to do.
> >
> > Another way maybe would be to detect when there's too much re-dirtying
> > of pages in flight in a short amount of time, and to start the bounce
> > buffering and stop waiting, until the re-dirtying stops, and then you
> > stop the bounce buffering. But unlike migration, it can't prevent an
> > initial burst of high fault latency...
>
> Or just change that RT program that is one - latency bound but, two - does
> unpredictable, statistically bad, things to a memory mapped file.
Right. That's what I told the RT guy as well :) But he didn't like to
hear that because it meant more coding for him.
> Can a memory-mapped-file writer have some control on the time of
> writeback with data_sync or such, or it's purely: Timer fired, Kernel see
> a dirty page, start a writeout? What about if the application maps a
> portion of the file at a time, and the Kernel gets more lazy on an active
> memory mapped region. (That's what windows NT do. It will never IO a mapped
> section unless in OOM conditions. The application needs to map small sections
> and unmap to IO. It's more of a direct_io than mmap)
You can always start writeback by sync_file_range() but you have no
guarantees what writeback does. Also if you need to redirty the page
pernamently (e.g. it's a head of your transaction log), there's simply no
good time when it can be written when you also want stable pages.
> In any case, if you are very latency sensitive an mmap writeout is bad for
> you. Not only because of this new problem, but because mmap writeout can
> sync with tones of other things, that are do to memory management. (As mentioned
> by Andrea). The best for latency sensitive application is asynchronous direct-io
> by far. Only with asynchronous and direct-io you can have any real control on
> your latency. (I understand they used to have empirically observed latency bound
> but that is just luck, not real control)
>
> BTW: The application mentioned would probably not want it's IO bounced at
> the block layer, other wise why would it use mmap if not for preventing
> the copy induced by buffer IO?
Yeah, I'm not sure why their design was as it was.
> All that said, a mount option to ext4 (Is ext4 used?) to revert to the old
> behavior is the easiest solution. When originally we brought this up in LSF
> my thought was that the block request Q should have some flag that says
> need_stable_pages. If set by the likes of dm/md-raid, iscsi-with-data-signed, DIFF
> enabled devices and so on, and the FS does not guaranty/wants stable pages
> then an IO bounce is set up. But if not set then the like of ext4 need not
> bother.
There's no mount option. The behavior is on unconditionally. And so far I
have not seen enough people complain to introduce something like that -
automatic logic is a different thing of course. That might be nice to have.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2012-01-23 16:30 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-21 17:26 [CFP] Linux Storage, Filesystem & Memory Management Summit 2012 (April 1-2) Williams, Dan J
2012-01-17 20:06 ` [LSF/MM TOPIC] a few storage topics Mike Snitzer
2012-01-17 21:36 ` [Lsf-pc] " Jan Kara
2012-01-18 22:58 ` Darrick J. Wong
2012-01-18 23:22 ` Jan Kara
2012-01-18 23:42 ` Boaz Harrosh
2012-01-19 9:46 ` Jan Kara
2012-01-19 15:08 ` Andrea Arcangeli
2012-01-19 20:52 ` Jan Kara
2012-01-19 21:39 ` Andrea Arcangeli
2012-01-22 11:31 ` Boaz Harrosh
2012-01-23 16:30 ` Jan Kara [this message]
2012-01-22 12:21 ` Boaz Harrosh
2012-01-23 16:18 ` Jan Kara
2012-01-23 17:53 ` Andrea Arcangeli
2012-01-23 18:28 ` Jeff Moyer
2012-01-23 18:56 ` Andrea Arcangeli
2012-01-23 19:19 ` Jeff Moyer
2012-01-24 15:15 ` Chris Mason
2012-01-24 16:56 ` [dm-devel] " Christoph Hellwig
2012-01-24 17:01 ` Andreas Dilger
2012-01-24 17:06 ` [Lsf-pc] [dm-devel] " Andrea Arcangeli
2012-01-24 17:08 ` Chris Mason
2012-01-24 17:08 ` [Lsf-pc] " Andreas Dilger
2012-01-24 18:05 ` [dm-devel] " Jeff Moyer
2012-01-24 18:40 ` Christoph Hellwig
2012-01-24 19:07 ` Chris Mason
2012-01-24 19:14 ` Jeff Moyer
2012-01-24 20:09 ` [Lsf-pc] [dm-devel] " Jan Kara
2012-01-24 20:13 ` [Lsf-pc] " Jeff Moyer
2012-01-24 20:39 ` [Lsf-pc] [dm-devel] " Jan Kara
2012-01-24 20:59 ` Jeff Moyer
2012-01-24 21:08 ` Jan Kara
2012-01-25 3:29 ` Wu Fengguang
2012-01-25 6:15 ` [Lsf-pc] " Andreas Dilger
2012-01-25 6:35 ` [Lsf-pc] [dm-devel] " Wu Fengguang
2012-01-25 14:00 ` Jan Kara
2012-01-26 12:29 ` Andreas Dilger
2012-01-27 17:03 ` Ted Ts'o
2012-01-26 16:25 ` Vivek Goyal
2012-01-26 20:37 ` Jan Kara
2012-01-26 22:34 ` Dave Chinner
2012-01-27 3:27 ` Wu Fengguang
2012-01-27 5:25 ` Andreas Dilger
2012-01-27 7:53 ` Wu Fengguang
2012-01-25 14:33 ` Steven Whitehouse
2012-01-25 14:45 ` Jan Kara
2012-01-25 16:22 ` Loke, Chetan
2012-01-25 16:40 ` Steven Whitehouse
2012-01-25 17:08 ` Loke, Chetan
2012-01-25 17:32 ` James Bottomley
2012-01-25 18:28 ` Loke, Chetan
2012-01-25 18:37 ` Loke, Chetan
2012-01-25 18:37 ` James Bottomley
2012-01-25 20:06 ` Chris Mason
2012-01-25 22:46 ` Andrea Arcangeli
2012-01-25 22:58 ` Jan Kara
2012-01-26 8:59 ` Boaz Harrosh
2012-01-26 16:40 ` Loke, Chetan
2012-01-26 17:00 ` Andreas Dilger
2012-01-26 17:16 ` Loke, Chetan
2012-02-03 12:37 ` Wu Fengguang
2012-01-26 22:38 ` Dave Chinner
2012-01-26 16:17 ` Loke, Chetan
2012-01-25 18:44 ` Boaz Harrosh
2012-02-03 12:55 ` Wu Fengguang
2012-01-24 19:11 ` [dm-devel] [Lsf-pc] " Jeff Moyer
2012-01-26 22:31 ` Dave Chinner
2012-01-24 17:12 ` Jeff Moyer
2012-01-24 17:32 ` Chris Mason
2012-01-24 18:14 ` Jeff Moyer
2012-01-25 0:23 ` NeilBrown
2012-01-25 6:11 ` Andreas Dilger
2012-01-18 23:39 ` Dan Williams
2012-01-24 17:59 ` Martin K. Petersen
2012-01-24 19:48 ` Douglas Gilbert
2012-01-24 20:04 ` Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120123163024.GD28526@quack.suse.cz \
--to=jack@suse.cz \
--cc=aarcange@redhat.com \
--cc=bharrosh@panasas.com \
--cc=djwong@us.ibm.com \
--cc=dm-devel@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=lsf-pc@lists.linux-foundation.org \
--cc=neilb@suse.de \
--cc=snitzer@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).