From: Jan Kara <jack@suse.cz>
To: Nai Xia <nai.xia@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Mel Gorman <mgorman@suse.de>,
Shaohua Li <shaohua.li@intel.com>, Linux-MM <linux-mm@kvack.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Minchan Kim <minchan.kim@gmail.com>,
Andy Isaacson <adi@hexapodia.org>,
Johannes Weiner <jweiner@redhat.com>,
Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 7/7] mm: compaction: Introduce sync-light migration for use by compaction
Date: Tue, 22 Nov 2011 20:13:02 +0100
Message-ID: <20111122191302.GF8058@quack.suse.cz>
In-Reply-To: <201111222159.24987.nai.xia@gmail.com>
On Tue 22-11-11 21:59:24, Nai Xia wrote:
> On Tuesday 22 November 2011 19:54:27 Jan Kara wrote:
> > On Tue 22-11-11 10:14:51, Mel Gorman wrote:
> > > On Tue, Nov 22, 2011 at 02:56:51PM +0800, Shaohua Li wrote:
> > > > On Tue, 2011-11-22 at 02:36 +0800, Mel Gorman wrote:
> > > > on the other hand, MIGRATE_SYNC_LIGHT now waits for pagelock and buffer
> > > > lock, so could wait on page read. page read and page out have the same
> > > > latency, so why treat them differently?
> > > >
> > >
> > > That's a very reasonable question.
> > >
> > > To date, the stalls that were reported to be a problem were related to
> > > heavy writing workloads. Workloads are naturally throttled on reads
> > > but not necessarily on writes and the IO scheduler prioritises sync
> > > reads over writes which contributes to keeping stalls due to page
> > > reads low. In my own tests, there have been no significant stalls
> > > due to waiting on page reads. I accept this may be because the stall
> > > threshold I record is too low.
> > >
> > > Still, I double checked an old USB copy based test to see what the
> > > compaction-related stalls really were.
> > >
> > > 58 seconds waiting on PageWriteback
> > > 22 seconds waiting on generic_make_request calling ->writepage
> > >
> > > These are total times, each stall was about 2-5 seconds and very rough
> > > estimates. There were no other sources of stalls that had compaction
> > > in the stacktrace. I'm rerunning to gather more accurate stall times
> > > and for a workload similar to Andrea's and will see if page reads
> > > crop up as a major source of stalls.
> > OK, but the fact that reads do not stall may pretty much depend on the
> > behavior of the underlying IO scheduler and we probably don't want to rely
> > on its behavior too closely. So if you are going to treat reads in a
> > special way, check with NOOP or DEADLINE io schedulers that read-stalls
> > are not a problem with them as well.
>
> Compared to the IO scheduler, I actually expect this behavior is more related
> to these two facts:
>
> 1) Due to the IO direction, most pages to be read are still on disk,
> while most pages to be written are in memory.
>
> 2) And as Mel explained, reads tend to be sync and writes tend to be async,
> so decent IO schedulers, however they differ from each other,
> should broadly agree on favoring reads over writes.
This is not true. CFQ heavily prefers read IO over write IO. The deadline
scheduler slightly prefers reads and the noop IO scheduler has no preference.
As a result, a page which is read from disk is, on average, going to be
locked for a shorter time with the CFQ scheduler than with the NOOP scheduler.
> So that amounts to the following calculation that is important to the
> statistical stall time for the compaction:
>
> page_nr * average_stall_window_time
>
> where average_stall_window_time is the window for a page between
> NotUptoDate ---> UptoDate or Dirty --> Clean. And page_nr is the
> number of pages in stall window for read or write.
>
> So for general cases,
> Fact 1) may ensure that the page_nr is smaller for read, while
> fact 2) may ensure the same for average_stall_window_time.
Well, page_nr really depends on the load. If the workload is only reads,
clearly the number of read pages is going to be higher than the number of
written pages. Once the workload does heavy writing, I agree the number of
pages under writeback is likely going to be higher.
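
As a rough illustration of the quoted formula, page_nr *
average_stall_window_time, the sketch below compares two workloads. Every
number in it is hypothetical and chosen only to show how the estimate shifts
with the load; none of it is measured data, and the names expected_stall,
WORKLOADS and stall_by_direction are made up for this example.

```python
def expected_stall(page_nr, average_stall_window_time):
    """Estimated total stall: pages in the stall window times the mean window."""
    return page_nr * average_stall_window_time


# Hypothetical workloads: direction -> (page_nr, average window in milliseconds).
# These values are assumptions for illustration, not measurements.
WORKLOADS = {
    "read-only": {"read": (2000, 5), "write": (0, 0)},
    "heavy-write": {"read": (200, 5), "write": (5000, 20)},
}


def stall_by_direction(workload):
    """Return the estimated stall in milliseconds for reads and for writes."""
    return {
        direction: expected_stall(page_nr, window_ms)
        for direction, (page_nr, window_ms) in WORKLOADS[workload].items()
    }


if __name__ == "__main__":
    for name in WORKLOADS:
        print(name, stall_by_direction(name))
```

With these assumed inputs the read side dominates in the read-only case and
the write side dominates once heavy writing starts, which is the dependence
on load described above.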
> I am not sure this will be the same case for all workloads, and I
> don't know if Mel has tested large readahead workloads, which
> have more async read IOs and fewer writebacks.
>
> But theoretically I expect things are not that bad even for large
> readahead, because readahead is triggered by the readahead TAG in
> linear order, which means that for a process generating readahead IO,
> its speed is still somewhat governed by the read IO speed. A process
> writing to a file-mapped memory area, on the other hand, may well
> exceed the writing speed of its backing store.
>
>
> Aside from that, I think the relation between page locking and
> page read is not 1-to-1. In other words, there may be quite a lot of
> transient page locking caused by mmap and then page faults on
> already good-state pages requiring no IO at all. For these
> transient page lockings I think it's reasonable to have light
> waiting.
Definitely there are reasons for locking a page other than reads. E.g. to
write a page, we lock it first, submit IO (which can actually block waiting
for a request to get freed), set PageWriteback, and unlock the page. And
there are more transient ones like you mention above...
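
The write sequence above can be sketched as a toy state model. This is only
an illustration and not kernel code: the Page class, write_page_states() and
would_wait() are hypothetical names, and the WAITS_ON table is an assumption
(async waits on nothing, sync-light waits on the page lock only, full sync
waits on the page lock and on writeback).

```python
from dataclasses import dataclass, replace


@dataclass
class Page:
    locked: bool = False
    writeback: bool = False


def write_page_states():
    """Yield (step, page snapshot) for the write sequence described above."""
    page = Page()
    page.locked = True
    yield "lock page", replace(page)
    # Submitting IO may itself block waiting for a request to get freed.
    yield "submit IO", replace(page)
    page.writeback = True
    yield "set PageWriteback", replace(page)
    page.locked = False
    yield "unlock page", replace(page)


# Assumed policy: which page flags each migration mode is willing to wait on.
WAITS_ON = {
    "MIGRATE_ASYNC": set(),
    "MIGRATE_SYNC_LIGHT": {"locked"},
    "MIGRATE_SYNC": {"locked", "writeback"},
}


def would_wait(mode, page):
    """True if a migrator in this mode would block on the page's current state."""
    return any(getattr(page, flag) for flag in WAITS_ON[mode])


if __name__ == "__main__":
    for step, page in write_page_states():
        waiting = [mode for mode in WAITS_ON if would_wait(mode, page)]
        print(f"{step}: would wait -> {waiting}")
```

In this model the page lock is held only across the short submission window,
while the writeback flag stays set after the unlock, so under the assumed
policy only the full sync mode would still wait at the final step.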
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR