From: Andrew Morton <akpm@linux-foundation.org>
To: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>,
linux-mm@kvack.org, Nick Piggin <npiggin@suse.de>,
Chris Mason <chris.mason@oracle.com>,
Jens Axboe <jens.axboe@oracle.com>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/3] Avoid the use of congestion_wait under zone pressure
Date: Fri, 12 Mar 2010 02:05:26 -0500
Message-ID: <20100312020526.d424f2a8.akpm@linux-foundation.org>
In-Reply-To: <4B99E19E.6070301@linux.vnet.ibm.com>

On Fri, 12 Mar 2010 07:39:26 +0100 Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> wrote:
>
>
> Andrew Morton wrote:
> > On Mon, 8 Mar 2010 11:48:20 +0000
> > Mel Gorman <mel@csn.ul.ie> wrote:
> >
> >> Under memory pressure, the page allocator and kswapd can go to sleep using
> >> congestion_wait(). In two of these cases, it may not be the appropriate
> >> action as congestion may not be the problem.
> >
> > clear_bdi_congested() is called each time a write completes and the
> > queue is below the congestion threshold.
> >
> > So if the page allocator or kswapd call congestion_wait() against a
> > non-congested queue, they'll wake up on the very next write completion.
>
> Well the issue came up in all kinds of loads where you don't have any
> writes at all that can wake up congestion_wait.
> That's true for several benchmarks, but also for real workloads, e.g. a
> backup job reading almost all files sequentially and pumping out stuff
> via network.

Why is reclaim going into congestion_wait() at all if there's heaps of
clean reclaimable pagecache lying around?

(I don't think the read side of the congestion_wqh[] has ever been used, btw)

> > Hence the above-quoted claim seems to me to be a significant mis-analysis and
> > perhaps explains why the patchset didn't seem to help anything?
>
> While I might have misunderstood you and it is a mis-analysis in your
> opinion, it fixes an 80% throughput regression on sequential read
> workloads, that's not nothing - it's more like absolutely required :-)
>
> You might check out the discussion with the subject "Performance
> regression in scsi sequential throughput (iozone) due to "e084b -
> page-allocator: preserve PFN ordering when __GFP_COLD is set"".
> While the original subject is misleading from today's point of view, it
> contains a lengthy discussion about exactly when/why/where time is lost
> due to congestion wait with a lot of traces, counters, data attachments
> and such stuff.

Well if we're not encountering lots of dirty pages in reclaim then we
shouldn't be waiting for writes to retire, of course.

But if we're not encountering lots of dirty pages in reclaim, we should
be reclaiming pages, normally.

I could understand reclaim accidentally going into congestion_wait() if
it hit a large pile of pages which are unreclaimable for reasons other
than being dirty, but is that happening in this case?

If not, we broke it again.