From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
miklos@szeredi.hu, neilb@suse.de, dgc@sgi.com,
tomoki.sekiyama.qu@hitachi.com, nikita@clusterfs.com
Subject: Re: [PATCH 09/12] mm: remove throttle_vm_writeback
Date: Wed, 26 Sep 2007 22:42:30 +0200
Message-ID: <1190839350.18147.28.camel@lappy>
In-Reply-To: <20070405154440.0f42fa9f.akpm@linux-foundation.org>
On Thu, 2007-04-05 at 15:44 -0700, Andrew Morton wrote:
> On Thu, 05 Apr 2007 19:42:18 +0200
> root@programming.kicks-ass.net wrote:
>
> > rely on accurate dirty page accounting to provide enough push back
>
> I think we'd like to see a bit more justification than that, please.
it should read like this:

	for ( ; ; ) {
		get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);

		/*
		 * Boost the allowable dirty threshold a bit for page
		 * allocators so they don't get DoS'ed by heavy writers
		 */
		dirty_thresh += dirty_thresh / 10;	/* wheeee... */

		if (global_page_state(NR_FILE_DIRTY) +
		    global_page_state(NR_UNSTABLE_NFS) +
		    global_page_state(NR_WRITEBACK) <= dirty_thresh)
			break;
		congestion_wait(WRITE, HZ/10);
	}

[ note the extra NR_FILE_DIRTY ]
now, balance_dirty_pages() is there to ensure:

	nr_dirty + nr_unstable + nr_writeback < dirty_thresh	(1)

reclaim will (with the introduction of dirty page tracking) never
generate dirty pages, so the only disturbance of that equation is an
increase in nr_writeback.
[ pageout() sets wbc.for_reclaim=1, so NFS traffic will not generate
unstable pages ]
So, what throttle_vm_writeout() does is limit the number of added
writeback pages to 10% of the total limit.
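
The exit test of the loop can be sketched in userspace C as below. This is only an illustration under stated assumptions, not kernel code: struct vm_counters and may_proceed() are hypothetical names, and the counters are plain integers standing in for the global_page_state() reads.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the three global counters, in pages. */
struct vm_counters {
	unsigned long nr_dirty;
	unsigned long nr_unstable;
	unsigned long nr_writeback;
};

/*
 * Same shape as the loop's exit test: the caller may proceed while
 * the sum of the counters stays within 110% of dirty_thresh.
 */
static bool may_proceed(const struct vm_counters *c, unsigned long dirty_thresh)
{
	unsigned long boosted = dirty_thresh + dirty_thresh / 10;

	return c->nr_dirty + c->nr_unstable + c->nr_writeback <= boosted;
}
```

With dirty_thresh = 1500 the boosted limit is 1650, so a snapshot of 1000 dirty plus 400 writeback pages passes, while 1000 dirty plus 700 writeback pages would have to wait.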
pageout() seems to avoid stuffing pages down a congested bdi
(TODO: has details). Along with the much smaller io-queues, that seems
to take care of the initial purpose of this function, which was to
avoid all memory getting stuck in io-queues.
Now the problems...
Trouble is that it currently does not take nr_dirty into account which
in the worst case limits it to 110% of the limit.
Also, I'm seeing (2.6.23-rc8-mm1) live-locks in throttle_vm_writeout()
where nr_dirty + nr_unstable > thresh - which according to (1) should
not happen, and will not change without explicit action.
Hmm, maybe the 10% is < nr_cpus * ratelimit_pages.

	2 cpus, mem=128M -> ratelimit_pages ~ 512
	                    threshold ~ 1500

so indeed: 150 < 1024.
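
The estimate above can be written out as a small C sketch. The helper names are hypothetical, and it assumes that each CPU can dirty up to ratelimit_pages pages before the limit is rechecked, which is the reading the numbers suggest rather than something verified here.

```c
#include <stdbool.h>

/* 10% of the dirty threshold: the extra headroom the loop grants. */
static unsigned long boost_headroom(unsigned long dirty_thresh)
{
	return dirty_thresh / 10;
}

/* Assumed worst-case overshoot: one full ratelimit batch per CPU. */
static unsigned long percpu_slack(unsigned long nr_cpus,
				  unsigned long ratelimit_pages)
{
	return nr_cpus * ratelimit_pages;
}

/* True when the per-CPU slack can exceed the 10% headroom. */
static bool headroom_too_small(unsigned long dirty_thresh,
			       unsigned long nr_cpus,
			       unsigned long ratelimit_pages)
{
	return boost_headroom(dirty_thresh) <
	       percpu_slack(nr_cpus, ratelimit_pages);
}
```

For the machine above, boost_headroom(1500) is 150 and percpu_slack(2, 512) is 1024, so headroom_too_small(1500, 2, 512) holds.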
Still not conclusive but at least getting somewhere.