From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Andrew Morton <akpm@osdl.org>
Cc: wfg@mail.ustc.edu.cn, linux-kernel@vger.kernel.org,
christoph@lameter.com, riel@redhat.com, a.p.zijlstra@chello.nl,
npiggin@suse.de, andrea@suse.de, magnus.damm@gmail.com
Subject: Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging
Date: Fri, 2 Dec 2005 22:26:14 -0200
Message-ID: <20051203002614.GA3140@dmt.cnet>
In-Reply-To: <20051202133917.1ebbe851.akpm@osdl.org>
On Fri, Dec 02, 2005 at 01:39:17PM -0800, Andrew Morton wrote:
> Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
> >
> >
> > It all makes sense to me (Wu's description of the problem and your patch),
> > but still no good with reference to fair scanning.
>
> Not so. On a 4G x86 box doing a simple 8GB write this patch took the
> highmem/normal scanning ratio from 0.7 to 3.5. On that setup the highmem
> zone has 3.6x as many pages as the normal zone, so it's bang-on-target.
Humpf! What are the pgalloc dma/normal/highmem numbers under that test?
Does this machine need bounce buffers for disk I/O?
> There's not a lot of point in jumping straight into the complex stresstests
> without having first tested the simple stuff.
It's not really a complex stress test, though yours is simpler. There are 10
threads operating on 20 files. You can reproduce the load using the
following FFSB profile (I remake the filesystem each time; results are
pretty stable):
num_filesystems=1
num_threadgroups=1
directio=0
time=300
[filesystem0]
location=/mnt/hda4/
num_files=20
num_dirs=10
max_filesize=91534338
min_filesize=65535
[end0]
[threadgroup0]
num_threads=10
write_size=2816
write_blocksize=4096
read_size=2816
read_blocksize=4096
create_weight=100
write_weight=30
read_weight=100
[end0]
> > Moreover the patch hurts
> > interactivity _badly_, not sure why (ssh into the box with FFSB testcase
> > takes more than one minute to log in, while vanilla takes a few dozen seconds).
>
> Well, we know that the revert reintroduces an overscanning problem.
Can you remember the testcase for which you added the "truncate reclaim"
logic more precisely?
> How are you invoking FFSB? Exactly? On what sort of machine, with how
> much memory?
It's a single-processor 1GHz+ Pentium-3 booted with mem=128M, with a 4-5 year old IDE disk.
> > Follows an interesting part of "diff -u 2614-vanilla.vmstat 2614-akpm.vmstat"
> > (they were not retrieved at the exact same point in the benchmark run, but
> > that should not matter much):
> >
> > -slabs_scanned 37632
> > -kswapd_steal 731859
> > -kswapd_inodesteal 1363
> > -pageoutrun 26573
> > -allocstall 636
> > -pgrotated 1898
> > +slabs_scanned 2688
> > +kswapd_steal 502946
> > +kswapd_inodesteal 1
> > +pageoutrun 10612
> > +allocstall 90
> > +pgrotated 68
> >
> > Note how direct reclaim (and slabs_scanned) are hugely affected.
>
> hm. allocstall is much lower and pgrotated has improved and direct reclaim
> has improved. All of which would indicate that kswapd is doing more work.
> Yet kswapd reclaimed fewer pages. It's hard to say what's going on as these
> numbers came from different stages of the test.
I have a feeling they came from roughly equivalent stages (FFSB is a cyclic
test; there are not really any "phases" after the initial creation of files).
Feel free to reproduce the testcase, you simply need the FFSB profile
above and mem=128M.
The zone scanning balance seems very fragile in general (Wu's patches attempt
to address that): you tweak it here and watch it go nuts there.
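For anyone comparing the two vmstat dumps quoted above, a small script makes the per-counter change easier to read. This is only a sketch: the values are copied from the quoted diff, and the names VANILLA, AKPM and relative_change are made up for illustration.

```python
# Counters copied from the quoted
# "diff -u 2614-vanilla.vmstat 2614-akpm.vmstat" excerpt.
VANILLA = {
    "slabs_scanned": 37632,
    "kswapd_steal": 731859,
    "kswapd_inodesteal": 1363,
    "pageoutrun": 26573,
    "allocstall": 636,
    "pgrotated": 1898,
}
AKPM = {
    "slabs_scanned": 2688,
    "kswapd_steal": 502946,
    "kswapd_inodesteal": 1,
    "pageoutrun": 10612,
    "allocstall": 90,
    "pgrotated": 68,
}


def relative_change(before, after):
    """Return {counter: after / before} for counters present in both dumps."""
    return {
        name: after[name] / before[name]
        for name in before
        if name in after and before[name] != 0
    }


if __name__ == "__main__":
    for name, factor in relative_change(VANILLA, AKPM).items():
        print(f"{name:20s} {VANILLA[name]:>8d} -> {AKPM[name]:>8d}  (x{factor:.3f})")
```

Since the two dumps were not taken at exactly the same point in the run, the factors are only a rough guide.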
> > Normal: 114688kB
> > DMA: 16384kB
> >
> > Normal/DMA ratio = 114688 / 16384 = 7.000
> >
> > pgscan_kswapd Normal/DMA = (450483 / 88869) = 5.069
> > pgscan_direct Normal/DMA = (23826 / 4224) = 5.640
> > pgscan Normal/DMA = (474309 / 93093) = 5.095
> > pgscan_kswapd Normal/DMA = (441936 / 80520) = 5.488
> > pgscan_direct Normal/DMA = (7392/1188) = 6.222
> > pgscan Normal/DMA = (449328 / 81708) = 5.499
> > pgalloc_normal_dma_ratio = (559994 / 8488) = 6.597
> > pgscan_kswapd Normal/DMA = (664883/82845) = 8.025
> > pgscan_direct Normal/DMA = (13485/1745) = 7.727
> > pgscan Normal/DMA = (678368 / 84590) = 8.019
> > pgalloc_normal_dma_ratio = (699927/66313) = 10.554
>
> All of these look close enough to me. 10-20% over- or under-scanning of
> the teeny DMA zone doesn't seem very important.
Hopefully yes. The lowmem_reserve[] logic is there to _avoid_ over-allocation
(over-scanning) of the DMA zone by GFP_NORMAL allocations, isn't it?
Note, there should be no DMA-limited hardware on this box (I'm using PIO for the
IDE disk). BTW, why do you need lowmem_reserve for the DMA zone if you don't
have 16MB-capped ISA devices on your system?
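The Normal/DMA arithmetic quoted above can be rechecked with a few lines of Python. This is a sketch, not a tool: the figures are the first set of pgscan numbers from the quote, the names ZONE_SIZE_KB, PGSCAN, normal_dma_ratio and total_scan are hypothetical, and the combined ratio here sums both reclaim paths for both zones.

```python
# Figures copied from the first set of numbers quoted above; sizes in kB.
ZONE_SIZE_KB = {"normal": 114688, "dma": 16384}
PGSCAN = {
    "kswapd": {"normal": 450483, "dma": 88869},
    "direct": {"normal": 23826, "dma": 4224},
}


def normal_dma_ratio(counts):
    """Ratio of the Normal value to the DMA value."""
    return counts["normal"] / counts["dma"]


def total_scan(pgscan):
    """Sum the kswapd and direct reclaim scan counts per zone."""
    return {
        zone: sum(path[zone] for path in pgscan.values())
        for zone in ("normal", "dma")
    }


if __name__ == "__main__":
    target = normal_dma_ratio(ZONE_SIZE_KB)
    print(f"zone size ratio (target): {target:.3f}")
    for path, counts in PGSCAN.items():
        print(f"pgscan_{path} Normal/DMA: {normal_dma_ratio(counts):.3f}")
    combined = normal_dma_ratio(total_scan(PGSCAN))
    print(f"pgscan Normal/DMA: {combined:.3f}")
    # Below 1.0 means Normal is scanned less, relative to its size, than DMA.
    print(f"scanned/target: {combined / target:.3f}")
```

A scan ratio below the size ratio means the DMA zone is being scanned proportionally harder than Normal.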
> Getting normal-vs-highmem right is more important.
>
> It's hard to say what effect the watermark thingies have on all of this.
> I'd suggest that you start out with much less complex tests and see if `echo
> 10000 10000 10000 > /proc/sys/vm/lowmem_reserve_ratio' changes anything.
> (I have that in my rc.local - the thing is a daft waste of memory).
>
> I'd be more concerned about the interactivity thing, although it sounds
> like the machine is so overloaded with this test that it'd be fairly
> pointless to try to tune that workload first. It's more important to tune
> the system for more typical heavy loads.
What made me notice it was the huge interactivity difference between
vanilla and your patch. Again, I'm not really sure about its root cause.
> Also, the choice of IO scheduler matters. Which are you using?
The default for 2.6.14. That's AS, right?
I'll see if I can do more tests next week.
Best wishes.