From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Jens Axboe <jens.axboe@oracle.com>,
akpm@linux-foundation.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Ingo Molnar <mingo@elte.hu>,
thomas.pi@arcor.dea, Yuriy Lalym <ylalym@gmail.com>,
ltt-dev@lists.casi.polymtl.ca, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH] mm fix page writeback accounting to fix oom condition under heavy I/O
Date: Tue, 10 Feb 2009 01:12:27 -0500
Message-ID: <20090210061226.GA1918@Krystal>
In-Reply-To: <alpine.LFD.2.00.0902092120450.3048@localhost.localdomain>
* Linus Torvalds (torvalds@linux-foundation.org) wrote:
>
>
> On Mon, 9 Feb 2009, Mathieu Desnoyers wrote:
> >
> > So this patch fixes this behavior by only decrementing the page accounting
> > _after_ the block I/O writepage has been done.
>
> This makes no sense, really.
>
> Or rather, I don't mind the notion of updating the counters only after IO
> per se, and _that_ part of it probably makes sense. But why is it that you
> only then fix up two of the call-sites. There's a lot more call-sites than
> that for this function.
>
> So if this really makes a big difference, that's an interesting starting
> point for discussion, but I don't see how this particular patch could
> possibly be the right thing to do.
>
Yes, you are right. Looking in more detail at /proc/meminfo under the
workload, I notice this:
MemTotal: 16028812 kB
MemFree: 13651440 kB
Buffers: 8944 kB
Cached: 2209456 kB <--- increments up to ~16GB
cached = global_page_state(NR_FILE_PAGES) -
total_swapcache_pages - i.bufferram;
SwapCached: 0 kB
Active: 34668 kB
Inactive: 2200668 kB <--- also
K(pages[LRU_INACTIVE_ANON] + pages[LRU_INACTIVE_FILE]),
Active(anon): 17136 kB
Inactive(anon): 0 kB
Active(file): 17532 kB
Inactive(file): 2200668 kB <--- also
K(pages[LRU_INACTIVE_FILE]),
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 19535024 kB
SwapFree: 19535024 kB
Dirty: 1159036 kB
Writeback: 0 kB <--- stays close to 0
AnonPages: 17060 kB
Mapped: 9476 kB
Slab: 96188 kB
SReclaimable: 79776 kB
SUnreclaim: 16412 kB
PageTables: 3364 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 27549428 kB
Committed_AS: 54292 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 9960 kB
VmallocChunk: 34359727667 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 7552 kB
DirectMap2M: 16769024 kB
So I think simply subtracting K(pages[LRU_INACTIVE_FILE]) from
avail_dirty in clip_bdi_dirty_limit(), and taking it into account in
balance_dirty_pages() and throttle_vm_writeout(), would probably make my
problem go away. But I would like to understand exactly why this is
needed, and whether there are other types of page counts that have been
forgotten and that I would also need to consider.
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68