From: Wu Fengguang <fengguang.wu@intel.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andreas Mohr <andi@lisas.de>, Jens Axboe <axboe@kernel.dk>,
Minchan Kim <minchan.kim@gmail.com>,
Linux Memory Management List <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Rik van Riel <riel@redhat.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Subject: Re: 32GB SSD on USB1.1 P3/700 == ___HELL___ (2.6.34-rc3)
Date: Thu, 15 Apr 2010 13:19:11 +0800
Message-ID: <20100415051911.GA17110@localhost>
In-Reply-To: <20100415135031.D186.A69D9226@jp.fujitsu.com>
On Thu, Apr 15, 2010 at 12:55:30PM +0800, KOSAKI Motohiro wrote:
> > On Thu, Apr 15, 2010 at 12:32:50PM +0800, KOSAKI Motohiro wrote:
> > > > On Thu, Apr 15, 2010 at 11:31:52AM +0800, KOSAKI Motohiro wrote:
> > > > > > > Many applications (this one and below) are stuck in
> > > > > > > wait_on_page_writeback(). I guess this is why "heavy write to
> > > > > > > irrelevant partition stalls the whole system". They are stuck on page
> > > > > > > allocation. Your 512MB system memory is a bit tight, so reclaim
> > > > > > > pressure is a bit high, which triggers the wait-on-writeback logic.
> > > > > >
> > > > > > I wonder if this hacking patch may help.
> > > > > >
> > > > > > Creating a 300MB dirty file with dd produces a continuous
> > > > > > region of hard-to-reclaim pages in the LRU list. Priority can easily
> > > > > > go low when irrelevant applications' direct reclaim runs into these
> > > > > > regions.
> > > > >
> > > > > > Sorry, I'm confused now. Can you please explain in more detail?
> > > > > > Why did lumpy reclaim cause OOM? Lumpy reclaim might slow down
> > > > > > direct reclaim, but IIUC it does not cause OOM, because OOM
> > > > > > only occurs when priority-0 reclaim fails.
> > > >
> > > > No, I'm not talking about OOM. Nor lumpy reclaim.
> > > >
> > > > I mean that direct reclaim can get stuck for a long time when we do
> > > > wait_on_page_writeback() with lumpy_reclaim=1.
> > > >
> > > > > IO getting stuck also prevents priority from reaching 0.
> > > >
> > > > Sure. But we can wait for IO a bit later -- after scanning 1/64 LRU
> > > > (the below patch) instead of the current 1/1024.
> > > >
> > > > In Andreas' case, 512MB/1024 = 512KB, which is way too low compared to
> > > > the 22MB of writeback pages. There can easily be a continuous range of
> > > > 512KB dirty/writeback pages in the LRU, which will trigger the wait
> > > > logic.
> > >
> > > From your explanation, I feel we need an auto-adjustment mechanism
> > > instead of changing the default value for a special machine, no?
> >
> > You mean the dumb DEF_PRIORITY/2 may be too large for a 1TB memory box?
> >
> > However, for such boxes, whether it is DEF_PRIORITY-2 or DEF_PRIORITY/2
> > should be irrelevant: it's trivial anyway to reclaim an order-1 or
> > order-2 page. In other words, lumpy_reclaim will hardly ever be set
> > to 1. Do you think so?
>
> If I remember correctly, order-1 lumpy reclaim was introduced
> to solve kernel stack (order-1 page) allocation failures seen on
> such big boxes under the AIM7 workload.
>
> Now, we are living under Moore's law, so we probably always need to pay
> attention to scalability. Today's big box is going to become a desktop
> box in 3-5 years.
>
> Lee probably knows this problem better than I do. Cc'ing him.

In Andreas' trace, the processes are blocked in

- do_fork: console-kit-d
- __alloc_skb: x-terminal-em, konqueror
- handle_mm_fault: tclsh
- filemap_fault: ls

I'm a bit confused by the last one, and wonder what the typical
gfp order of __alloc_skb() is.
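
As a rough illustration of the 1/1024 versus 1/64 figures quoted above, the
sketch below computes how much of the LRU has been scanned at each threshold.
It assumes DEF_PRIORITY is 12 (consistent with the numbers in this thread) and
that one scan pass covers "LRU size >> priority"; it also treats the whole
512MB as a single LRU list, which is a simplification. The function name
scan_window_bytes is hypothetical and not a kernel symbol.

```python
# Sketch only: models the arithmetic from the thread, not the kernel code.
DEF_PRIORITY = 12  # assumed value, matches the 1/1024 and 1/64 figures


def scan_window_bytes(lru_bytes: int, priority: int) -> int:
    """Bytes of LRU covered by one scan pass at the given priority."""
    return lru_bytes >> priority


def main() -> None:
    lru_bytes = 512 * 1024 * 1024  # Andreas' 512MB box, treated as one LRU
    thresholds = (
        ("DEF_PRIORITY - 2 (current)", DEF_PRIORITY - 2),
        ("DEF_PRIORITY / 2 (proposed)", DEF_PRIORITY // 2),
    )
    for label, priority in thresholds:
        window_kb = scan_window_bytes(lru_bytes, priority) // 1024
        print(f"{label}: 1/{1 << priority} of LRU = {window_kb} KB")


if __name__ == "__main__":
    main()
```

Under these assumptions the current threshold corresponds to a 512KB window
and the proposed one to 8MB, which is still well below the 22MB of writeback
pages reported, but much harder to fill with a single continuous dirty range.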
Thanks,
Fengguang