* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 19:05 ` [Lsf] " Andrea Arcangeli
@ 2011-03-29 20:35 ` Ying Han
2011-03-29 20:39 ` Ying Han
2011-03-29 20:45 ` Andrea Arcangeli
2011-03-29 21:22 ` Rik van Riel
2011-03-29 22:13 ` Minchan Kim
2 siblings, 2 replies; 17+ messages in thread
From: Ying Han @ 2011-03-29 20:35 UTC (permalink / raw)
To: Andrea Arcangeli, Rik van Riel; +Cc: lsf, linux-mm, Hugh Dickins
On Tue, Mar 29, 2011 at 12:05 PM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> Hi Rik, Hugh and everyone,
>
> On Tue, Mar 29, 2011 at 11:35:09AM -0400, Rik van Riel wrote:
>> On 03/29/2011 12:36 AM, James Bottomley wrote:
>> > Hi All,
>> >
>> > Since LSF is less than a week away, the programme committee put together
>> > a just in time preliminary agenda for LSF. As you can see there is
>> > still plenty of empty space, which you can make suggestions
>>
>> There have been a few patches posted upstream by people for whom
>> page allocation latency is a concern.
>>
>> It may be worthwhile to have a short discussion on what
>> we can do to keep page allocation (and direct reclaim?)
>> latencies down to a minimum, reducing the slowdown that
>> direct reclaim introduces on some workloads.
>
> I don't see the patches you refer to, but checking the schedule we have
> a slot with Mel & Minchan about "Reclaim, compaction and LRU
> ordering". Compaction only applies to high-order allocations and it
> changes nothing for PAGE_SIZE allocations, but it surely has lower
> latency than the older lumpy reclaim logic, so overall it should be a
> net improvement over what we had before.
>
> Should the latency issues be discussed in that track?
>
> The MM schedule still has a free slot at 14:00-14:30 on Monday. I wonder
> if there's interest in a "NUMA automatic migration and scheduling
> awareness" topic, or if it's still too much vapourware for a real topic
> and we should keep it for off-track discussions, and maybe we should
> reserve it for something more tangible with patches already floating
> around. Comments welcome.
In page reclaim, I would like to discuss the magic "8" *
high_wmark() in balance_pgdat(). I recently found the discussion in the
thread "too big min_free_kbytes", but I could not find where we proved
whether it is still a problem or not. This might not need a reserved
time slot, but it is something I want to learn more about.
--Ying
>
> Thanks,
> Andrea
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: dont@kvack.org
>
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 20:35 ` Ying Han
@ 2011-03-29 20:39 ` Ying Han
2011-03-29 20:45 ` Andrea Arcangeli
1 sibling, 0 replies; 17+ messages in thread
From: Ying Han @ 2011-03-29 20:39 UTC (permalink / raw)
To: Andrea Arcangeli, Rik van Riel; +Cc: lsf, linux-mm, Hugh Dickins
On Tue, Mar 29, 2011 at 1:35 PM, Ying Han <yinghan@google.com> wrote:
> On Tue, Mar 29, 2011 at 12:05 PM, Andrea Arcangeli <aarcange@redhat.com> wrote:
>> Hi Rik, Hugh and everyone,
>>
>> On Tue, Mar 29, 2011 at 11:35:09AM -0400, Rik van Riel wrote:
>>> On 03/29/2011 12:36 AM, James Bottomley wrote:
>>> > Hi All,
>>> >
>>> > Since LSF is less than a week away, the programme committee put together
>>> > a just in time preliminary agenda for LSF. As you can see there is
>>> > still plenty of empty space, which you can make suggestions
>>>
>>> There have been a few patches posted upstream by people for whom
>>> page allocation latency is a concern.
>>>
>>> It may be worthwhile to have a short discussion on what
>>> we can do to keep page allocation (and direct reclaim?)
>>> latencies down to a minimum, reducing the slowdown that
>>> direct reclaim introduces on some workloads.
>>
>> I don't see the patches you refer to, but checking the schedule we have
>> a slot with Mel & Minchan about "Reclaim, compaction and LRU
>> ordering". Compaction only applies to high-order allocations and it
>> changes nothing for PAGE_SIZE allocations, but it surely has lower
>> latency than the older lumpy reclaim logic, so overall it should be a
>> net improvement over what we had before.
>>
>> Should the latency issues be discussed in that track?
>>
>> The MM schedule still has a free slot at 14:00-14:30 on Monday. I wonder
>> if there's interest in a "NUMA automatic migration and scheduling
>> awareness" topic, or if it's still too much vapourware for a real topic
>> and we should keep it for off-track discussions, and maybe we should
>> reserve it for something more tangible with patches already floating
>> around. Comments welcome.
>
>
> In page reclaim, I would like to discuss the magic "8" *
> high_wmark() in balance_pgdat(). I recently found the discussion in the
> thread "too big min_free_kbytes", but I could not find where we proved
> whether it is still a problem or not. This might not need a reserved
> time slot, but it is something I want to learn more about.
Well, I forgot to mention: I also noticed that this has been changed in
mmotm by a "balance_gap". In general, I would like to understand why we
cannot stick with high_wmark for kswapd regardless of zones.
Thanks
--Ying
>
> --Ying
>
>
>>
>> Thanks,
>> Andrea
>>
>
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 20:35 ` Ying Han
2011-03-29 20:39 ` Ying Han
@ 2011-03-29 20:45 ` Andrea Arcangeli
2011-03-29 20:53 ` Ying Han
1 sibling, 1 reply; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-29 20:45 UTC (permalink / raw)
To: Ying Han; +Cc: Rik van Riel, lsf, linux-mm, Hugh Dickins
On Tue, Mar 29, 2011 at 01:35:24PM -0700, Ying Han wrote:
> In page reclaim, I would like to discuss the magic "8" *
> high_wmark() in balance_pgdat(). I recently found the discussion in the
> thread "too big min_free_kbytes", but I could not find where we proved
> whether it is still a problem or not. This might not need a reserved
> time slot, but it is something I want to learn more about.
That was merged in 2.6.39-rc1, and it's hopefully working well enough.
We still use high+balance_gap, but the balance_gap isn't high*8 anymore.
I still think the balance_gap might as well be zero, but the gap is now
small enough (not 600M on a 4G machine anymore) that it's OK, and this
was the safer change.
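[Editor's note: to make the numbers concrete, here is a rough sketch in
plain C of the old target versus the new high_wmark + balance_gap
target. It is illustrative only, not verbatim kernel code; the ~1%
ratio and the low_wmark cap reflect my reading of the merged patch.]

```c
#include <assert.h>

/*
 * Illustrative sketch, not kernel code.  The old balance_pgdat()
 * target was roughly 8 * high_wmark per zone; the merged change
 * instead adds a small balance_gap: about 1% of the zone's pages
 * (KSWAPD_ZONE_BALANCE_GAP_RATIO == 100), capped at low_wmark.
 */
static unsigned long balance_gap(unsigned long present_pages,
                                 unsigned long low_wmark_pages)
{
        unsigned long gap = (present_pages + 100 - 1) / 100; /* ~1% */
        return gap < low_wmark_pages ? gap : low_wmark_pages;
}

/* Free pages kswapd now aims for before considering a zone balanced. */
static unsigned long kswapd_target(unsigned long high_wmark_pages,
                                   unsigned long present_pages,
                                   unsigned long low_wmark_pages)
{
        return high_wmark_pages +
               balance_gap(present_pages, low_wmark_pages);
}
```

For a zone of about a million 4K pages with a high watermark of a few
thousand pages, the old 8x multiplier demanded tens of thousands of
free pages per zone, while the new gap is a small fraction of that,
which is why the kept-free memory shrank so dramatically.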
This is an LRU ordering issue: we try to keep the LRU balanced across
the zones rather than heavily rotating a single one. I think it can be
covered in the LRU ordering topic too, but we could also expand it into
a different slot if we expect too many issues to show up in that
slot... Hugh, what's your opinion?
The subtopics that come to mind for that topic so far would be:
- reclaim latency
- compaction issues (Mel)
- lru ordering altered by compaction/migrate/khugepaged or other
features requiring lru page isolation (Minchan)
- lru rotation balance across zones in kswapd (balance_gap) (Ying)
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 20:45 ` Andrea Arcangeli
@ 2011-03-29 20:53 ` Ying Han
0 siblings, 0 replies; 17+ messages in thread
From: Ying Han @ 2011-03-29 20:53 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Rik van Riel, lsf, linux-mm, Hugh Dickins
On Tue, Mar 29, 2011 at 1:45 PM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Tue, Mar 29, 2011 at 01:35:24PM -0700, Ying Han wrote:
>> In page reclaim, I would like to discuss the magic "8" *
>> high_wmark() in balance_pgdat(). I recently found the discussion in the
>> thread "too big min_free_kbytes", but I could not find where we proved
>> whether it is still a problem or not. This might not need a reserved
>> time slot, but it is something I want to learn more about.
>
> That was merged in 2.6.39-rc1, and it's hopefully working well enough.
> We still use high+balance_gap, but the balance_gap isn't high*8 anymore.
> I still think the balance_gap might as well be zero, but the gap is now
> small enough (not 600M on a 4G machine anymore) that it's OK, and this
> was the safer change.
>
> This is an LRU ordering issue: we try to keep the LRU balanced across
> the zones rather than heavily rotating a single one. I think it can be
> covered in the LRU ordering topic too, but we could also expand it into
> a different slot if we expect too many issues to show up in that
> slot... Hugh, what's your opinion?
Yes, that is what I got from the thread discussion; thank you for
confirming it. I guess my questions are:
Do we need to balance across zones, given that each zone already does
its own balancing?
What problem did we see without cross-zone balancing?
I don't have data to back up either answer, and that is something I am
interested in too :)
--Ying
>
> The subtopics that come to mind for that topic so far would be:
>
> - reclaim latency
> - compaction issues (Mel)
> - lru ordering altered by compaction/migrate/khugepaged or other
> features requiring lru page isolation (Minchan)
> - lru rotation balance across zones in kswapd (balance_gap) (Ying)
>
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 19:05 ` [Lsf] " Andrea Arcangeli
2011-03-29 20:35 ` Ying Han
@ 2011-03-29 21:22 ` Rik van Riel
2011-03-29 22:38 ` Andrea Arcangeli
2011-03-29 22:13 ` Minchan Kim
2 siblings, 1 reply; 17+ messages in thread
From: Rik van Riel @ 2011-03-29 21:22 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: lsf, linux-mm, Hugh Dickins
On 03/29/2011 03:05 PM, Andrea Arcangeli wrote:
> Should the latency issues be discussed in that track?
Sounds good. I don't think we'll spend more than 5-10 minutes
on the latency thing, probably less than that.
> The MM schedule still has a free slot at 14:00-14:30 on Monday. I wonder
> if there's interest in a "NUMA automatic migration and scheduling
> awareness" topic, or if it's still too much vapourware for a real topic
> and we should keep it for off-track discussions,
I believe that problem is complex enough to warrant a 30
minute discussion. Even if we do not come up with solutions,
it would be a good start if we could all agree on the problem.
Things this complex often end up getting shot down later, not
because people do not agree on the solution, but because people
do not agree on the PROBLEM (and the patches in question only
solve a subset of the problem).
I would be willing to lead the NUMA scheduling and memory
allocation discussion.
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 21:22 ` Rik van Riel
@ 2011-03-29 22:38 ` Andrea Arcangeli
0 siblings, 0 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-29 22:38 UTC (permalink / raw)
To: Rik van Riel; +Cc: lsf, linux-mm, Hugh Dickins
On Tue, Mar 29, 2011 at 05:22:20PM -0400, Rik van Riel wrote:
> I believe that problem is complex enough to warrant a 30
> minute discussion. Even if we do not come up with solutions,
> it would be a good start if we could all agree on the problem.
>
> Things this complex often end up getting shot down later, not
> because people do not agree on the solution, but because people
> do not agree on the PROBLEM (and the patches in question only
> solve a subset of the problem).
>
> I would be willing to lead the NUMA scheduling and memory
> allocation discussion.
Well, for now I've added it to the schedule.
The problem, I think, is that without bindings and NUMA hinting the
current automatic behavior deviates significantly from tuned NUMA
binding performance, as also shown by the migrate-on-fault patches.
Right now THP pages can't even be migrated without being split first,
and migrating 2M on fault isn't optimal even after we teach migrate how
to move 2M pages without splitting [a separate issue]. Migrate-on-fault
looks like a great improvement to me, but it doesn't look like the most
optimal design we can have, since the page fault could be avoided with
background migration from a kernel thread, without requiring page faults.
Hugh, if you think some other topic is more urgent, feel free to
update the schedule. One other topic that comes to mind right now that
could be a good candidate for the floating slot is Hugh's OOM topic. I
think it'd be nice to somehow squeeze that into the schedule too, if
Hugh is interested in leading it.
Thanks,
Andrea
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 19:05 ` [Lsf] " Andrea Arcangeli
2011-03-29 20:35 ` Ying Han
2011-03-29 21:22 ` Rik van Riel
@ 2011-03-29 22:13 ` Minchan Kim
2011-03-29 23:12 ` Andrea Arcangeli
2011-03-30 16:17 ` Mel Gorman
2 siblings, 2 replies; 17+ messages in thread
From: Minchan Kim @ 2011-03-29 22:13 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Rik van Riel, lsf, linux-mm, Hugh Dickins, KAMEZAWA Hiroyuki,
KOSAKI Motohiro
On Wed, Mar 30, 2011 at 4:05 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> Hi Rik, Hugh and everyone,
>
> On Tue, Mar 29, 2011 at 11:35:09AM -0400, Rik van Riel wrote:
>> On 03/29/2011 12:36 AM, James Bottomley wrote:
>> > Hi All,
>> >
>> > Since LSF is less than a week away, the programme committee put together
>> > a just in time preliminary agenda for LSF. As you can see there is
>> > still plenty of empty space, which you can make suggestions
>>
>> There have been a few patches posted upstream by people for whom
>> page allocation latency is a concern.
>>
>> It may be worthwhile to have a short discussion on what
>> we can do to keep page allocation (and direct reclaim?)
>> latencies down to a minimum, reducing the slowdown that
>> direct reclaim introduces on some workloads.
>
> I don't see the patches you refer to, but checking the schedule we have
> a slot with Mel & Minchan about "Reclaim, compaction and LRU
> ordering". Compaction only applies to high-order allocations and it
> changes nothing for PAGE_SIZE allocations, but it surely has lower
> latency than the older lumpy reclaim logic, so overall it should be a
> net improvement over what we had before.
>
> Should the latency issues be discussed in that track?
It's okay with me. The LRU ordering issue wouldn't take much time,
but I am not sure Mel would have enough time. :)
Regarding reclaim latency, I sent a patch a while ago:
http://marc.info/?l=linux-mm&m=129187231129887&w=4
And some people on the embedded side are concerned about latency:
they would rather get an OOM than eviction of the working set and the
nondeterministic latency of reclaim.
As another latency-related issue, there is OOM.
To accelerate the victim task's exit, we raised the priority of the
victim process, but that had a problem, so Kosaki decided to revert
the patch. It is closely related to the latency issue.
In addition, Kame and I each sent a patch to prevent forkbombs. Kame's
approach is to track the history of the mm, and mine is to use sysrq
to kill recently created tasks. The approaches have pros and cons,
but nobody seems to be interested in forkbomb protection, so I would
like to hear others' opinions on whether we really need it.
I am not sure this could become a topic for LSF/MM.
If it is appropriate, I would like to discuss the above issues in the
"Reclaim, compaction and LRU ordering" slot.
--
Kind regards,
Minchan Kim
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 22:13 ` Minchan Kim
@ 2011-03-29 23:12 ` Andrea Arcangeli
2011-03-30 16:17 ` Mel Gorman
1 sibling, 0 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-29 23:12 UTC (permalink / raw)
To: Minchan Kim
Cc: Rik van Riel, lsf, linux-mm, Hugh Dickins, KAMEZAWA Hiroyuki,
KOSAKI Motohiro
On Wed, Mar 30, 2011 at 07:13:42AM +0900, Minchan Kim wrote:
> It's okay with me. The LRU ordering issue wouldn't take much time,
> but I am not sure Mel would have enough time. :)
>
> Regarding reclaim latency, I sent a patch a while ago:
> http://marc.info/?l=linux-mm&m=129187231129887&w=4
>
> And some people on the embedded side are concerned about latency:
> they would rather get an OOM than eviction of the working set and the
> nondeterministic latency of reclaim.
>
> As another latency-related issue, there is OOM.
> To accelerate the victim task's exit, we raised the priority of the
> victim process, but that had a problem, so Kosaki decided to revert
> the patch. It is closely related to the latency issue.
>
> In addition, Kame and I each sent a patch to prevent forkbombs. Kame's
> approach is to track the history of the mm, and mine is to use sysrq
> to kill recently created tasks. The approaches have pros and cons,
> but nobody seems to be interested in forkbomb protection, so I would
> like to hear others' opinions on whether we really need it.
The sysrq won't help on large servers, virtual clouds or Android
devices where there's no keyboard attached and no sysrqd running, so
I'd prefer not to require sysrq for that. Even where you can run a
sysrq, few people would know how to activate an obscure sysrq feature
meant to be more selective than SYSRQ+I (or SYSRQ+B..) for forkbombs;
if it works without human intervention I think it's more valuable.
> I am not sure this could become a topic for LSF/MM.
Now the forkbomb detection would fit nicely as a subtopic of Hugh's
OOM topic. I found one more spare MM slot on 5 April at 12:30-13:00, so
for now I've filled it with OOM and forkbomb.
> If it is appropriate, I would like to discuss the above issues in the
> "Reclaim, compaction and LRU ordering" slot.
Sounds good to me. I added "allocation latency" to your slot.
Thanks,
Andrea
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-29 22:13 ` Minchan Kim
2011-03-29 23:12 ` Andrea Arcangeli
@ 2011-03-30 16:17 ` Mel Gorman
2011-03-30 16:49 ` Andrea Arcangeli
2011-03-30 16:59 ` Dan Magenheimer
1 sibling, 2 replies; 17+ messages in thread
From: Mel Gorman @ 2011-03-30 16:17 UTC (permalink / raw)
To: Minchan Kim; +Cc: Andrea Arcangeli, lsf, linux-mm
On Wed, Mar 30, 2011 at 07:13:42AM +0900, Minchan Kim wrote:
> On Wed, Mar 30, 2011 at 4:05 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > Hi Rik, Hugh and everyone,
> >
> > On Tue, Mar 29, 2011 at 11:35:09AM -0400, Rik van Riel wrote:
> >> On 03/29/2011 12:36 AM, James Bottomley wrote:
> >> > Hi All,
> >> >
> >> > Since LSF is less than a week away, the programme committee put together
> >> > a just in time preliminary agenda for LSF. As you can see there is
> >> > still plenty of empty space, which you can make suggestions
> >>
> >> There have been a few patches posted upstream by people for whom
> >> page allocation latency is a concern.
> >>
> >> It may be worthwhile to have a short discussion on what
> >> we can do to keep page allocation (and direct reclaim?)
> >> latencies down to a minimum, reducing the slowdown that
> >> direct reclaim introduces on some workloads.
> >
> > I don't see the patches you refer to, but checking the schedule we have
> > a slot with Mel & Minchan about "Reclaim, compaction and LRU
> > ordering". Compaction only applies to high-order allocations and it
> > changes nothing for PAGE_SIZE allocations, but it surely has lower
> > latency than the older lumpy reclaim logic, so overall it should be a
> > net improvement over what we had before.
> >
> > Should the latency issues be discussed in that track?
>
> It's okay with me. The LRU ordering issue wouldn't take much time,
> but I am not sure Mel would have enough time. :)
>
What might be worth discussing on LRU ordering is encountering dirty
pages at the end of the LRU. This is a long-standing issue, and patches
have been merged to mitigate the problem since the last LSF/MM. For
example, [e11da5b4: tracing, vmscan: add trace events for LRU list
shrinking] was the beginning of a series that added some tracing around
catching when this happened and mitigated it somewhat (at least
according to the report included in that changelog).
Since this happened after the last LSF/MM, it might be worth
re-discussing whether the dirty-pages-at-end-of-LRU problem has been
mitigated. The last major bug report that I'm aware of in that area was
due to compaction rather than reclaim, but that could just mean people
have given up raising the issue.
A trickier subject on LRU ordering is to consider whether we are
recycling pages through the LRU too aggressively and aging too quickly.
There have been some patches in this area recently, but it's not really
clear whether we are happy with how the LRU lists age at the moment.
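[Editor's note: to make the rotation cost concrete, here is a toy model
in plain C. It is purely illustrative and bears no relation to the real
kernel structures: a scan takes pages from the tail of the inactive
list, frees the clean ones, and can only "rotate" dirty ones back to
wait for writeback, so a dirty tail burns scan budget without
reclaiming anything.]

```c
#include <assert.h>

#define NPAGES 8

struct page { int dirty; int reclaimed; };

/*
 * Toy reclaim pass: scan up to nr_to_scan pages from the tail
 * (highest index) of the list.  Clean pages are reclaimed; dirty
 * pages are only counted as rotated (a real list would move them
 * back to the head to wait for writeback).
 */
static int shrink_tail(struct page lru[], int n, int nr_to_scan,
                       int *rotated)
{
        int reclaimed = 0;

        *rotated = 0;
        for (int i = n - 1; i >= 0 && nr_to_scan > 0; i--, nr_to_scan--) {
                if (lru[i].dirty) {
                        (*rotated)++;
                } else if (!lru[i].reclaimed) {
                        lru[i].reclaimed = 1;
                        reclaimed++;
                }
        }
        return reclaimed;
}
```

With four dirty pages parked at the tail, a scan of four pages reclaims
nothing and rotates all four; that wasted scanning is the latency cost
being discussed.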
> Regarding reclaim latency, I sent a patch a while ago:
> http://marc.info/?l=linux-mm&m=129187231129887&w=4
>
Andy Whitcroft also posted patches ages ago that were related to lumpy reclaim
which would capture high-order pages being reclaimed for the exclusive use
of the reclaimer. It was never shown to be necessary though. I'll read this
thread in a bit because I'm curious to see why it came up now.
> And some people on the embedded side are concerned about latency:
> they would rather get an OOM than eviction of the working set and the
> nondeterministic latency of reclaim.
>
> As another latency-related issue, there is OOM.
> To accelerate the victim task's exit, we raised the priority of the
> victim process, but that had a problem, so Kosaki decided to revert
> the patch. It is closely related to the latency issue.
>
I think we should be very wary of conflating OOM latency, reclaim latency and
allocation latency as they are very different things with different causes.
> In addition, Kame and I each sent a patch to prevent forkbombs. Kame's
> approach is to track the history of the mm, and mine is to use sysrq
> to kill recently created tasks. The approaches have pros and cons,
> but nobody seems to be interested in forkbomb protection, so I would
> like to hear others' opinions on whether we really need it.
>
> I am not sure this could become a topic for LSF/MM.
> If it is appropriate, I would like to discuss the above issues in the
> "Reclaim, compaction and LRU ordering" slot.
>
I'd prefer to see OOM-related issues treated as a separate-but-related
problem if possible so;
1. LRU ordering - are we aging pages properly or recycling through the
list too aggressively? The high_wmark*8 change made recently was
partially about list rotations and the associated cost so it might
be worth listing out whatever issues people are currently aware of.
2. LRU ordering - dirty pages at the end of the LRU. Are we still going
the right direction on this or is it still a shambles?
3. Compaction latency, other issues (IRQ disabling latency was the last
one I'm aware of)
4. OOM killing and OOM latency - Whole load of churn going on in there.
?
--
Mel Gorman
SUSE Labs
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-30 16:17 ` Mel Gorman
@ 2011-03-30 16:49 ` Andrea Arcangeli
2011-03-31 0:42 ` Hugh Dickins
2011-03-31 9:30 ` Mel Gorman
2011-03-30 16:59 ` Dan Magenheimer
1 sibling, 2 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-30 16:49 UTC (permalink / raw)
To: Mel Gorman; +Cc: Minchan Kim, lsf, linux-mm
Hi Mel,
On Wed, Mar 30, 2011 at 05:17:16PM +0100, Mel Gorman wrote:
> Andy Whitcroft also posted patches ages ago that were related to lumpy reclaim
> which would capture high-order pages being reclaimed for the exclusive use
> of the reclaimer. It was never shown to be necessary though. I'll read this
> thread in a bit because I'm curious to see why it came up now.
Ok ;).
I wouldn't spend too much time on lumpy; I'd rather spend it on other
issues like the LRU ordering one you mentioned, and on compaction
(compaction in kswapd still has no known solution; my last attempt
failed and we backed off to no compaction in kswapd, which is safe but
doesn't help for GFP_ATOMIC order > 0 allocations).
Lumpy should go away in a few releases IIRC.
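[Editor's note: for readers following along, "order n" means a request
for 2^n physically contiguous pages, which is exactly what compaction
exists to provide; order 0 is a single page and never needs it. A
trivial sketch of the arithmetic in plain C; the 4096-byte page size is
an assumption for the example, not universal.]

```c
#include <assert.h>

#define EXAMPLE_PAGE_SIZE 4096UL  /* assumed for illustration */

/* An order-n allocation covers 2^n contiguous pages. */
static unsigned long alloc_bytes(unsigned int order)
{
        return EXAMPLE_PAGE_SIZE << order;
}
```

A GFP_ATOMIC caller cannot sleep, so it can neither reclaim nor run
compaction itself; for order > 0 it depends on contiguous blocks
already being free, which is why kswapd is the natural place to keep
them available.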
> I think we should be very wary of conflating OOM latency, reclaim latency and
> allocation latency as they are very different things with different causes.
I think it's better to stick to successful allocation latencies only
here, or at most to -ENOMEM from order > 0 allocations, which normally
never happens with compaction (as opposed to the time it takes before
declaring OOM and triggering the OOM killer).
> I'd prefer to see OOM-related issues treated as a separate-but-related
> problem if possible so;
>
> 1. LRU ordering - are we aging pages properly or recycling through the
> list too aggressively? The high_wmark*8 change made recently was
> partially about list rotations and the associated cost so it might
> be worth listing out whatever issues people are currently aware of.
> 2. LRU ordering - dirty pages at the end of the LRU. Are we still going
> the right direction on this or is it still a shambles?
> 3. Compaction latency, other issues (IRQ disabling latency was the last
> one I'm aware of)
> 4. OOM killing and OOM latency - Whole load of churn going on in there.
I prefer it too. The OOM killing is already covered in the OOM topic
from Hugh, and we can add "OOM detection latency" to it.
Thanks,
Andrea
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-30 16:49 ` Andrea Arcangeli
@ 2011-03-31 0:42 ` Hugh Dickins
2011-03-31 15:15 ` Andrea Arcangeli
2011-03-31 9:30 ` Mel Gorman
1 sibling, 1 reply; 17+ messages in thread
From: Hugh Dickins @ 2011-03-31 0:42 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Mel Gorman, Minchan Kim, lsf, linux-mm
On Wed, Mar 30, 2011 at 9:49 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Wed, Mar 30, 2011 at 05:17:16PM +0100, Mel Gorman wrote:
>> I'd prefer to see OOM-related issues treated as a separate-but-related
>> problem if possible so;
>
> I prefer it too. The OOM killing is already covered in the OOM topic
> from Hugh, and we can add "OOM detection latency" to it.
Thanks for adjusting and updating the schedule, Andrea. I'm way
behind in my mailbox and everything else, that was a real help.
But last night I did remove that OOM and fork-bomb topic you
mischievously added in my name ;-) Yes, I did propose an OOM topic
against my name in the working list I sent you a few days ago, but by
Monday had concluded that it would be pretty silly for me to get up
and spout the few things I have to say about it, in the absence of
every one of the people most closely involved and experienced. And on
fork-bombs I've even less to say.
Of course, none of these sessions are for those named facilitators to
lecture the assembled company for half an hour. We can bring it back
if there's demand on the day: but right now I'd prefer to keep it as
an empty slot, to be decided when the time comes. After all, those FS
people, they appear to thrive on empty slots!
Hugh
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-31 0:42 ` Hugh Dickins
@ 2011-03-31 15:15 ` Andrea Arcangeli
0 siblings, 0 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-31 15:15 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Mel Gorman, Minchan Kim, lsf, linux-mm
On Wed, Mar 30, 2011 at 05:42:15PM -0700, Hugh Dickins wrote:
> On Wed, Mar 30, 2011 at 9:49 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Wed, Mar 30, 2011 at 05:17:16PM +0100, Mel Gorman wrote:
> >> I'd prefer to see OOM-related issues treated as a separate-but-related
> >> problem if possible so;
> >
> > I prefer it too. The OOM killing is already covered in the OOM topic
> > from Hugh, and we can add "OOM detection latency" to it.
>
> Thanks for adjusting and updating the schedule, Andrea. I'm way
> behind in my mailbox and everything else, that was a real help.
Glad I could help.
> But last night I did remove that OOM and fork-bomb topic you
> mischievously added in my name ;-) Yes, I did propose an OOM topic
> against my name in the working list I sent you a few days ago, but by
> Monday had concluded that it would be pretty silly for me to get up
> and spout the few things I have to say about it, in the absence of
> every one of the people most closely involved and experienced. And on
> fork-bombs I've even less to say.
>
> Of course, none of these sessions are for those named facilitators to
> lecture the assembled company for half an hour. We can bring it back
> if there's demand on the day: but right now I'd prefer to keep it as
> an empty slot, to be decided when the time comes. After all, those FS
> people, they appear to thrive on empty slots!
Ok, and I agree that the MM track is pretty dense already ;).
Thanks,
Andrea
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-30 16:49 ` Andrea Arcangeli
2011-03-31 0:42 ` Hugh Dickins
@ 2011-03-31 9:30 ` Mel Gorman
2011-03-31 16:36 ` Andrea Arcangeli
1 sibling, 1 reply; 17+ messages in thread
From: Mel Gorman @ 2011-03-31 9:30 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Minchan Kim, lsf, linux-mm
On Wed, Mar 30, 2011 at 06:49:06PM +0200, Andrea Arcangeli wrote:
> Hi Mel,
>
> On Wed, Mar 30, 2011 at 05:17:16PM +0100, Mel Gorman wrote:
> > Andy Whitcroft also posted patches ages ago that were related to lumpy reclaim
> > which would capture high-order pages being reclaimed for the exclusive use
> > of the reclaimer. It was never shown to be necessary though. I'll read this
> > thread in a bit because I'm curious to see why it came up now.
>
> Ok ;).
>
> About lumpy, I wouldn't spend too much time on it,
I hadn't intended to, but the context of the capture patches was lumpy, so
it'd be the starting point for anyone looking at the old patches. If someone
wanted to go in that direction, it would need to be adapted for compaction,
reclaim/compaction and plain reclaim.
> I'd rather spend time on
> other issues like the one you mentioned on LRU ordering, and
> compaction (compaction in kswapd still has no settled solution; my
> last attempt failed and we backed off to no compaction in kswapd, which
> is safe but doesn't help GFP_ATOMIC order > 0 allocations).
>
Agreed. It may also be worth a quick discussion on *how* people are
currently evaluating their reclaim-related changes, be it via tracepoints,
systemtap, a patched kernel, or indirect measures such as faults.
> Lumpy should go away in a few releases IIRC.
>
> > I think we should be very wary of conflating OOM latency, reclaim latency and
> > allocation latency as they are very different things with different causes.
>
> I think it's better to stick to successful allocation latencies only
> here, or at most -ENOMEM from order > 0, which normally never happens
> with compaction (not the time it takes before declaring OOM and
> triggering the OOM killer).
>
Sounds reasonable. I could briefly discuss the ftrace-based scripts I use
to dump out high-order allocation latencies, as they might be useful to
others if this is the area they are looking at.
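For the flavour of such a measurement, here is a minimal, purely illustrative
sketch: it parses function_graph tracer output for the duration of
__alloc_pages_nodemask calls. The tracer output format and function name are
the kernel's, but the script itself is an assumption, not the actual tooling
being referred to (a real run would read /sys/kernel/debug/tracing/trace
rather than a literal string):

```python
import re

# Sample function_graph tracer output, filtered on __alloc_pages_nodemask.
# A real script would read /sys/kernel/debug/tracing/trace instead.
SAMPLE_TRACE = """\
 0)   3.032 us    |  __alloc_pages_nodemask();
 1) ! 105.264 us  |  __alloc_pages_nodemask();
 0)   2.110 us    |  __alloc_pages_nodemask();
 2) + 12.408 us   |  __alloc_pages_nodemask();
"""

# function_graph prints the per-call duration before the '|' column.
DURATION_RE = re.compile(r'([\d.]+) us\s+\|\s+__alloc_pages_nodemask')

def alloc_latencies(trace_text):
    """Extract per-call allocation latencies, in microseconds."""
    return [float(m.group(1)) for m in DURATION_RE.finditer(trace_text)]

lats = alloc_latencies(SAMPLE_TRACE)
print("calls=%d max=%.3fus avg=%.3fus" % (len(lats), max(lats), sum(lats) / len(lats)))
```

The worst-case number is usually the interesting one for latency work: a
single slow call (here the 105 us outlier) is what a GFP_ATOMIC user notices,
not the average.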
> > I'd prefer to see OOM-related issues treated as a separate-but-related
> > problem if possible so;
> >
> > 1. LRU ordering - are we aging pages properly or recycling through the
> > list too aggressively? The high_wmark*8 change made recently was
> > partially about list rotations and the associated cost so it might
> > be worth listing out whatever issues people are currently aware of.
> > 2. LRU ordering - dirty pages at the end of the LRU. Are we still going
> > the right direction on this or is it still a shambles?
> > 3. Compaction latency, other issues (IRQ disabling latency was the last
> > one I'm aware of)
> > 4. OOM killing and OOM latency - Whole load of churn going on in there.
>
> I prefer it too. The OOM killing is already covered in OOM topic from
> Hugh, and we can add "OOM detection latency" to it.
>
Also sounds good to me.
--
Mel Gorman
SUSE Labs
* Re: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-31 9:30 ` Mel Gorman
@ 2011-03-31 16:36 ` Andrea Arcangeli
0 siblings, 0 replies; 17+ messages in thread
From: Andrea Arcangeli @ 2011-03-31 16:36 UTC (permalink / raw)
To: Mel Gorman; +Cc: Minchan Kim, lsf, linux-mm
On Thu, Mar 31, 2011 at 10:30:53AM +0100, Mel Gorman wrote:
> Sounds reasonable. I could briefly discuss the ftrace-based scripts I use
> to dump out high-order allocation latencies, as they might be useful to
> others if this is the area they are looking at.
I think it's interesting.
> Also sounds good to me.
Ok. BTW, the OOM topic has been removed from the schedule for now and has
returned to being an empty slot, but as Hugh mentioned, it can be brought
back in as needed on the day.
* RE: [Lsf] [LSF][MM] page allocation & direct reclaim latency
2011-03-30 16:17 ` Mel Gorman
2011-03-30 16:49 ` Andrea Arcangeli
@ 2011-03-30 16:59 ` Dan Magenheimer
1 sibling, 0 replies; 17+ messages in thread
From: Dan Magenheimer @ 2011-03-30 16:59 UTC (permalink / raw)
To: Mel Gorman, Minchan Kim; +Cc: lsf, linux-mm
> 1. LRU ordering - are we aging pages properly or recycling through the
> list too aggressively? The high_wmark*8 change made recently was
> partially about list rotations and the associated cost so it might
> be worth listing out whatever issues people are currently aware of.
Here's one: zcache (and tmem, RAMster and SSmem) is essentially a level-2
cache for clean page cache pages that have been reclaimed. (Or, more
precisely, the page FRAME has been reclaimed, but the contents have
been squirreled away in zcache.)
Just as with the active/inactive lists, you'd ideally like to ensure
zcache gets filled with pages that have some probability of being used
in the future, not pages you KNOW won't be used but have been left on
the inactive list to rot until they are reclaimed.
There's also a sizing issue... under memory pressure, pages in
active/inactive have different advantages/disadvantages vs
pages in zcache/etc... What tuning knobs exist already?
I hacked up a (non-upstreamable) patch to only "put" clean pages that had
previously been on the active list, to play with this a bit, but didn't
pursue it.
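To make that policy concrete, here is a toy two-level model of the idea
(illustrative Python, not kernel code; the class and its structure are
invented for this sketch): pages evicted from a small first level are "put"
into a zcache-like second level only if they were ever active, so
never-touched-again inactive pages are simply dropped:

```python
from collections import OrderedDict

class TwoLevelCache:
    """Toy model: L1 tracks an 'active' flag per page; on L1 eviction,
    only pages that were ever active get 'put' into a zcache-like L2."""
    def __init__(self, l1_size, l2_size):
        self.l1 = OrderedDict()   # page -> was_active flag, LRU order
        self.l2 = OrderedDict()   # clean-page contents squirreled away
        self.l1_size, self.l2_size = l1_size, l2_size

    def access(self, page):
        if page in self.l1:
            self.l1.move_to_end(page)
            self.l1[page] = True          # second touch: treat as active
            return "l1-hit"
        hit = page in self.l2
        if hit:
            del self.l2[page]             # promote back into L1
        self._insert_l1(page)
        return "l2-hit" if hit else "miss"

    def _insert_l1(self, page):
        if len(self.l1) >= self.l1_size:
            victim, was_active = self.l1.popitem(last=False)
            if was_active:                # the hacked-patch policy
                if len(self.l2) >= self.l2_size:
                    self.l2.popitem(last=False)
                self.l2[victim] = True
        self.l1[page] = False             # newly inserted pages start inactive

c = TwoLevelCache(l1_size=2, l2_size=4)
c.access("a"); c.access("a")      # "a" becomes active
c.access("b"); c.access("c")      # "a" evicted -> put into L2
print(c.access("a"))              # -> l2-hit
c.access("d"); c.access("e")      # "b"/"c" were never active: dropped
```

The sizing question in the text maps directly onto l1_size vs l2_size here:
under pressure, every frame given to L2 is a frame taken from the LRU lists,
which is exactly the trade-off with no obvious existing tuning knob.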
Anyway, would like to include this in the above discussion.
Thanks,
Dan