Re: [PATCH] Revert oom rewrite series

* Re: [PATCH] Revert oom rewrite series
  2010-11-14 21:33     ` David Rientjes
@ 2010-11-15  3:26       ` Figo.zhang
  2010-11-15 10:14         ` David Rientjes
  0 siblings, 1 reply; 6+ messages in thread
From: Figo.zhang @ 2010-11-15  3:26 UTC (permalink / raw)
  To: David Rientjes
  Cc: KOSAKI Motohiro, Figo.zhang, lkml, linux-mm@kvack.org,
	Andrew Morton, Linus Torvalds

 >Nothing to say, really.  Seems each time we're told about a bug or a
 >regression, David either fixes the bug or points out why it wasn't a
 >bug or why it wasn't a regression or how it was a deliberate behaviour
 >change for the better.

 >I just haven't seen any solid reason to be concerned about the state of
 >the current oom-killer, sorry.

 >I'm concerned that you're concerned!  A lot.  When someone such as
 >yourself is unhappy with part of MM then I sit up and pay attention.
 >But after all this time I simply don't understand the technical issues
 >which you're seeing here.

We are only talking about technical issues in the oom-killer.

I doubt the value of a new rewrite whose author cannot provide any
evidence or experimental results. Why did you do it? What is the
prominent change in your new algorithm?

As KOSAKI Motohiro said, "you removed CAP_SYS_RESOURCE condition with 
ZERO explanation".

David just said to please use the userspace tunable oom_score_adj for
protection. But may I ask some questions:

1. What is the innovation in your new algorithm? The old one offered the
same kind of user tunable, oom_adj.

2. If a server such as a db-server or financial-server has very important
processes (such as root or hardware-access processes) that need
protection, you leave it to the administrator to find out which processes
should be protected. You will drive the financial-server administrator
crazy, and lose a lot of money!! ^~^

3. I saw your email on LKML, where you said:
"I have repeatedly said that the oom killer no longer kills KDE when run 
on my desktop in the presence of a memory hogging task that was written 
specifically to oom the machine."
http://thread.gmane.org/gmane.linux.kernel.mm/48998

So you only tested your new oom-killer algorithm on your desktop with
KDE. Have you provided the details of how you ran the test? Can anyone
repeat the experiment and get the same result you describe?

As KOSAKI Motohiro said, in the real world there are 5-6 kinds of
workload to consider: brain simulation, embedded, desktop, web server,
db server, hpc, finance. Different workloads certainly make a big
difference. Have you done those experiments?

I think technology should be based on experiment, not on imagination.

Best,
Figo.zhang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH] Revert oom rewrite series
       [not found]       ` <AANLkTik_SDaiu2eQsJ9+4ywLR5K5V1Od-hwop6gwas3F@mail.gmail.com>
@ 2010-11-15  4:41         ` Figo.zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Figo.zhang @ 2010-11-15  4:41 UTC (permalink / raw)
  To: Andrew Morton, David Rientjes
  Cc: figo zhang, KOSAKI Motohiro, Linus Torvalds, LKML, Ying Han,
	Bodo Eggert, Mandeep Singh Baines, linux-mm@kvack.org


* Re: [PATCH] Revert oom rewrite series
  2010-11-15  3:26       ` [PATCH] Revert oom rewrite series Figo.zhang
@ 2010-11-15 10:14         ` David Rientjes
  2010-11-15 10:57           ` Alan Cox
  0 siblings, 1 reply; 6+ messages in thread
From: David Rientjes @ 2010-11-15 10:14 UTC (permalink / raw)
  To: Figo.zhang
  Cc: KOSAKI Motohiro, Figo.zhang, lkml, linux-mm@kvack.org,
	Andrew Morton, Linus Torvalds

On Mon, 15 Nov 2010, Figo.zhang wrote:

> I doubt the value of a new rewrite whose author cannot provide any evidence
> or experimental results. Why did you do it? What is the prominent change in
> your new algorithm?
> 
> As KOSAKI Motohiro said, "you removed CAP_SYS_RESOURCE condition with ZERO
> explanation".
> 
> David just said to please use the userspace tunable oom_score_adj for
> protection. But may I ask some questions:
> 
> 1. What is the innovation in your new algorithm? The old one offered the same
> kind of user tunable, oom_adj.
> 

The goal was to make the oom killer heuristic as predictable as possible 
and to kill the most memory-hogging task, so that the oom killer does not 
have to be invoked again and needlessly kill several tasks.

oom_score_adj was introduced in place of oom_adj for several reasons, as 
pointed out before:

 - give it a unit (proportion of available memory), oom_adj had no unit,

 - allow it to work on a linear scale for more control over 
   prioritization, oom_adj had an exponential scale,

 - give it a much higher resolution so it can be fine-tuned, it works with 
   a granularity of 0.1% of memory (~128M on a 128G machine), and

 - allow it to describe the oom killing priority of a task regardless of 
   its cpuset attachment, mempolicy, or memcg, or when their respective
   limits change.
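
To make the granularity figure above concrete, here is a rough sketch of the arithmetic, assuming one oom_score_adj unit corresponds to 1/1000 of the memory available to the task. The helper names and the 128 GiB machine size are purely illustrative and are not kernel interfaces.

```python
# Illustrative arithmetic only. adj_unit_bytes() and adj_for_bytes() are
# hypothetical helpers, not kernel or libc interfaces.

GIB = 1024 ** 3
MIB = 1024 ** 2


def adj_unit_bytes(available_bytes: int) -> int:
    """Memory represented by one oom_score_adj unit (0.1% of available memory)."""
    return available_bytes // 1000


def adj_for_bytes(bias_bytes: int, available_bytes: int) -> int:
    """oom_score_adj value that biases a task by roughly bias_bytes.

    The result is clamped to the tunable's range of [-1000, 1000].
    """
    adj = round(bias_bytes * 1000 / available_bytes)
    return max(-1000, min(1000, adj))


if __name__ == "__main__":
    unit = adj_unit_bytes(128 * GIB)
    print(f"one unit on a 128 GiB machine: about {unit / MIB:.0f} MiB")
    print("adj for a 64 GiB bias:", adj_for_bytes(64 * GIB, 128 * GIB))
```

Because the value is a proportion rather than an absolute size, the same setting keeps its meaning when the memory available to the task changes, for example when a cpuset or memcg limit is adjusted.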

> 2. If a server such as a db-server or financial-server has very important
> processes (such as root or hardware-access processes) that need protection,
> you leave it to the administrator to find out which processes should be
> protected. You will drive the financial-server administrator crazy, and lose
> a lot of money!! ^~^
> 

You have full control over disabling a task from being considered with 
oom_score_adj, just like you did with oom_adj.  Since oom_adj is being 
deprecated over two years, you can even keep using the old interface 
until then.
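
As a minimal sketch of what "disabling a task from being considered" looks like from userspace: the tunable is exposed as /proc/<pid>/oom_score_adj, accepts values from -1000 to 1000, and -1000 exempts the task from oom killing. The function name and the proc_root parameter below are assumptions made for illustration (proc_root exists only so the sketch can be exercised against a scratch directory). Lowering the value on a real system normally requires sufficient privileges.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # task is never selected by the oom killer
OOM_SCORE_ADJ_MAX = 1000   # task is the preferred victim


def set_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> None:
    """Write an oom_score_adj value for the given pid.

    set_oom_score_adj() and proc_root are hypothetical names used for this
    sketch; only the /proc/<pid>/oom_score_adj file itself is a real interface.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(f"{value}\n")
```

An administrator protecting a critical daemon would call something like set_oom_score_adj(daemon_pid, OOM_SCORE_ADJ_MIN), where daemon_pid is whatever pid the service manager reports.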

> 3. I saw your email on LKML, where you said:
> "I have repeatedly said that the oom killer no longer kills KDE when run on my
> desktop in the presence of a memory hogging task that was written specifically
> to oom the machine."
> http://thread.gmane.org/gmane.linux.kernel.mm/48998
> 
> So you only tested your new oom-killer algorithm on your desktop with KDE.
> Have you provided the details of how you ran the test? Can anyone repeat the
> experiment and get the same result you describe?
> 

Xorg tends to be killed less because of the change to the heuristic's 
baseline, which is now based on rss and swap instead of total_vm.  This is 
separate from the issues you list above, but is a benefit to the oom 
killer that desktop users especially will notice.  I, personally, am 
interested more in the server market and that's why I looked for a more 
robust userspace tunable that would still be applicable when things like 
cpusets have a node added or removed.
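
The effect of changing the baseline can be illustrated with a toy comparison. The task names and page counts below are invented for the example and are not measurements, and the sketch deliberately ignores every other factor of either heuristic: it only shows why a process with a large mapped address space but modest resident memory stops being the top candidate once rss and swap are used instead of total_vm.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    name: str
    total_vm: int  # pages of mapped virtual address space
    rss: int       # pages resident in RAM
    swap: int      # pages swapped out


def pick_by_total_vm(tasks: List[Task]) -> str:
    """Victim if only total_vm were considered (old-style baseline)."""
    return max(tasks, key=lambda t: t.total_vm).name


def pick_by_rss_swap(tasks: List[Task]) -> str:
    """Victim if only rss + swap were considered (new-style baseline)."""
    return max(tasks, key=lambda t: t.rss + t.swap).name


# Hypothetical numbers: the display server maps a lot but touches little,
# while the memory hog actually uses what it maps.
tasks = [
    Task("Xorg", total_vm=900_000, rss=40_000, swap=5_000),
    Task("memhog", total_vm=600_000, rss=550_000, swap=30_000),
    Task("editor", total_vm=80_000, rss=20_000, swap=0),
]

if __name__ == "__main__":
    print("total_vm baseline picks:", pick_by_total_vm(tasks))
    print("rss+swap baseline picks:", pick_by_rss_swap(tasks))
```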

* Re: [PATCH] Revert oom rewrite series
  2010-11-15 10:14         ` David Rientjes
@ 2010-11-15 10:57           ` Alan Cox
  2010-11-15 20:54             ` David Rientjes
  2010-11-23  7:16             ` KOSAKI Motohiro
  0 siblings, 2 replies; 6+ messages in thread
From: Alan Cox @ 2010-11-15 10:57 UTC (permalink / raw)
  To: David Rientjes
  Cc: Figo.zhang, KOSAKI Motohiro, Figo.zhang, lkml, linux-mm@kvack.org,
	Andrew Morton, Linus Torvalds

> The goal was to make the oom killer heuristic as predictable as possible 
> and to kill the most memory-hogging task to avoid having to recall it and 
> needlessly kill several tasks.

Meta question - why is that a good thing? In a desktop environment it's
frequently wrong, in a server environment it is often wrong. We had this
before where people spend months fiddling with the vm and make it work
slightly differently and it suits their workload, then other workloads go
downhill. Then the cycle repeats.

> You have full control over disabling a task from being considered with 
> oom_score_adj just like you did with oom_adj.  Since oom_adj is 
> deprecated for two years, you can even use the old interface until then.

Which changeset added it to the Documentation directory as deprecated?

Alan

* Re: [PATCH] Revert oom rewrite series
  2010-11-15 10:57           ` Alan Cox
@ 2010-11-15 20:54             ` David Rientjes
  2010-11-23  7:16             ` KOSAKI Motohiro
  1 sibling, 0 replies; 6+ messages in thread
From: David Rientjes @ 2010-11-15 20:54 UTC (permalink / raw)
  To: Alan Cox
  Cc: Figo.zhang, KOSAKI Motohiro, Figo.zhang, lkml, linux-mm@kvack.org,
	Andrew Morton, Linus Torvalds

On Mon, 15 Nov 2010, Alan Cox wrote:

> > The goal was to make the oom killer heuristic as predictable as possible 
> > and to kill the most memory-hogging task to avoid having to recall it and 
> > needlessly kill several tasks.
> 
> Meta question - why is that a good thing. In a desktop environment it's
> frequently wrong, in a server environment it is often wrong. We had this
> before where people spend months fiddling with the vm and make it work
> slightly differently and it suits their workload, then other workloads go
> downhill. Then the cycle repeats.
> 

Most of the arbitrary heuristics were removed from oom_badness(), things 
like nice level, runtime, CAP_SYS_RESOURCE, etc., so that we only consider 
the rss and swap usage of each application in comparison to each other 
when deciding which task to kill.  We give root tasks a 3% bonus since 
they tend to be more important to the productivity or uptime of the 
machine, which did exist -- albeit with a more dramatic impact -- in the 
old heuristic.

You'll find that the new heuristic always kills the task consuming the 
largest amount of rss unless influenced by userspace via the tunables (or 
within 3% of root tasks).

We always want to kill the most memory-hogging task because it avoids 
needlessly killing additional tasks when we must immediately recall the 
oom killer because we continue to allocate memory.  If that task happens 
to be of vital importance to userspace, then the user has full control 
over tuning the oom killer priorities in such circumstances.
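
The heuristic described above can be sketched roughly as follows. This is an illustration of the description in this message, not the kernel's oom_badness() source: it assumes a score on a 0..1000 scale expressing rss plus swap as a proportion of available memory, a 3% (30 point) discount for root tasks, and oom_score_adj added on top. The function and parameter names are hypothetical.

```python
def badness(rss_pages: int, swap_pages: int, total_pages: int,
            is_root: bool = False, oom_score_adj: int = 0) -> int:
    """Approximate badness score in [0, 1000]; higher means killed first.

    Hypothetical sketch: names and exact rounding are assumptions.
    """
    if oom_score_adj == -1000:
        return 0  # task is exempt from oom killing

    # Baseline: proportion of available memory in use, in units of 0.1%.
    points = (rss_pages + swap_pages) * 1000 // total_pages

    # Root tasks get a 3% bonus, i.e. 30 points off the score.
    if is_root:
        points -= 30

    # The userspace tunable is on the same linear scale.
    points += oom_score_adj

    return max(0, min(1000, points))


def pick_victim(tasks: dict, total_pages: int) -> str:
    """Return the name of the highest-scoring task.

    tasks maps a name to a dict of keyword arguments for badness().
    """
    return max(tasks, key=lambda name: badness(total_pages=total_pages, **tasks[name]))
```

With nothing tuned, the task holding the most rss and swap wins; a root task is spared only if it is within 30 points of the next candidate, and a userspace oom_score_adj shifts a task's score by the same units.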

> > You have full control over disabling a task from being considered with 
> > oom_score_adj just like you did with oom_adj.  Since oom_adj is 
> > deprecated for two years, you can even use the old interface until then.
> 
> Which changeset added it to the Documentation directory as deprecated ?
> 

51b1bd2a was the actual change that deprecated it, which was a direct 
follow-up to a63d83f4 which actually obsoleted it.

* Re: [PATCH] Revert oom rewrite series
  2010-11-15 10:57           ` Alan Cox
  2010-11-15 20:54             ` David Rientjes
@ 2010-11-23  7:16             ` KOSAKI Motohiro
  1 sibling, 0 replies; 6+ messages in thread
From: KOSAKI Motohiro @ 2010-11-23  7:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: kosaki.motohiro, David Rientjes, Figo.zhang, Figo.zhang, lkml,
	linux-mm@kvack.org, Andrew Morton, Linus Torvalds


Sorry for the delay.

> > The goal was to make the oom killer heuristic as predictable as possible 
> > and to kill the most memory-hogging task to avoid having to recall it and 
> > needlessly kill several tasks.
> 
> Meta question - why is that a good thing. In a desktop environment it's
> frequently wrong, in a server environment it is often wrong. We had this
> before where people spend months fiddling with the vm and make it work
> slightly differently and it suits their workload, then other workloads go
> downhill. Then the cycle repeats.
> 
> > You have full control over disabling a task from being considered with 
> > oom_score_adj just like you did with oom_adj.  Since oom_adj is 
> > deprecated for two years, you can even use the old interface until then.
> 
> Which changeset added it to the Documentation directory as deprecated ?

It's insufficient.
a63d83f427fbce97a6cea0db2e64b0eb8435cd10 (oom: badness heuristic rewrite)
introduced a lot of incompatibility with oom_adj and oom_score.
Therefore I would suggest a full revert, followed by resubmitting patches
that cherry-pick the painless pieces.


Thread overview: 6+ messages
     [not found] <20101114133543.E00A.A69D9226@jp.fujitsu.com>
     [not found] ` <AANLkTikSq-qC28uurd17RGup92Kao7enCiGJkDnJG+94@mail.gmail.com>
     [not found]   ` <20101115093410.BEFD.A69D9226@jp.fujitsu.com>
     [not found]     ` <20101114181905.bc5b44f9.akpm@linux-foundation.org>
     [not found]       ` <AANLkTik_SDaiu2eQsJ9+4ywLR5K5V1Od-hwop6gwas3F@mail.gmail.com>
2010-11-15  4:41         ` [PATCH] Revert oom rewrite series Figo.zhang
2010-11-10 15:14 [PATCH v3]mm/oom-kill: direct hardware access processes should get bonus Figo.zhang
2010-11-10 15:24 ` Figo.zhang
2010-11-14  5:21   ` KOSAKI Motohiro
2010-11-14 21:33     ` David Rientjes
2010-11-15  3:26       ` [PATCH] Revert oom rewrite series Figo.zhang
2010-11-15 10:14         ` David Rientjes
2010-11-15 10:57           ` Alan Cox
2010-11-15 20:54             ` David Rientjes
2010-11-23  7:16             ` KOSAKI Motohiro
