oom is broken in mmotm 2010-11-09-15-31 tree?

* oom is broken in mmotm 2010-11-09-15-31 tree?
@ 2010-12-01  2:44 CAI Qian
  2010-12-01 19:29 ` CAI Qian
  0 siblings, 1 reply; 10+ messages in thread
From: CAI Qian @ 2010-12-01  2:44 UTC (permalink / raw)
  To: linux-mm

Hi, just a heads-up. When testing OOM on this tree, my workstation immediately stops responding to ssh, desktop actions and so on; only ping still works. I am trying to bisect, but it looks like the public git server is having problems.

# git pull
fatal: read error: Connection reset by peer

# git clone git://zen-kernel.org/kernel/mmotm.git
Cloning into mmotm...
fatal: read error: Connection reset by peer

CAI Qian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-01  2:44 CAI Qian
@ 2010-12-01 19:29 ` CAI Qian
  2010-12-01 20:15   ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: CAI Qian @ 2010-12-01 19:29 UTC (permalink / raw)
  To: linux-mm
  Cc: Michel Lespinasse, Rik van Riel, Linus Torvalds, Wu Fengguang,
	H. Peter Anvin


> Hi, just a heads-up. When testing OOM on this tree, my workstation
> immediately stops responding to ssh, desktop actions and so on; only
> ping still works. I am trying to bisect, but it looks like the public
> git server is having problems.
> 
> # git pull
> fatal: read error: Connection reset by peer
> 
> # git clone git://zen-kernel.org/kernel/mmotm.git
> Cloning into mmotm...
> fatal: read error: Connection reset by peer
It turned out that this was introduced by:

  d065bd810b6deb67d4897a14bfe21f8eb526ba99
  mm: retry page fault when blocking on disk transfer

It was reproduced by:
1) ssh to the test box.
2) try to trigger oom a few times using a malloc program there.

After that, the test box can no longer handle any OOM condition and fails to kill the memory allocation program. If I switch VCs on the test box and hit ENTER a few times at the local console, it may handle further OOMs. After reverting this one commit, the box had no problem coping with the reproducer above and always killed the allocation program correctly.

CAI Qian

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-01 19:29 ` CAI Qian
@ 2010-12-01 20:15   ` Linus Torvalds
  2010-12-01 21:40     ` Michel Lespinasse
  2010-12-02  4:26     ` CAI Qian
  0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2010-12-01 20:15 UTC (permalink / raw)
  To: CAI Qian
  Cc: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin

On Wed, Dec 1, 2010 at 11:29 AM, CAI Qian <caiqian@redhat.com> wrote:
>>
>> Hi, just a heads-up. When testing OOM on this tree, my workstation
>> immediately stops responding to ssh, desktop actions and so on; only
>> ping still works. I am trying to bisect, but it looks like the public
>> git server is having problems.
>
> It turned out that this was introduced by:
>
>  d065bd810b6deb67d4897a14bfe21f8eb526ba99
>  mm: retry page fault when blocking on disk transfer
>
> It was reproduced by:
> 1) ssh to the test box.
> 2) try to trigger oom a few times using a malloc program there.

Interesting. That commit is not supposed to make any semantic
difference at all. And even if we do end up in the retry path, the
arch/x86/mm/fault.c code is very explicitly designed so that it
retries only _once_.

Michel, any ideas? I could see problems with the mmap_sem if
VM_FAULT_OOM is set at the same time as VM_FAULT_RETRY, but I can't
see how that could ever happen.

Anybody?

CAI, can you get any output from sysrq-W when this happens?
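
For reference, a minimal sketch of requesting that dump without a keyboard, assuming root on the test box. The helper name `sysrq_trigger` is hypothetical, and its second argument exists only so the function can be pointed at a scratch file instead of the real trigger.

```shell
# Hypothetical helper: send one SysRq command key via procfs (needs root).
# $1: command key, e.g. w (show blocked tasks) or t (show all tasks)
# $2: trigger file, defaults to /proc/sysrq-trigger
sysrq_trigger() {
    printf '%s\n' "$1" > "${2:-/proc/sysrq-trigger}"
}

# Example usage (not run here):
#   sysrq_trigger w && dmesg | tail -n 200
```

The output lands in the kernel log, so it can be read back with dmesg or from a serial console if the box is otherwise unresponsive.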

                     Linus

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-01 20:15   ` Linus Torvalds
@ 2010-12-01 21:40     ` Michel Lespinasse
  2010-12-02  4:26     ` CAI Qian
  1 sibling, 0 replies; 10+ messages in thread
From: Michel Lespinasse @ 2010-12-01 21:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: CAI Qian, linux-mm, Rik van Riel, Wu Fengguang, H. Peter Anvin

On Wed, Dec 1, 2010 at 12:15 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Dec 1, 2010 at 11:29 AM, CAI Qian <caiqian@redhat.com> wrote:
>>>
>>> Hi, just a heads-up. When testing OOM on this tree, my workstation
>>> immediately stops responding to ssh, desktop actions and so on; only
>>> ping still works. I am trying to bisect, but it looks like the public
>>> git server is having problems.
>>
>> It turned out that this was introduced by:
>>
>>  d065bd810b6deb67d4897a14bfe21f8eb526ba99
>>  mm: retry page fault when blocking on disk transfer
>>
>> It was reproduced by:
>> 1) ssh to the test box.
>> 2) try to trigger oom a few times using a malloc program there.
>
> Interesting. That commit is not supposed to make any semantic
> difference at all. And even if we do end up in the retry path, the
> arch/x86/mm/fault.c code is very explicitly designed so that it
> retries only _once_.
>
> Michel, any ideas? I could see problems with the mmap_sem if
> VM_FAULT_OOM is set at the same time as VM_FAULT_RETRY, but I can't
> see how that could ever happen.
>
> Anybody?
>
> CAI, can you get any output from sysrq-W when this happens?

Things are known to be broken between
d065bd810b6deb67d4897a14bfe21f8eb526ba99 and
d88c0922fa0e2c021a028b310a641126c6d4b7dc. CAI, do you have the latter
in your tree? Also, can you test at
d065bd810b6deb67d4897a14bfe21f8eb526ba99 with
d88c0922fa0e2c021a028b310a641126c6d4b7dc cherry-picked on top?
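
A minimal sketch of those two checks, assuming a local clone that contains both commits. The function names `has_commit` and `checkout_with_fix` and the `~/mmotm` path are hypothetical; only stock git commands are used.

```shell
# Hypothetical helpers; names and paths are illustrative, not from the thread.

# Succeeds if commit $2 is reachable from $3 (default: HEAD) in the repo at $1.
has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "${3:-HEAD}"
}

# Check out commit $2 (detached HEAD) in the repo at $1,
# then cherry-pick commit $3 on top of it.
checkout_with_fix() {
    git -C "$1" checkout -q --detach "$2" &&
    git -C "$1" cherry-pick "$3"
}

# Example usage (not run here):
#   has_commit ~/mmotm d88c0922fa0e2c021a028b310a641126c6d4b7dc || echo "fix missing"
#   checkout_with_fix ~/mmotm d065bd810b6deb67d4897a14bfe21f8eb526ba99 \
#       d88c0922fa0e2c021a028b310a641126c6d4b7dc
```

The cherry-pick may stop with conflicts if the intermediate commits touched the same code; in that case the result has to be resolved by hand before building.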

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
       [not found] <1415319777.1020071291259410217.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
@ 2010-12-02  3:11 ` caiqian
  0 siblings, 0 replies; 10+ messages in thread
From: caiqian @ 2010-12-02  3:11 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Rik van Riel <riel@redhat.com>, Wu Fengguang <fengguang.wu@intel.com>,
	H. Peter Anvin <hpa@zytor.com>, Linus Torvalds


> Things are known to be broken between
> d065bd810b6deb67d4897a14bfe21f8eb526ba99 and
> d88c0922fa0e2c021a028b310a641126c6d4b7dc. CAI, do you have the latter
> in your tree? Also, can you test at
> d065bd810b6deb67d4897a14bfe21f8eb526ba99 with
> d88c0922fa0e2c021a028b310a641126c6d4b7dc cherry-picked on top?
Hi Michel, the mmotm tree did not include d88c0922fa0e2c021a028b310a641126c6d4b7dc.

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-01 20:15   ` Linus Torvalds
  2010-12-01 21:40     ` Michel Lespinasse
@ 2010-12-02  4:26     ` CAI Qian
  1 sibling, 0 replies; 10+ messages in thread
From: CAI Qian @ 2010-12-02  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin


> Interesting. That commit is not supposed to make any semantic
> difference at all. And even if we do end up in the retry path, the
> arch/x86/mm/fault.c code is very explicitly designed so that it
> retries only _once_.
> 
> Michel, any ideas? I could see problems with the mmap_sem if
> VM_FAULT_OOM is set at the same time as VM_FAULT_RETRY, but I can't
> see how that could ever happen.
> 
> Anybody?
> 
> CAI, can you get any output from sysrq-W when this happens?
Hi Linus, please see below,

CAI Qian

[  580.191996] SysRq : Show Blocked State
[  580.192024]   task                        PC stack   pid father
[  580.192024] Sched Debug Version: v0.09, 2.6.36+ #22
[  580.192024] now at 580203.234510 msecs
[  580.192024]   .jiffies                                 : 4295247509
[  580.192024]   .sysctl_sched_latency                    : 18.000000
[  580.192024]   .sysctl_sched_min_granularity            : 2.250000
[  580.192024]   .sysctl_sched_wakeup_granularity         : 3.000000
[  580.192024]   .sysctl_sched_child_runs_first           : 0
[  580.192024]   .sysctl_sched_features                   : 31855
[  580.192024]   .sysctl_sched_tunable_scaling            : 1 (logaritmic)
[  580.192024] 
[  580.192024] cpu#0, 2826.528 MHz
[  580.192024]   .nr_running                    : 1
[  580.192024]   .load                          : 1024
[  580.192024]   .nr_switches                   : 35799
[  580.192024]   .nr_load_updates               : 128515
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.247545
[  580.192024]   .curr->pid                     : 1366
[  580.192024]   .clock                         : 580191.025058
[  580.192024]   .cpu_load[0]                   : 1024
[  580.192024]   .cpu_load[1]                   : 1016
[  580.192024]   .cpu_load[2]                   : 957
[  580.192024]   .cpu_load[3]                   : 872
[  580.192024]   .cpu_load[4]                   : 799
[  580.192024]   .yld_count                     : 140
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 44224
[  580.192024]   .sched_goidle                  : 6268
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 11413
[  580.192024]   .ttwu_local                    : 8684
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[0]:/
[  580.192024]   .exec_clock                    : 125215.744234
[  580.192024]   .MIN_vruntime                  : 0.000001
[  580.192024]   .min_vruntime                  : 45692.541683
[  580.192024]   .max_vruntime                  : 0.000001
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 0.000000
[  580.192024]   .nr_running                    : 1
[  580.192024]   .load                          : 1024
[  580.192024]   .nr_spread_over                : 4
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[0]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024] R       sendmail  1366     45692.541683      9276   120     45692.541683     46469.996943    411694.347209 /
[  580.192024] 
[  580.192024] cpu#1, 2826.528 MHz
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_switches                   : 46514
[  580.192024]   .nr_load_updates               : 130936
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.247917
[  580.192024]   .curr->pid                     : 1295
[  580.192024]   .clock                         : 580557.002284
[  580.192024]   .cpu_load[0]                   : 2048
[  580.192024]   .cpu_load[1]                   : 1520
[  580.192024]   .cpu_load[2]                   : 1679
[  580.192024]   .cpu_load[3]                   : 1513
[  580.192024]   .cpu_load[4]                   : 1688
[  580.192024]   .yld_count                     : 124
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 54526
[  580.192024]   .sched_goidle                  : 6063
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 9145
[  580.192024]   .ttwu_local                    : 5902
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[1]:/
[  580.192024]   .exec_clock                    : 122340.374690
[  580.192024]   .MIN_vruntime                  : 51807.120538
[  580.192024]   .min_vruntime                  : 51807.120538
[  580.192024]   .max_vruntime                  : 51807.120538
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 6114.578855
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_spread_over                : 1
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[1]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/1:1    30     51798.120538      3390   120     51798.120538        14.488166    578377.071351 /
[  580.192024] Rhald-addon-inpu  1295     52383.947330      3612   120     52388.948353     21427.078504    454223.044707 /
[  580.192024]             sshd  1494     51807.120538      4985   120     51807.120538     41792.344148     43008.912088 /
[  580.192024] 
[  580.192024] cpu#2, 2826.528 MHz
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 3072
[  580.192024]   .nr_switches                   : 38687
[  580.192024]   .nr_load_updates               : 128857
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.248178
[  580.192024]   .curr->pid                     : 1002
[  580.192024]   .clock                         : 580830.001334
[  580.192024]   .cpu_load[0]                   : 3072
[  580.192024]   .cpu_load[1]                   : 2688
[  580.192024]   .cpu_load[2]                   : 2231
[  580.192024]   .cpu_load[3]                   : 2408
[  580.192024]   .cpu_load[4]                   : 2606
[  580.192024]   .yld_count                     : 0
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 49977
[  580.192024]   .sched_goidle                  : 4442
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 7958
[  580.192024]   .ttwu_local                    : 5710
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[2]:/
[  580.192024]   .exec_clock                    : 122185.543310
[  580.192024]   .MIN_vruntime                  : 49939.236793
[  580.192024]   .min_vruntime                  : 49948.236793
[  580.192024]   .max_vruntime                  : 49939.236793
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 4255.695110
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 3072
[  580.192024]   .nr_spread_over                : 5
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[2]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/2:1    31     49939.236793      3110   120     49939.236793        16.244410    577598.865226 /
[  580.192024]          kswapd0    33     49939.236793      5021   120     49939.236793     39855.128906    456899.562827 /
[  580.192024] R     irqbalance  1002     50700.231326     10599   120     50705.232536     37995.007739    451842.051677 /
[  580.192024] 
[  580.192024] cpu#3, 2826.528 MHz
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_switches                   : 28514
[  580.192024]   .nr_load_updates               : 142983
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.248441
[  580.192024]   .curr->pid                     : 1517
[  580.192024]   .clock                         : 581105.001367
[  580.192024]   .cpu_load[0]                   : 2048
[  580.192024]   .cpu_load[1]                   : 2048
[  580.192024]   .cpu_load[2]                   : 6702
[  580.192024]   .cpu_load[3]                   : 6746
[  580.192024]   .cpu_load[4]                   : 5982
[  580.192024]   .yld_count                     : 179
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 38579
[  580.192024]   .sched_goidle                  : 5981
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 8881
[  580.192024]   .ttwu_local                    : 7007
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[3]:/
[  580.192024]   .exec_clock                    : 135810.747600
[  580.192024]   .MIN_vruntime                  : 64636.582469
[  580.192024]   .min_vruntime                  : 64645.582469
[  580.192024]   .max_vruntime                  : 64636.582469
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 18953.040786
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 8148
[  580.192024]   .nr_spread_over                : 4
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[3]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/3:1    32     64636.582469      4854   120     64636.582469       856.962108    576060.366837 /
[  580.192024]          audispd   952     64636.582469        96   112     64636.582469      7948.252669    553967.243338 /
[  580.192024] R          oom01  1517     65744.576906        64   120     65748.577669     11613.730238         0.000000 /
[  580.192024] 

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
       [not found] <1043135380.1026761291266384009.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
@ 2010-12-02  5:07 ` caiqian
  2010-12-02  5:12   ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: caiqian @ 2010-12-02  5:07 UTC (permalink / raw)
  To: CAI Qian
  Cc: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin, Linus Torvalds


> [  580.192024]          kswapd0    33     49939.236793      5021   120
>     49939.236793     39855.128906    456899.562827 /
Following up on this: according to the SysRq-T output, kswapd0 was doing the following:

[ 2836.085008] kswapd0       R  running task        0    33      2 0x00000000
[ 2836.085008]  ffff8802276f9b10 0000000000000046 0000000000000000 ffffffff8100a84e
[ 2836.085008]  00000000000136c0 00000000000136c0 00000000000136c0 ffff88022b70c590
[ 2836.085008]  00000000000136c0 ffff8802276f9fd8 00000000000136c0 00000000000136c0
[ 2836.085008] Call Trace:
[ 2836.085008]  [<ffffffff8100a84e>] ? call_function_interrupt+0xe/0x20
[ 2836.085008]  [<ffffffff81114594>] ? mem_cgroup_del_lru_list+0x42/0x76
[ 2836.085008]  [<ffffffff8104a437>] __cond_resched+0x2a/0x35
[ 2836.085008]  [<ffffffff8146503c>] _cond_resched+0x1b/0x22
[ 2836.085008]  [<ffffffff810e03c7>] shrink_page_list+0x53/0x469
[ 2836.085008]  [<ffffffff810df7ba>] ? update_isolated_counts.clone.27+0x13d/0x15b
[ 2836.085008]  [<ffffffff810e0bcf>] shrink_inactive_list+0x22b/0x376
[ 2836.085008]  [<ffffffff810db07c>] ? determine_dirtyable_memory+0x1d/0x26
[ 2836.085008]  [<ffffffff810e12eb>] shrink_zone+0x32e/0x3ca
[ 2836.085008]  [<ffffffff814665ae>] ? _raw_spin_lock+0xe/0x10
[ 2836.085008]  [<ffffffff810e1e5d>] balance_pgdat+0x242/0x417
[ 2836.085008]  [<ffffffff810e2258>] kswapd+0x226/0x23c
[ 2836.085008]  [<ffffffff81069c8b>] ? autoremove_wake_function+0x0/0x39
[ 2836.085008]  [<ffffffff81466617>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[ 2836.085008]  [<ffffffff810e2032>] ? kswapd+0x0/0x23c
[ 2836.085008]  [<ffffffff810697da>] kthread+0x82/0x8a
[ 2836.085008]  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
[ 2836.085008]  [<ffffffff81069758>] ? kthread+0x0/0x8a
[ 2836.085008]  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-02  5:07 ` caiqian
@ 2010-12-02  5:12   ` Linus Torvalds
  2010-12-02  6:40     ` CAI Qian
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2010-12-02  5:12 UTC (permalink / raw)
  To: caiqian
  Cc: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin

On Wed, Dec 1, 2010 at 9:07 PM,  <caiqian@redhat.com> wrote:
>
>> [  580.192024]          kswapd0    33     49939.236793      5021   120
>>     49939.236793     39855.128906    456899.562827 /
> Following up on this: according to the SysRq-T output, kswapd0 was
> doing the following:

Ok, this does seem like a lot of pages are busy, so shrink_page_list
ends up just looping.

And that is indeed the bug that commit d88c0922fa0e should have fixed.

So please check whether the kernel you are running has that fix
applied to it or not.

                  Linus

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-02  5:12   ` Linus Torvalds
@ 2010-12-02  6:40     ` CAI Qian
  2010-12-02  6:48       ` CAI Qian
  0 siblings, 1 reply; 10+ messages in thread
From: CAI Qian @ 2010-12-02  6:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin


> Ok, this does seem like a lot of pages are busy, so shrink_page_list
> ends up just looping.
> 
> And that is indeed the bug that commit d88c0922fa0e should have
> fixed.
> 
> So please check whether the kernel you are running has that fix
> applied to it or not.
Indeed, I was able to reproduce it anymore after applied this patch. Thanks.

CAI Qian

* Re: oom is broken in mmotm 2010-11-09-15-31 tree?
  2010-12-02  6:40     ` CAI Qian
@ 2010-12-02  6:48       ` CAI Qian
  0 siblings, 0 replies; 10+ messages in thread
From: CAI Qian @ 2010-12-02  6:48 UTC (permalink / raw)
  To: linux-mm, Michel Lespinasse, Rik van Riel, Wu Fengguang,
	H. Peter Anvin, Linus Torvalds


> Indeed, I was able to reproduce it anymore after applied this patch.
> Thanks.
Correction - I was not able to...
