linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: CAI Qian <caiqian@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-mm <linux-mm@kvack.org>,
	Michel Lespinasse <walken@google.com>,
	Rik van Riel <riel@redhat.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: oom is broken in mmotm 2010-11-09-15-31 tree?
Date: Wed, 1 Dec 2010 23:26:54 -0500 (EST)	[thread overview]
Message-ID: <414645031.1024011291264014168.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> (raw)
In-Reply-To: <AANLkTi=tfDQhcNwhDeLz9jM5QHjDR_8WL+v6AWU3SJpZ@mail.gmail.com>


> Interesting. That commit is not supposed to make any semantic
> difference at all. And even if we do end up in the retry path, the
> arch/x86/mm/fault.c code is very explicitly designed so that it
> retries only _once_.
> 
> Michel, any ideas? I could see problems with the mmap_sem if
> VM_FAULT_OOM is set at the same time as VM_FAULT_RETRY, but I can't
> see how that could ever happen.
> 
> Anybody?
> 
> CAI, can you get any output from sysrq-W when this happens?
Hi Linus, please see below,

CAI Qian

[  580.191996] SysRq : Show Blocked State
[  580.192024]   task                        PC stack   pid father
[  580.192024] Sched Debug Version: v0.09, 2.6.36+ #22
[  580.192024] now at 580203.234510 msecs
[  580.192024]   .jiffies                                 : 4295247509
[  580.192024]   .sysctl_sched_latency                    : 18.000000
[  580.192024]   .sysctl_sched_min_granularity            : 2.250000
[  580.192024]   .sysctl_sched_wakeup_granularity         : 3.000000
[  580.192024]   .sysctl_sched_child_runs_first           : 0
[  580.192024]   .sysctl_sched_features                   : 31855
[  580.192024]   .sysctl_sched_tunable_scaling            : 1 (logaritmic)
[  580.192024] 
[  580.192024] cpu#0, 2826.528 MHz
[  580.192024]   .nr_running                    : 1
[  580.192024]   .load                          : 1024
[  580.192024]   .nr_switches                   : 35799
[  580.192024]   .nr_load_updates               : 128515
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.247545
[  580.192024]   .curr->pid                     : 1366
[  580.192024]   .clock                         : 580191.025058
[  580.192024]   .cpu_load[0]                   : 1024
[  580.192024]   .cpu_load[1]                   : 1016
[  580.192024]   .cpu_load[2]                   : 957
[  580.192024]   .cpu_load[3]                   : 872
[  580.192024]   .cpu_load[4]                   : 799
[  580.192024]   .yld_count                     : 140
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 44224
[  580.192024]   .sched_goidle                  : 6268
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 11413
[  580.192024]   .ttwu_local                    : 8684
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[0]:/
[  580.192024]   .exec_clock                    : 125215.744234
[  580.192024]   .MIN_vruntime                  : 0.000001
[  580.192024]   .min_vruntime                  : 45692.541683
[  580.192024]   .max_vruntime                  : 0.000001
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 0.000000
[  580.192024]   .nr_running                    : 1
[  580.192024]   .load                          : 1024
[  580.192024]   .nr_spread_over                : 4
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[0]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024] R       sendmail  1366     45692.541683      9276   120     45692.541683     46469.996943    411694.347209 /
[  580.192024] 
[  580.192024] cpu#1, 2826.528 MHz
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_switches                   : 46514
[  580.192024]   .nr_load_updates               : 130936
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.247917
[  580.192024]   .curr->pid                     : 1295
[  580.192024]   .clock                         : 580557.002284
[  580.192024]   .cpu_load[0]                   : 2048
[  580.192024]   .cpu_load[1]                   : 1520
[  580.192024]   .cpu_load[2]                   : 1679
[  580.192024]   .cpu_load[3]                   : 1513
[  580.192024]   .cpu_load[4]                   : 1688
[  580.192024]   .yld_count                     : 124
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 54526
[  580.192024]   .sched_goidle                  : 6063
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 9145
[  580.192024]   .ttwu_local                    : 5902
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[1]:/
[  580.192024]   .exec_clock                    : 122340.374690
[  580.192024]   .MIN_vruntime                  : 51807.120538
[  580.192024]   .min_vruntime                  : 51807.120538
[  580.192024]   .max_vruntime                  : 51807.120538
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 6114.578855
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_spread_over                : 1
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[1]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/1:1    30     51798.120538      3390   120     51798.120538        14.488166    578377.071351 /
[  580.192024] Rhald-addon-inpu  1295     52383.947330      3612   120     52388.948353     21427.078504    454223.044707 /
[  580.192024]             sshd  1494     51807.120538      4985   120     51807.120538     41792.344148     43008.912088 /
[  580.192024] 
[  580.192024] cpu#2, 2826.528 MHz
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 3072
[  580.192024]   .nr_switches                   : 38687
[  580.192024]   .nr_load_updates               : 128857
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.248178
[  580.192024]   .curr->pid                     : 1002
[  580.192024]   .clock                         : 580830.001334
[  580.192024]   .cpu_load[0]                   : 3072
[  580.192024]   .cpu_load[1]                   : 2688
[  580.192024]   .cpu_load[2]                   : 2231
[  580.192024]   .cpu_load[3]                   : 2408
[  580.192024]   .cpu_load[4]                   : 2606
[  580.192024]   .yld_count                     : 0
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 49977
[  580.192024]   .sched_goidle                  : 4442
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 7958
[  580.192024]   .ttwu_local                    : 5710
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[2]:/
[  580.192024]   .exec_clock                    : 122185.543310
[  580.192024]   .MIN_vruntime                  : 49939.236793
[  580.192024]   .min_vruntime                  : 49948.236793
[  580.192024]   .max_vruntime                  : 49939.236793
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 4255.695110
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 3072
[  580.192024]   .nr_spread_over                : 5
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[2]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/2:1    31     49939.236793      3110   120     49939.236793        16.244410    577598.865226 /
[  580.192024]          kswapd0    33     49939.236793      5021   120     49939.236793     39855.128906    456899.562827 /
[  580.192024] R     irqbalance  1002     50700.231326     10599   120     50705.232536     37995.007739    451842.051677 /
[  580.192024] 
[  580.192024] cpu#3, 2826.528 MHz
[  580.192024]   .nr_running                    : 2
[  580.192024]   .load                          : 2048
[  580.192024]   .nr_switches                   : 28514
[  580.192024]   .nr_load_updates               : 142983
[  580.192024]   .nr_uninterruptible            : 0
[  580.192024]   .next_balance                  : 4295.248441
[  580.192024]   .curr->pid                     : 1517
[  580.192024]   .clock                         : 581105.001367
[  580.192024]   .cpu_load[0]                   : 2048
[  580.192024]   .cpu_load[1]                   : 2048
[  580.192024]   .cpu_load[2]                   : 6702
[  580.192024]   .cpu_load[3]                   : 6746
[  580.192024]   .cpu_load[4]                   : 5982
[  580.192024]   .yld_count                     : 179
[  580.192024]   .sched_switch                  : 0
[  580.192024]   .sched_count                   : 38579
[  580.192024]   .sched_goidle                  : 5981
[  580.192024]   .avg_idle                      : 1000000
[  580.192024]   .ttwu_count                    : 8881
[  580.192024]   .ttwu_local                    : 7007
[  580.192024]   .bkl_count                     : 0
[  580.192024] 
[  580.192024] cfs_rq[3]:/
[  580.192024]   .exec_clock                    : 135810.747600
[  580.192024]   .MIN_vruntime                  : 64636.582469
[  580.192024]   .min_vruntime                  : 64645.582469
[  580.192024]   .max_vruntime                  : 64636.582469
[  580.192024]   .spread                        : 0.000000
[  580.192024]   .spread0                       : 18953.040786
[  580.192024]   .nr_running                    : 3
[  580.192024]   .load                          : 8148
[  580.192024]   .nr_spread_over                : 4
[  580.192024]   .shares                        : 0
[  580.192024] 
[  580.192024] rt_rq[3]:/
[  580.192024]   .rt_nr_running                 : 0
[  580.192024]   .rt_throttled                  : 0
[  580.192024]   .rt_time                       : 0.000000
[  580.192024]   .rt_runtime                    : 950.000000
[  580.192024] 
[  580.192024] runnable tasks:
[  580.192024]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[  580.192024] ----------------------------------------------------------------------------------------------------------
[  580.192024]      kworker/3:1    32     64636.582469      4854   120     64636.582469       856.962108    576060.366837 /
[  580.192024]          audispd   952     64636.582469        96   112     64636.582469      7948.252669    553967.243338 /
[  580.192024] R          oom01  1517     65744.576906        64   120     65748.577669     11613.730238         0.000000 /
[  580.192024] 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-12-02  4:27 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-01  2:44 oom is broken in mmotm 2010-11-09-15-31 tree? CAI Qian
2010-12-01 19:29 ` CAI Qian
2010-12-01 20:15   ` Linus Torvalds
2010-12-01 21:40     ` Michel Lespinasse
2010-12-02  4:26     ` CAI Qian [this message]
     [not found] <1415319777.1020071291259410217.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
2010-12-02  3:11 ` caiqian
     [not found] <1043135380.1026761291266384009.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
2010-12-02  5:07 ` caiqian
2010-12-02  5:12   ` Linus Torvalds
2010-12-02  6:40     ` CAI Qian
2010-12-02  6:48       ` CAI Qian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=414645031.1024011291264014168.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com \
    --to=caiqian@redhat.com \
    --cc=fengguang.wu@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).