From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "He, Bo" <bo.he@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"josh@joshtriplett.org" <josh@joshtriplett.org>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
"Zhang, Jun" <jun.zhang@intel.com>,
"Xiao, Jin" <jin.xiao@intel.com>,
"Zhang, Yanmin" <yanmin.zhang@intel.com>
Subject: Re: rcu_preempt caused oom
Date: Thu, 29 Nov 2018 05:06:47 -0800 [thread overview]
Message-ID: <20181129130647.GG4170@linux.ibm.com> (raw)
In-Reply-To: <CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com>
On Thu, Nov 29, 2018 at 08:49:35AM +0000, He, Bo wrote:
> Hi,
> we test on kernel 4.19.0 on android, after run more than 24 Hours monkey stress test, we see OOM on 1/10 2G memory board, the issue is not seen on the 4.14 kernel.
> we have done some debugs:
> 1. OOM is due to the filp consume too many memory: 300M vs 2G board.
> 2. with the 120s hung task detect, most of the tasks will block at __wait_rcu_gp: wait_for_completion(&rs_array[i].completion);
> [47571.863839] Kernel panic - not syncing: hung_task: blocked tasks
> [47571.875446] CPU: 1 PID: 13626 Comm: FinalizerDaemon Tainted: G U O 4.19.0-quilt-2e5dc0ac-gf3f313245eb6 #1
> [47571.887603] Call Trace:
> [47571.890547] dump_stack+0x70/0xa5
> [47571.894456] panic+0xe3/0x241
> [47571.897977] ? wait_for_completion_timeout+0x72/0x1b0
> [47571.903830] __wait_rcu_gp+0x17b/0x180
> [47571.908226] synchronize_rcu.part.76+0x38/0x50
> [47571.913393] ? __call_rcu.constprop.79+0x3a0/0x3a0
> [47571.918948] ? __bpf_trace_rcu_invoke_callback+0x10/0x10
> [47571.925094] synchronize_rcu+0x43/0x50
> [47571.929487] evdev_detach_client+0x59/0x60
> [47571.934264] evdev_release+0x4e/0xd0
> [47571.938464] __fput+0xfa/0x1f0
> [47571.942072] ____fput+0xe/0x10
> [47571.945683] task_work_run+0x90/0xc0
> [47571.949884] exit_to_usermode_loop+0x9f/0xb0
> [47571.954855] do_syscall_64+0xfa/0x110
> [47571.959151] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 3. after enable the rcu trace, we don't see rcu_quiescent_state_report trace in a long time, we see rcu_callback: rcu_preempt will never response with the rcu_invoke_callback.
> [47572.040668] ps-12388 1d..1 47566097572us : rcu_grace_period: rcu_preempt 23716088 AccWaitCB
> [47572.040707] ps-12388 1d... 47566097621us : rcu_callback: rcu_preempt rhp=00000000783a728b func=file_free_rcu 4354/82824
> [47572.040734] ps-12388 1d..1 47566097622us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Startleaf
> [47572.040756] ps-12388 1d..1 47566097623us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Prestarted
> [47572.040778] ps-12388 1d..1 47566097623us : rcu_grace_period: rcu_preempt 23716088 AccWaitCB
> [47572.040802] ps-12388 1d... 47566097674us : rcu_callback: rcu_preempt rhp=0000000042c76521 func=file_free_rcu 4354/82825
> [47572.040824] ps-12388 1d..1 47566097676us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Startleaf
> [47572.040847] ps-12388 1d..1 47566097676us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Prestarted
> [47572.040868] ps-12388 1d..1 47566097676us : rcu_grace_period: rcu_preempt 23716088 AccWaitCB
> [47572.040895] ps-12388 1d..1 47566097716us : rcu_callback: rcu_preempt rhp=000000005e40fde2 func=avc_node_free 4354/82826
> [47572.040919] ps-12388 1d..1 47566097735us : rcu_callback: rcu_preempt rhp=00000000f80fe353 func=avc_node_free 4354/82827
> [47572.040943] ps-12388 1d..1 47566097758us : rcu_callback: rcu_preempt rhp=000000007486f400 func=avc_node_free 4354/82828
> [47572.040967] ps-12388 1d..1 47566097760us : rcu_callback: rcu_preempt rhp=00000000b87872a8 func=avc_node_free 4354/82829
> [47572.040990] ps-12388 1d... 47566097789us : rcu_callback: rcu_preempt rhp=000000008c656343 func=file_free_rcu 4354/82830
> [47572.041013] ps-12388 1d..1 47566097790us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Startleaf
> [47572.041036] ps-12388 1d..1 47566097790us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Prestarted
> [47572.041057] ps-12388 1d..1 47566097791us : rcu_grace_period: rcu_preempt 23716088 AccWaitCB
> [47572.041081] ps-12388 1d... 47566097871us : rcu_callback: rcu_preempt rhp=000000007e6c898c func=file_free_rcu 4354/82831
> [47572.041103] ps-12388 1d..1 47566097872us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Startleaf
> [47572.041126] ps-12388 1d..1 47566097872us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Prestarted
> [47572.041147] ps-12388 1d..1 47566097873us : rcu_grace_period: rcu_preempt 23716088 AccWaitCB
> [47572.041170] ps-12388 1d... 47566097945us : rcu_callback: rcu_preempt rhp=0000000032f4f174 func=file_free_rcu 4354/82832
> [47572.041193] ps-12388 1d..1 47566097946us : rcu_future_grace_period: rcu_preempt 23716088 23716092 0 0 3 Startleaf
>
> Do you have any suggestions to debug the issue?
If you do not already have CONFIG_RCU_BOOST=y set, could you please
rebuild with that?
Could you also please send your .config file?
Thanx, Paul
next prev parent reply other threads:[~2018-11-29 13:07 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-29 8:49 rcu_preempt caused oom He, Bo
2018-11-29 13:06 ` Paul E. McKenney [this message]
2018-11-29 14:27 ` Paul E. McKenney
2018-11-30 8:03 ` He, Bo
2018-11-30 14:43 ` Paul E. McKenney
2018-11-30 15:16 ` Steven Rostedt
2018-11-30 15:18 ` He, Bo
2018-11-30 16:49 ` Paul E. McKenney
2018-12-03 7:44 ` He, Bo
2018-12-03 13:56 ` Paul E. McKenney
2018-12-04 7:50 ` He, Bo
2018-12-04 19:49 ` Paul E. McKenney
2018-12-05 8:42 ` He, Bo
2018-12-05 17:44 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A16C46@SHSMSX104.ccr.corp.intel.com>
2018-12-06 17:38 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A180C5@SHSMSX104.ccr.corp.intel.com>
2018-12-07 14:11 ` Paul E. McKenney
2018-12-09 19:56 ` Paul E. McKenney
2018-12-10 6:56 ` He, Bo
2018-12-11 0:38 ` Paul E. McKenney
2018-12-11 4:46 ` Paul E. McKenney
2018-12-11 5:29 ` He, Bo
2018-12-12 1:37 ` He, Bo
2018-12-12 2:24 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A192C3@SHSMSX104.ccr.corp.intel.com>
2018-12-12 15:42 ` Paul E. McKenney
2018-12-12 21:03 ` Paul E. McKenney
2018-12-12 23:13 ` He, Bo
2018-12-13 0:12 ` Paul E. McKenney
2018-12-13 2:11 ` Zhang, Jun
2018-12-13 2:42 ` Paul E. McKenney
[not found] ` <88DC34334CA3444C85D647DBFA962C2735AD5F9E@SHSMSX104.ccr.corp.intel.com>
2018-12-13 4:40 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A197EC@SHSMSX104.ccr.corp.intel.com>
2018-12-13 18:11 ` Paul E. McKenney
2018-12-14 1:30 ` He, Bo
2018-12-14 2:15 ` Paul E. McKenney
2018-12-14 2:40 ` He, Bo
2018-12-14 5:10 ` Paul E. McKenney
2018-12-14 5:38 ` Paul E. McKenney
2018-12-17 3:15 ` He, Bo
2018-12-17 4:26 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A1A634@SHSMSX104.ccr.corp.intel.com>
2018-12-18 2:46 ` Zhang, Jun
2018-12-18 3:12 ` He, Bo
2018-12-18 5:34 ` Paul E. McKenney
2019-02-13 8:31 ` [tip:core/rcu] rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes() tip-bot for Zhang, Jun
2019-02-13 8:30 ` [tip:core/rcu] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt tip-bot for Zhang, Jun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181129130647.GG4170@linux.ibm.com \
--to=paulmck@linux.ibm.com \
--cc=bo.he@intel.com \
--cc=jiangshanlai@gmail.com \
--cc=jin.xiao@intel.com \
--cc=josh@joshtriplett.org \
--cc=jun.zhang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=rostedt@goodmis.org \
--cc=yanmin.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.