* Ext4 stuck at wait_transaction_locked
@ 2016-08-26 6:52 Roy Yang
2016-08-27 13:36 ` Theodore Ts'o
0 siblings, 1 reply; 6+ messages in thread
From: Roy Yang @ 2016-08-26 6:52 UTC (permalink / raw)
To: linux-ext4
Hi Ext4 experts,
I need your help debugging an ext4 issue. We consistently see ext4
stuck in wait_transaction_locked after another process is killed by
the cgroup OOM killer. We have two processes continuously writing data
to the same disk; when one of them is OOM-killed, the other soon
stalls on all of its I/O operations.
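For context, a minimal sketch of the kind of writer workload involved
is shown below (illustrative only; it assumes buffered writes, and the
file path and buffer size are placeholders, not our actual code):

/*
 * One writer process: append data to a file on the ext4 filesystem in a
 * loop.  Two such processes write to the same disk, and one of them runs
 * inside a memory-limited cgroup that eventually OOM-kills it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	static char buf[1 << 20];	/* 1 MiB of dummy data */
	int fd;

	/* Placeholder path on the ext4 filesystem backed by /dev/sdb1. */
	fd = open("/mnt/sdb1/writer.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(buf, 'x', sizeof(buf));

	for (;;) {
		/* Buffered writes: dirty pages are later flushed by the
		 * writeback worker, which is where the hung kworker in the
		 * traces below is stuck. */
		if (write(fd, buf, sizeof(buf)) < 0) {
			perror("write");
			break;
		}
	}
	close(fd);
	return 0;
}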
After searching online, I found a couple of similar reports, but I am
not sure whether this is a known issue.
Below are the system information and part of the dmesg output; I also
have a vmcore generated by sysrq+c.
[tmp]$ uname -a
Linux sedhaswell04-node-1 3.10.0-327.22.2.el7.cohesity.x86_64 #1 SMP
Tue Jul 5 12:41:09 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
[tmp]$ cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)
[96766.113014] Killed process 81593 (java) total-vm:19626048kB,
anon-rss:1424920kB, file-rss:5144kB
[97047.933383] INFO: task jbd2/sdb1-8:3229 blocked for more than 120 seconds.
[97047.941093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[97047.949879] jbd2/sdb1-8 D ffff88103a587000 0 3229 2 0x00000080
[97047.949883] ffff88104ad1bc90 0000000000000046 ffff88085129d080
ffff88104ad1bfd8
[97047.949886] ffff88104ad1bfd8 ffff88104ad1bfd8 ffff88085129d080
ffff88104ad1bda0
[97047.949888] ffff881040d2e0c0 ffff88085129d080 ffff88104ad1bd88
ffff88103a587000
[97047.949891] Call Trace:
[97047.949899] [<ffffffff8163b7f9>] schedule+0x29/0x70
[97047.949924] [<ffffffffa0201118>]
jbd2_journal_commit_transaction+0x248/0x19a0 [jbd2]
[97047.949932] [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[97047.949935] [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
[97047.949941] [<ffffffff8108d6fe>] ? try_to_del_timer_sync+0x5e/0x90
[97047.949947] [<ffffffffa0206d79>] kjournald2+0xc9/0x260 [jbd2]
[97047.949949] [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
[97047.949953] [<ffffffffa0206cb0>] ? commit_timeout+0x10/0x10 [jbd2]
[97047.949957] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[97047.949959] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[97047.949963] [<ffffffff816467d8>] ret_from_fork+0x58/0x90
[97047.949965] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[97047.950005] INFO: task kworker/u386:2:69088 blocked for more than
120 seconds.
[97047.958096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[97047.966869] kworker/u386:2 D ffff88104ab27120 0 69088 2 0x00000080
[97047.966878] Workqueue: writeback bdi_writeback_workfn (flush-8:16)
[97047.966880] ffff880b8b6638e8 0000000000000046 ffff8810482be780
ffff880b8b663fd8
[97047.966897] ffff880b8b663fd8 ffff880b8b663fd8 ffff8810482be780
ffff881040d2e000
[97047.966900] ffff881040d2e078 00000000000fb086 ffff88103a587000
ffff88104ab27120
[97047.966902] Call Trace:
[97047.966905] [<ffffffff8163b7f9>] schedule+0x29/0x70
[97047.966911] [<ffffffffa01fe085>] wait_transaction_locked+0x85/0xd0 [jbd2]
[97047.966913] [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
[97047.966917] [<ffffffffa01fe400>] start_this_handle+0x2b0/0x5d0 [jbd2]
[97047.966925] [<ffffffffa02c8f03>] ? _scsih_qcmd+0x2d3/0x420 [mpt3sas]
[97047.966928] [<ffffffff811c153a>] ? kmem_cache_alloc+0x1ba/0x1d0
[97047.966932] [<ffffffffa01fe933>] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[97047.966946] [<ffffffffa0377fc4>] ? ext4_writepages+0x434/0xd60 [ext4]
[97047.966956] [<ffffffffa03a8ea9>] __ext4_journal_start_sb+0x69/0xe0 [ext4]
[97047.966962] [<ffffffffa0377fc4>] ext4_writepages+0x434/0xd60 [ext4]
[97047.966967] [<ffffffff81174c38>] ? generic_writepages+0x58/0x80
[97047.966970] [<ffffffff81175cde>] do_writepages+0x1e/0x40
[97047.966972] [<ffffffff81208a20>] __writeback_single_inode+0x40/0x220
[97047.966974] [<ffffffff8120948e>] writeback_sb_inodes+0x25e/0x420
[97047.966976] [<ffffffff812096ef>] __writeback_inodes_wb+0x9f/0xd0
[97047.966978] [<ffffffff81209f33>] wb_writeback+0x263/0x2f0
[97047.966980] [<ffffffff810a0d96>] ? set_worker_desc+0x86/0xb0
[97047.966982] [<ffffffff8120c005>] bdi_writeback_workfn+0x115/0x460
[97047.966985] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[97047.966986] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[97047.966988] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[97047.966990] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[97047.966992] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[97047.966994] [<ffffffff816467d8>] ret_from_fork+0x58/0x90
[97047.966996] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[97047.967003] INFO: task bridge_exec:79205 blocked for more than 120 seconds.
[97047.974801] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
From the crash dump, all processes stuck in the UN (uninterruptible) state have similar backtraces:
PID: 69088 TASK: ffff8810482be780 CPU: 13 COMMAND: "kworker/u386:2"
#0 [ffff880b8b663888] __schedule at ffffffff8163b15d
#1 [ffff880b8b6638f0] schedule at ffffffff8163b7f9
#2 [ffff880b8b663900] wait_transaction_locked at ffffffffa01fe085 [jbd2]
#3 [ffff880b8b663958] start_this_handle at ffffffffa01fe400 [jbd2]
#4 [ffff880b8b6639f8] jbd2__journal_start at ffffffffa01fe933 [jbd2]
#5 [ffff880b8b663a40] __ext4_journal_start_sb at ffffffffa03a8ea9 [ext4]
#6 [ffff880b8b663a80] ext4_writepages at ffffffffa0377fc4 [ext4]
#7 [ffff880b8b663bb8] do_writepages at ffffffff81175cde
#8 [ffff880b8b663bc8] __writeback_single_inode at ffffffff81208a20
#9 [ffff880b8b663c08] writeback_sb_inodes at ffffffff8120948e
#10 [ffff880b8b663cb0] __writeback_inodes_wb at ffffffff812096ef
#11 [ffff880b8b663cf8] wb_writeback at ffffffff81209f33
#12 [ffff880b8b663d70] bdi_writeback_workfn at ffffffff8120c005
#13 [ffff880b8b663e20] process_one_work at ffffffff8109d5fb
#14 [ffff880b8b663e68] worker_thread at ffffffff8109e3cb
#15 [ffff880b8b663ec8] kthread at ffffffff810a5aef
#16 [ffff880b8b663f50] ret_from_fork at ffffffff816467d8
PID: 79204 TASK: ffff88084b050b80 CPU: 21 COMMAND: "bridge_exec"
#0 [ffff880d7795fc70] __schedule at ffffffff8163b15d
#1 [ffff880d7795fcd8] schedule at ffffffff8163b7f9
#2 [ffff880d7795fce8] do_exit at ffffffff81081504
#3 [ffff880d7795fd78] do_group_exit at ffffffff81081dff
#4 [ffff880d7795fda8] get_signal_to_deliver at ffffffff81092c10
#5 [ffff880d7795fe40] do_signal at ffffffff81014417
#6 [ffff880d7795ff30] do_notify_resume at ffffffff81014adf
#7 [ffff880d7795ff50] int_signal at ffffffff81646b3d
RIP: 00007fb2e6b2a9b9 RSP: 00007fff14cbc3d8 RFLAGS: 00000246
RAX: fffffffffffffe00 RBX: 00000000048e2020 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000489bcd8
RBP: 00007fff14cbc420 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000048e2258
R13: 000000000471d898 R14: 00000000048e2188 R15: 000000000489bcc0
ORIG_RAX: 00000000000000ca CS: 0033 SS: 002b
Any advice or suggestions would be appreciated!
Best Regards,
Roy
* Re: Ext4 stuck at wait_transaction_locked
2016-08-26 6:52 Ext4 stuck at wait_transaction_locked Roy Yang
@ 2016-08-27 13:36 ` Theodore Ts'o
2016-08-27 14:45 ` Roy Yang
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2016-08-27 13:36 UTC (permalink / raw)
To: Roy Yang; +Cc: linux-ext4
On Thu, Aug 25, 2016 at 11:52:07PM -0700, Roy Yang wrote:
> I need your help debugging an ext4 issue. We consistently see ext4
> stuck in wait_transaction_locked after another process is killed by
> the cgroup OOM killer. We have two processes continuously writing data
> to the same disk; when one of them is OOM-killed, the other soon
> stalls on all of its I/O operations.
You're using an ancient, 3.10-based RHEL 7 kernel:
> Linux sedhaswell04-node-1 3.10.0-327.22.2.el7.cohesity.x86_64 #1 SMP
> Tue Jul 5 12:41:09 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
As far as I know this bug does not exist in the upstream kernel ---
but the 3.10 kernel was released in June 2013, and since then changes
are the responsibility of Red Hat / CentOS. So you would need to get
support from Red Hat, since they have made a huge number of changes to
the kernel.
If you had given us the stack trace from the task that got OOM-killed,
we might be able to take a quick look, but if you use a distribution
kernel, it is the responsibility of the distribution to support you
--- this is, after all, why they get paid the big bucks. :-)
If you want to use the latest upstream kernel, we would be much
more likely to help, although of course unlike Red Hat we don't have
any kind of guaranteed response time. For that, you would need to go
find a distribution and pay the aforementioned big bucks. :-)
Cheers,
- Ted
* Re: Ext4 stuck at wait_transaction_locked
2016-08-27 13:36 ` Theodore Ts'o
@ 2016-08-27 14:45 ` Roy Yang
2016-08-27 16:14 ` Theodore Ts'o
0 siblings, 1 reply; 6+ messages in thread
From: Roy Yang @ 2016-08-27 14:45 UTC (permalink / raw)
To: Theodore Ts'o, linux-ext4
Hi Ted,
Thank you very much for the reply.
The stack trace of the process that was killed is:
[94346.777541] java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[94346.777548] java cpuset=/ mems_allowed=0-1
[94346.777551] CPU: 11 PID: 18564 Comm: java Not tainted
3.10.0-327.22.2.el7..x86_64 #1
[94346.777553] Hardware name: Intel Corporation S2600TP/S2600TP, BIOS
SE5C610.86B.01.01.0016.033120161139 03/31/2016
[94346.777555] ffff88104cb7a280 000000003ccafc27 ffff881046c77cd0
ffffffff816360f4
[94346.777563] ffff881046c77d60 ffffffff8163108f ffff881046c77d18
0000000000000297
[94346.777567] ffffea003fff1fc0 ffff881046c77d30 ffffffff8116cd26
ffff8810491110e8
[94346.777572] Call Trace:
[94346.777584] [<ffffffff816360f4>] dump_stack+0x19/0x1b
[94346.777588] [<ffffffff8163108f>] dump_header+0x8e/0x214
[94346.777595] [<ffffffff8116cd26>] ? find_lock_task_mm+0x56/0xc0
[94346.777598] [<ffffffff8116d1be>] oom_kill_process+0x24e/0x3b0
[94346.777605] [<ffffffff81088dae>] ? has_capability_noaudit+0x1e/0x30
[94346.777611] [<ffffffff811d40e5>] mem_cgroup_oom_synchronize+0x575/0x5a0
[94346.777614] [<ffffffff811d34b0>] ? mem_cgroup_charge_common+0xc0/0xc0
[94346.777617] [<ffffffff8116da34>] pagefault_out_of_memory+0x14/0x90
[94346.777620] [<ffffffff8162f4bf>] mm_fault_error+0x68/0x12b
[94346.777625] [<ffffffff81642012>] __do_page_fault+0x3e2/0x450
[94346.777628] [<ffffffff816420a3>] do_page_fault+0x23/0x80
[94346.777633] [<ffffffff8163e308>] page_fault+0x28/0x30
[94346.777636] Task in /system.slice/_elasticsearch.scope killed as a
result of limit of /system.slice/_elasticsearch.scope
[94346.777639] memory: usage 1433600kB, limit 1433600kB, failcnt 45132
[94346.777640] memory+swap: usage 1433600kB, limit 9007199254740991kB, failcnt 0
[94346.777642] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
[94346.777643] Memory cgroup stats for
/system.slice/_elasticsearch.scope: cache:148KB rss:1433452KB
rss_huge:1126400KB mapped_file:4KB swap:0KB inactive_anon:0KB
active_anon:1433452KB inactive_file:108KB active_file:20KB
unevictable:0KB
We adjusted the memory limit so that this process is no longer killed,
and the problem went away. If possible, we would still like to
understand why ext4 gets stuck in this case.
Thank you,
Roy
On Sat, Aug 27, 2016 at 6:36 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Aug 25, 2016 at 11:52:07PM -0700, Roy Yang wrote:
>> I need your help debugging an ext4 issue. We consistently see ext4
>> stuck in wait_transaction_locked after another process is killed by
>> the cgroup OOM killer. We have two processes continuously writing data
>> to the same disk; when one of them is OOM-killed, the other soon
>> stalls on all of its I/O operations.
>
> You're using an ancient, 3.10-based RHEL 7 kernel:
>
>> Linux sedhaswell04-node-1 3.10.0-327.22.2.el7.cohesity.x86_64 #1 SMP
>> Tue Jul 5 12:41:09 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>
> As far as I know this bug does not exist in the upstream kernel ---
> but the 3.10 kernel was released in June 2013, and since then changes
> are the responsibility of Red Hat / CentOS. So you would need to get
> support from Red Hat, since they have made a huge number of changes to
> the kernel.
>
> If you had given us the stack trace from the task that got OOM-killed,
> we might be able to take a quick look, but if you use a distribution
> kernel, it is the responsibility of the distribution to support you
> --- this is, after all, why they get paid the big bucks. :-)
>
> If you want to use the latest upstream kernel, we would be much
> more likely to help, although of course unlike Red Hat we don't have
> any kind of guaranteed response time. For that, you would need to go
> find a distribution and pay the aforementioned big bucks. :-)
>
> Cheers,
>
> - Ted
>
* Re: Ext4 stuck at wait_transaction_locked
2016-08-27 14:45 ` Roy Yang
@ 2016-08-27 16:14 ` Theodore Ts'o
[not found] ` <CAFz9U7RxRLLB7dK7YtEVY9qWKKnQ-AbwpL0MT98jx3urbafJmA@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2016-08-27 16:14 UTC (permalink / raw)
To: Roy Yang; +Cc: linux-ext4
On Sat, Aug 27, 2016 at 07:45:57AM -0700, Roy Yang wrote:
> Hi Ted,
> Thank you very much for the reply.
> The stack trace of the process that was killed is:
Hmm, are you sure this is the only process that got oom-killed? What
I was expecting was a process with a stack trace that showed it was in
the middle of ext4 or jbd2 code at the time when it tried to do a
memory allocation and this either caused an OOM kill, or an EXT4-fs
error report when a memory allocation failed.
> We adjusted the memory limit so that this process is no longer killed,
> and the problem went away. If possible, we would still like to
> understand why ext4 gets stuck in this case.
What is probably happening is that a memory allocation failed in the
middle of an ext4 operation, and it was holding a jbd2 handle at the
time of the failure. The error handling path didn't release the
handle, and so the next time the jbd2 subsystem tried to do a
transaction commit, it waited for all of the handles to complete, and
since there was a handle leak due to the allocation failure / OOM
kill, it caused the jbd2 commit code to wait forever.
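To make that failure mode concrete, here is a schematic sketch of the
leak pattern (an illustration only, not actual ext4/jbd2 code from this
or any kernel; the function name and the allocation are placeholders):

/*
 * Schematic: a journalled operation that forgets to stop its handle on
 * an allocation-failure path.
 */
#include <linux/err.h>
#include <linux/jbd2.h>
#include <linux/slab.h>

static int example_metadata_update(journal_t *journal)
{
	handle_t *handle;
	void *scratch;

	/* Join (or start) the currently running transaction. */
	handle = jbd2_journal_start(journal, 1);
	if (IS_ERR(handle))
		return PTR_ERR(handle);

	scratch = kmalloc(PAGE_SIZE, GFP_NOFS);
	if (!scratch)
		return -ENOMEM;	/* BUG: the handle is never stopped, so the
				 * transaction's update count never drops to
				 * zero.  kjournald2 then blocks inside
				 * jbd2_journal_commit_transaction(), and every
				 * new writer queues up in
				 * wait_transaction_locked(), as in the traces
				 * above. */

	/* ... modify metadata under the handle ... */

	kfree(scratch);
	return jbd2_journal_stop(handle);	/* the correct path releases the handle */
}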
If that was the only process that got OOM killed, were there any ext4
error or warning messages before the OOM kill, or maybe slightly after
it?
- Ted
* Re: Ext4 stuck at wait_transaction_locked
[not found] ` <CAFz9U7RxRLLB7dK7YtEVY9qWKKnQ-AbwpL0MT98jx3urbafJmA@mail.gmail.com>
@ 2016-08-27 19:02 ` Theodore Ts'o
2016-08-27 20:12 ` Roy Yang
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2016-08-27 19:02 UTC (permalink / raw)
To: Roy Yang; +Cc: linux-ext4
On Sat, Aug 27, 2016 at 09:29:51AM -0700, Roy Yang wrote:
> Yes, that is the process that was killed by the OOM killer; it
> exceeded the 1.4 GB memory limit.
> The full dmesg output and the backtraces of the processes stuck in
> ext4 start_this_handle/wait_transaction_locked are attached.
Yeah, unfortunately it's not helpful. My guess is that it's somewhere
in the call chain of page fault code calling ext4_page_mkwrite() where
we have a bad error handling path, but finding it will be a pain.
Since you have a reliable repro, would you be willing to try and
reproduce the problem with the latest upstream kernel? If you can
reproduce it there, I'll promise to help you instrument the kernel so
we can track down the problem.
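As a rough idea of what such instrumentation could look like (just a
sketch, not a patch anyone has posted in this thread): jbd2 records the
task's running handle in current->journal_info, so a handle leaked by
an error path can be detected once the operation has returned.

#include <linux/bug.h>
#include <linux/jbd2.h>

/* Warn once if a jbd2 handle is still attached to the current task. */
static inline void warn_on_leaked_handle(const char *where)
{
	/* journal_current_handle() simply returns current->journal_info. */
	WARN_ONCE(journal_current_handle(),
		  "jbd2 handle leaked on return from %s\n", where);
}

/* Possible call site: warn_on_leaked_handle("ext4_page_mkwrite") right
 * after the ->page_mkwrite handler returns in the page-fault path. */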
> I also searched the Red Hat support portal and other online resources.
> A similar issue was reported, but I am not sure whether upstream patch
> 2c869b262a10ca99cb866d04087d75311587a30c, mentioned at
> https://groups.google.com/a/chromium.org/forum/#!topic/chromium-os-checkins/y6n90Mqop7A,
> has addressed the problem.
No, this patch fixes a problem with online resize, which has nothing
to do with what you are seeing.
- Ted
* Re: Ext4 stuck at wait_transaction_locked
2016-08-27 19:02 ` Theodore Ts'o
@ 2016-08-27 20:12 ` Roy Yang
0 siblings, 0 replies; 6+ messages in thread
From: Roy Yang @ 2016-08-27 20:12 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-ext4
Thanks Ted,
I will work with our team to reproduce this with the upstream kernel,
and I will update you once we have results, some time after next week.
Best Regards
Roy
On Sat, Aug 27, 2016 at 12:02 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Sat, Aug 27, 2016 at 09:29:51AM -0700, Roy Yang wrote:
>> Yes, that is the process that was killed by the OOM killer; it
>> exceeded the 1.4 GB memory limit.
>> The full dmesg output and the backtraces of the processes stuck in
>> ext4 start_this_handle/wait_transaction_locked are attached.
>
> Yeah, unfortunately it's not helpful. My guess is that it's somewhere
> in the call chain of page fault code calling ext4_page_mkwrite() where
> we have a bad error handling path, but finding it will be a pain.
>
> Since you have a reliable repro, would you be willing to try and
> reproduce the problem with the latest upstream kernel? If you can
> reproduce it there, I'll promise to help you instrument the kernel so
> we can track down the problem.
>
>> I also searched the Red Hat support portal and other online resources.
>> A similar issue was reported, but I am not sure whether upstream patch
>> 2c869b262a10ca99cb866d04087d75311587a30c, mentioned at
>> https://groups.google.com/a/chromium.org/forum/#!topic/chromium-os-checkins/y6n90Mqop7A,
>> has addressed the problem.
>
> No, this patch fixes a problem with online resize, which has nothing
> to do with what you are seeing.
>
> - Ted