From: Theodore Ts'o <tytso@mit.edu>
To: linux-ext4@vger.kernel.org, linux-aio@kvack.org
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Zach Brown <zab@zabbo.net>, Benjamin LaHaise <bcrl@kvack.org>
Subject: RCU related deadlock in AIO? (in 3.18-rc2)
Date: Wed, 29 Oct 2014 07:06:39 -0400 [thread overview]
Message-ID: <nsxk33jjfsw.fsf@closure.thunk.org> (raw)
I have a vague memory of seeing discussion about a stack trace that
looks similar to this in the past few weeks. Does this ring a bell at
all?
The following hang happened while running xfstests generic/323 under
ext4 in no-journal mode, running on a 3.18-rc2 kernel. This is a
recently added test, to deal with a proble which I thought had already
been fixed in mainline.
# Run aio-last-ref-held-by-io - last put of ioctx not in process
# context. We've had a couple of instances in the past where having the
# last reference to an ioctx be held by the IO (instead of the
# process) would cause problems (hung system, crashes).
Thanks,
- Ted
generic/323 123s ... [06:42:18][30020.393779] ------------[ cut here ]------------
[30020.394962] WARNING: CPU: 1 PID: 0 at /usr/projects/linux/ext4/lib/percpu-refcount.c:151 percpu_ref_switch_to_atomic_rcu+0x9f/0xbc()
[30020.397073] percpu ref (free_ioctx_reqs) <= 0 (-65534) after switching to atomic
[30020.397073] Modules linked in:
[30020.397073] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.18.0-rc2-00005-gb790865 #2277
[30020.401678] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[30020.402522] 00000000 00000000 f553ff28 c088aba8 f553ff54 f553ff44 c0170af6 00000097
[30020.406004] c046cf41 df614874 ca3cd9f8 7fff0004 f553ff5c c0170b38 00000009 f553ff54
[30020.408552] c0b9dcf8 f553ff70 f553ff88 c046cf41 c0b9dc8d 00000097 c0b9dcf8 c028233d
[30020.410484] Call Trace:
[30020.410828] [<c088aba8>] dump_stack+0x48/0x60
[30020.411578] [<c0170af6>] warn_slowpath_common+0x6c/0x83
[30020.412701] [<c046cf41>] ? percpu_ref_switch_to_atomic_rcu+0x9f/0xbc
[30020.413541] [<c0170b38>] warn_slowpath_fmt+0x2b/0x2f
[30020.413715] [<c046cf41>] percpu_ref_switch_to_atomic_rcu+0x9f/0xbc
[30020.414267] [<c028233d>] ? rcu_lock_acquire+0x1c/0x1c
[30020.415101] [<c01b767b>] rcu_process_callbacks+0x2b5/0x433
[30020.415763] [<c01b767b>] ? rcu_process_callbacks+0x2b5/0x433
[30020.417024] [<c046cea2>] ? percpu_ref_reinit+0x50/0x50
[30020.417273] [<c0173f8f>] ? __local_bh_disable_ip+0x6e/0x6e
[30020.417959] [<c01740fc>] __do_softirq+0x16d/0x376
[30020.418688] [<c0173f8f>] ? __local_bh_disable_ip+0x6e/0x6e
[30020.419324] [<c01381ec>] do_softirq_own_stack+0x26/0x2c
[30020.420372] <IRQ> [<c01744f9>] irq_exit+0x42/0x8f
[30020.420796] [<c0137b94>] do_IRQ+0x89/0x9d
[30020.421606] [<c08943f1>] common_interrupt+0x31/0x38
[30020.422162] [<c013007b>] ? add_atomic_switch_msr+0x173/0x173
[30020.422850] [<c0164b9c>] ? native_safe_halt+0x5/0x7
[30020.423696] [<c013dee4>] default_idle+0x22/0x4b
[30020.424321] [<c013e49a>] arch_cpu_idle+0xe/0x10
[30020.424967] [<c019b7f0>] cpu_startup_entry+0x130/0x250
[30020.425604] [<c015a3a3>] start_secondary+0x15d/0x166
[30020.426333] ---[ end trace 4f7947d25059a4b4 ]---
[30240.260397] INFO: task aio-last-ref-he:9183 blocked for more than 120 seconds.[30240.261418] Tainted: G W 3.18.0-rc2-00005-gb790865 #2277
[30240.262399] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[30240.263575] d7e99e78 00000046 c0f6b280 c0f6b280 a9372026 00001b4d cdfc5800 00001b4d
[30240.264979] d9758350 f6250880 d7e99e60 c0192666 000396f0 00000000 a9371bab 00001b4d
[30240.266194] a9371cc6 00001b4d a9371bab 00001b4d d97588c4 c0f6b880 00000001 d7e99e8c
[30240.267449] Call Trace:
[30240.267795] [<c0192666>] ? sched_clock_local+0x10/0x10e
[30240.268515] [<c019298f>] ? sched_clock_cpu+0xd1/0xec
[30240.269207] [<c0192666>] ? sched_clock_local+0x10/0x10e
[30240.269929] [<c088fdaf>] schedule+0x5a/0x5c
[30240.270530] [<c0892b2b>] schedule_timeout+0x1a/0x9c
[30240.271245] [<c01929f4>] ? local_clock+0x18/0x22
[30240.271890] [<c01a0085>] ? lock_release_holdtime.part.25+0x60/0x6d
[30240.272736] [<c089343b>] ? _raw_spin_unlock_irq+0x27/0x36
[30240.273480] [<c01a1a20>] ? trace_hardirqs_on_caller+0x15f/0x17a
[30240.274329] [<c01a1a46>] ? trace_hardirqs_on+0xb/0xd
[30240.275054] [<c0893440>] ? _raw_spin_unlock_irq+0x2c/0x36
[30240.275798] [<c0890584>] __wait_for_common+0xcb/0xf5
[30240.276485] [<c0892b11>] ? console_conditional_schedule+0x29/0x29
[30240.277343] [<c01909aa>] ? wake_up_state+0x11/0x11
[30240.278016] [<c08905c7>] wait_for_completion+0x19/0x1c
[30240.278727] [<c0282f60>] SYSC_io_destroy+0x7a/0xaf
[30240.279409] [<c08904ea>] ? __wait_for_common+0x31/0xf5
[30240.280116] [<c02837dc>] SyS_io_destroy+0x10/0x12
[30240.280815] [<c046895a>] ? trace_hardirqs_on_thunk+0xc/0x10
[30240.281583] [<c02837dc>] ? SyS_io_destroy+0x10/0x12
[30240.282271] [<c0893e2a>] syscall_call+0x7/0x7
[30240.282881] [<c0890000>] ? schedule_preempt_disabled+0xd/0x1d
[30240.283692] no locks held by aio-last-ref-he/9183.
next reply other threads:[~2014-10-29 11:06 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-29 11:06 Theodore Ts'o [this message]
2014-10-29 12:57 ` RCU related deadlock in AIO? (in 3.18-rc2) Paul E. McKenney
2014-10-29 16:21 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nsxk33jjfsw.fsf@closure.thunk.org \
--to=tytso@mit.edu \
--cc=bcrl@kvack.org \
--cc=linux-aio@kvack.org \
--cc=linux-ext4@vger.kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=zab@zabbo.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.