From: Tomasz Chmielewski <tch@virtall.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: kernel crashes with btrfs and busy database IO - how to debug?
Date: Thu, 11 Jun 2015 20:33:41 +0900
Message-ID: <ae9b9ca434c98509ca6c1ba6dbd84b63@admin.virtall.com>
I have a server running a couple of LXC guests on btrfs, which makes it
easy to test things with snapshots. Or so it seemed.
Unfortunately the box crashes when I put "too much IO load" on it, where
"too much" means these two running at the same time:
- a quite busy MySQL database (up to 100% IO wait when running alone)
- a busy mongo database (up to 100% IO wait when running alone)
With both mongo and mysql running at the same time, it crashes after 1-2
days (tried kernels 4.0.4, 4.0.5, 4.1-rc7 from Ubuntu "kernel-ppa"). It
does not crash if I only run mongo, or only mysql. There is plenty of
memory available (just around 2-4 GB used out of 32 GB) when it crashes.
As the box is only reachable remotely, I can't capture the full crash
output. Sometimes I get part of it printed over a remote SSH session,
like here:
[162276.341030] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[162276.341069] IP: [<ffffffff810c06cd>] prepare_to_wait_event+0xcd/0x100
[162276.341096] PGD 80a15e067 PUD 6e08c2067 PMD 0
[162276.341116] Oops: 0002 [#1] SMP
[162276.341133] Modules linked in: xfs libcrc32c xt_conntrack veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw eeepc_wmi gf128mul asus_wmi glue_helper sparse_keymap ablk_helper cryptd ie31200_edac shpchp lpc_ich edac_core mac_hid 8250_fintek tpm_infineon wmi serio_raw video lp parport btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq e1000e raid1 raid0 ahci ptp libahci multipath pps_core linear [last unloaded: xfs]
[162276.341394] CPU: 6 PID: 12853 Comm: mysqld Not tainted 4.1.0-040100rc7-generic #201506080035
[162276.341428] Hardware name: System manufacturer System Product Name/P8B WS, BIOS 0904 10/24/2011
[162276.341463] task: ffff8800730d8a10 ti: ffff88047a0f8000 task.ti: ffff88047a0f8000
[162276.341495] RIP: 0010:[<ffffffff810c06cd>] [<ffffffff810c06cd>] prepare_to_wait_event+0xcd/0x100
[162276.341532] RSP: 0018:ffff88047a0fbcd8 EFLAGS: 00010046
[162276.341583] RDX: ffff88047a0fbd48 RSI: ffff8800730d8a10 RDI: ffff8801e2f96ee8
[162276.341615] RBP: ffff88047a0fbd08 R08: 0000000000000000 R09: 0000000000000001
[162276.341646] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8801e2f96ee8
[162276.341678] R13: 0000000000000002 R14: ffff8801e2f96e60 R15: ffff8806b513f248
[162276.341709] FS: 00007f9f2bbd3700(0000) GS:ffff88082fb80000(0000) knlGS:0000000000000000
Remote syslog does not capture anything.
The above crash does not point directly at btrfs, although the box does
not crash when the same tests are run on ext4. The box passes memtests
and is generally stable otherwise.
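As a reading aid for the partial trace above, here is a minimal Python
sketch that decodes the x86 page-fault error code printed after "Oops:"
and splits the "symbol+offset/size" notation from the IP line. The helper
names decode_oops_code and parse_symbol are made up for this example and
are not part of any kernel tool, and only the three low bits of the error
code are interpreted.

```python
import re

# x86 page-fault error code bits (the hex value printed after "Oops:").
# Only the three low bits are handled in this sketch.
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access, 1: write access
PF_USER = 0x4   # 0: kernel mode, 1: user mode


def decode_oops_code(code):
    """Describe an x86 page-fault error code in words (hypothetical helper)."""
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


def parse_symbol(text):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size)."""
    m = re.search(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if m is None:
        raise ValueError("no symbol+offset/size found in %r" % text)
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


# Values copied from the trace in this message.
oops_line = "[162276.341116] Oops: 0002 [#1] SMP"
code = int(re.search(r"Oops: ([0-9a-f]{4})", oops_line).group(1), 16)
info = decode_oops_code(code)

rip = 0xffffffff810c06cd
symbol, offset, size = parse_symbol("prepare_to_wait_event+0xcd/0x100")
func_start = rip - offset  # address where the function begins

print(info)
print(symbol, hex(offset), hex(size), hex(func_start))
```

For this trace the sketch reports a kernel-mode write to a not-present
page, which is consistent with the NULL pointer dereference at address
0x8, and it places the start of prepare_to_wait_event at
0xffffffff810c0600 (0xcd bytes before the faulting instruction, in a
function 0x100 bytes long).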
How can I debug this further?
In the 4.1-rc7 kernel, "prepare_to_wait_event" appears in these places:
include/linux/wait.h: long __int = prepare_to_wait_event(&wq, &__wait, state);\
include/linux/wait.h:long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state);
kernel/sched/wait.c:long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state)
kernel/sched/wait.c:EXPORT_SYMBOL(prepare_to_wait_event);
--
Tomasz Chmielewski
http://wpkg.org