From: Nikolay Borisov <kernel@kyup.com>
To: linux-fsdevel@vger.kernel.org
Cc: "Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
	Jan Kara <jack@suse.com>,
	darrick.wong@oracle.com,
	SiteGround Operations <operations@siteground.com>
Subject: Crash in jbd2_chksum due to null journal->j_chksum_driver
Date: Wed, 30 Sep 2015 16:35:49 +0300
Message-ID: <560BE535.9080604@kyup.com>

Hello, 

Today, while testing, a colleague hit the following crash:

jbd2_journal_bmap: journal block not found at offset 67 on dm-26-8
Aborting journal on device dm-26-8.
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
PGD 3fcef54067 PUD 3fce84e067 PMD 0 
Oops: 0000 [#1] SMP 
Modules linked in: act_police cls_basic sch_ingress veth dm_snapshot openvswitch gre vxlan ip_tunnel xt_owner xt_conntrack iptable_mangle xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_CT nf_conntrack iptable_raw ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ext2 dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio dm_mirror dm_region_hash dm_log ses enclosure igb i2c_algo_bit x86_pkg_temp_thermal crc32_pclmul i2c_i801 lpc_ich mfd_core ioapic ioatdma dca shpchp ipmi_devintf ipmi_si ipmi_msghandler
CPU: 0 PID: 12059 Comm: jbd2/dm-26-8 Not tainted 3.12.47-clouder1 #1
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1 04/14/2015
task: ffff883f904958b0 ti: ffff883fce4d8000 task.ti: ffff883fce4d8000
RIP: 0010:[<ffffffff812b12eb>]  [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
RSP: 0018:ffff883fce4d9a58  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff883f8dd77000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffff883f8dd77000 RDI: ffff883fa0fc6800
RBP: ffff883fce4d9a88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000f0459c0b
R13: 0000000000000411 R14: ffff883f8dd77000 R15: 00000000560bb55d
FS:  0000000000000000(0000) GS:ffff881fffa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000003fd145d000 CR4: 00000000001407f0
Stack:
 ffffffff81e07402 ffff883fa0fc6800 00000000fffffffb ffff883fce4d9b90
 ffff883f8dd77000 ffff883fa0fc6800 ffff883fce4d9aa8 ffffffff812b1369
 0000000000000010 ffff883f90c772d8 ffff883fce4d9ae8 ffffffff812b1455
Call Trace:
 [<ffffffff812b1369>] jbd2_superblock_csum_set+0x29/0x40
 [<ffffffff812b1455>] jbd2_write_superblock+0x85/0x1b0
 [<ffffffff812b1b70>] jbd2_journal_update_sb_errno+0x50/0x60
 [<ffffffff812b1bd0>] __journal_abort_soft+0x50/0x60
 [<ffffffff812b1c80>] jbd2_journal_bmap+0x90/0xa0
 [<ffffffff812b1ec7>] jbd2_journal_next_log_block+0x77/0x80
 [<ffffffff812b1ef3>] jbd2_journal_get_descriptor_buffer+0x23/0xb0
 [<ffffffff812aa02c>] journal_submit_commit_record+0x7c/0x1e0
 [<ffffffff812abade>] jbd2_journal_commit_transaction+0x194e/0x1d20
 [<ffffffff812b062f>] kjournald2+0xef/0x2b0
 [<ffffffff810aef00>] ? wake_up_bit+0x40/0x40
 [<ffffffff812b0540>] ? commit_timeout+0x10/0x10
 [<ffffffff810ae48e>] kthread+0xce/0xe0
 [<ffffffff810ae3c0>] ? kthread_freezable_should_stop+0x80/0x80
 [<ffffffff816571c8>] ret_from_fork+0x58/0x90
 [<ffffffff810ae3c0>] ? kthread_freezable_should_stop+0x80/0x80
Code: 55 48 89 e5 41 54 53 48 83 ec 20 0f 1f 44 00 00 44 8b a6 fc 00 00 00 48 89 f3 c7 86 fc 00 00 00 00 00 00 00 48 8b 87 d0 04 00 00 <83> 38 04 77 39 48 89 45 d0 c7 45 d8 00 00 00 00 48 8d 7d d0 c7 
RIP  [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
 RSP <ffff883fce4d9a58>
CR2: 0000000000000000
---[ end trace e1bd94031f410b71 ]---

The faulting address, ffffffff812b12eb, is actually inside jbd2_chksum()
(inlined into jbd2_superblock_csum()), and the instruction performing the
NULL dereference belongs to crypto_shash_descsize(). In other words,
journal->j_chksum_driver is NULL.
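
To make the failing access concrete, here is a minimal userspace sketch of
the pattern. It is not the kernel source: the struct names and layouts
(fake_shash, fake_journal) and the guarded helper are hypothetical, and only
the field name j_chksum_driver and the fact that the read goes through that
pointer come from the report above. The comments also give my reading of the
bytes in the Code: line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the crypto transform. descsize is placed first
 * because the faulting instruction reads a 32-bit value at offset 0. */
struct fake_shash {
	unsigned int descsize;
};

/* Hypothetical stand-in for journal_t; only the field named in the report. */
struct fake_journal {
	struct fake_shash *j_chksum_driver;
};

/*
 * Models the access that faults. In the oops, the bytes just before the
 * marker, 48 8b 87 d0 04 00 00, decode to "mov 0x4d0(%rdi),%rax" (load the
 * driver pointer out of the journal), and the marked bytes <83> 38 04
 * decode to "cmpl $0x4,(%rax)". With RAX == 0 that read hits address 0,
 * which matches CR2.
 */
static unsigned int descsize_unchecked(const struct fake_journal *j)
{
	return j->j_chksum_driver->descsize; /* faults if the driver is NULL */
}

/* Guarded variant: report failure instead of dereferencing NULL. */
static int descsize_checked(const struct fake_journal *j, unsigned int *out)
{
	assert(out != NULL);
	if (j == NULL || j->j_chksum_driver == NULL)
		return -1;
	*out = descsize_unchecked(j);
	return 0;
}
```

Calling descsize_checked() on a journal whose j_chksum_driver is NULL returns
-1 and leaves *out untouched, whereas descsize_unchecked() would fault the
same way the oops does.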

Here is how we got into this situation: we have an LVM thin volume with
an ext4 filesystem on it and a container started from that volume.
While the container is running, we invoke the following command to
scrub the volume's contents:

openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt </dev/zero | dd bs=64K of=/dev/volumegroupname/volumename


Then, when we try to umount the volume, we get the crash above.
Since we have overwritten the on-disk contents, jbd2_journal_bmap fails,
which triggers a journal abort. The abort wants to update the on-disk
errno, which in turn triggers a superblock checksum regeneration,
and this goes BOOM.
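
The sequence of events can be modelled with a small userspace sketch. All
mock_* names, the trace buffer, and the -1 return value are hypothetical;
the step order simply mirrors the call trace above (bmap, abort, errno
update, superblock write, checksum), and where the real kernel oopses the
mock just returns an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_STEPS 8

/* Hypothetical trace buffer recording which mock step ran, in order. */
static const char *steps[MAX_STEPS];
static int nsteps;

static void record(const char *name)
{
	assert(name != NULL);
	if (nsteps < MAX_STEPS)
		steps[nsteps++] = name;
}

/* Hypothetical journal state: only what the scenario needs. */
struct mock_journal {
	const void *chksum_driver; /* NULL models the missing driver */
	int aborted;
};

/* The step that crashes in the report; the mock returns -1 instead. */
static int mock_superblock_csum(struct mock_journal *j)
{
	record("superblock_csum");
	return j->chksum_driver ? 0 : -1;
}

static int mock_write_superblock(struct mock_journal *j)
{
	record("write_superblock");
	return mock_superblock_csum(j);
}

static int mock_update_sb_errno(struct mock_journal *j)
{
	record("update_sb_errno");
	return mock_write_superblock(j);
}

static int mock_abort(struct mock_journal *j)
{
	record("abort");
	j->aborted = 1;
	return mock_update_sb_errno(j);
}

/* mapped == 0 models the "journal block not found" case. */
static int mock_bmap(struct mock_journal *j, int mapped)
{
	record("bmap");
	if (mapped)
		return 0;
	return mock_abort(j);
}
```

With a journal whose chksum_driver is NULL, mock_bmap(&j, 0) walks all five
steps and fails in the last one, which is the point where the real code
dereferences the NULL driver.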

I looked through the code but couldn't find a code path
that allows the checksum driver to become NULL at runtime.
