From: Alex Elder <elder@dreamhost.com>
To: Travis Rhoden <trhoden@gmail.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: kernel crash from RBD in Ubuntu 12.04
Date: Tue, 19 Jun 2012 13:45:23 -0500
Message-ID: <4FE0C8C3.9020603@dreamhost.com>
In-Reply-To: <CACkq2mpAZ9-g+Y2peGhP7S2nVCY+j+TCJ9jK6Yu1jMBAuxpY+g@mail.gmail.com>

On 06/19/2012 01:32 PM, Travis Rhoden wrote:
> Hey folks,
> 
> Ran into this today.  Not sure what I did wrong.  =)

It appears you are running Linux 3.2.0.  The symptoms you describe
could be explained by a bug that has since been fixed in newer Ceph
code.  Specifically, I think the fix below is the relevant one;
without it, you might see something like what you hit:

    rbd: don't drop the rbd_id too early

https://github.com/ceph/ceph-client/commit/32eec68d2f233e8a6ae1cd326022f6862e2b9ce3


					-Alex

> I had an RBD successfully mounted and was done with it.  Proceeded to
> do the following:
> 
> root@spcnode2:~# ls /sys/bus/rbd/devices/
> 0
> root@spcnode2:~# echo 0 > /sys/bus/rbd/remove
> root@spcnode2:~# ls /sys/bus/rbd/devices/      <--- At this point, I
> believe the RBD has been successfully removed
> 
> ----  About an hour passes where I am messing with my ceph cluster.
> No other commands are run on this machine ----
> ----  New cluster is up.  Time to mount my new RBD
> 
> root@spcnode2:~# echo "10.55.30.0,10.55.30.1,10.55.30.2
> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
> perftest" | tee /sys/bus/rbd/add
> 10.55.30.0,10.55.30.1,10.55.30.2
> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
> perftest
> Segmentation fault
> 
> Well that's ugly.  What's in syslog?
> 
> Jun 19 11:16:56 spcnode2 kernel: [76564.387890] ------------[ cut here
> ]------------
> Jun 19 11:16:56 spcnode2 kernel: [76564.392569] WARNING: at
> /build/buildd/linux-3.2.0/fs/sysfs/inode.c:324
> sysfs_hash_and_remove+0xa9/0xb0()
> Jun 19 11:16:56 spcnode2 kernel: [76564.402233] Hardware name: Relion 1702
> Jun 19 11:16:56 spcnode2 kernel: [76564.406079] sysfs: can not remove
> 'bdi', no directory
> Jun 19 11:16:56 spcnode2 kernel: [76564.411268] Modules linked in: rbd
> libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
> xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack
> iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4
> ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables
> kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd
> fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm
> ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
> libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q
> garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma
> usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate
> libcrc32c
> Jun 19 11:16:56 spcnode2 kernel: [76564.477972] Pid: 6924, comm: bash
> Tainted: G      D W    3.2.0-25-generic #40-Ubuntu
> Jun 19 11:16:56 spcnode2 kernel: [76564.485837] Call Trace:
> Jun 19 11:16:56 spcnode2 kernel: [76564.488394]  [<ffffffff810672af>]
> warn_slowpath_common+0x7f/0xc0
> Jun 19 11:16:56 spcnode2 kernel: [76564.494511]  [<ffffffff810673a6>]
> warn_slowpath_fmt+0x46/0x50
> Jun 19 11:16:56 spcnode2 kernel: [76564.500348]  [<ffffffff81192958>]
> ? iput_final+0xe8/0x210
> Jun 19 11:16:56 spcnode2 kernel: [76564.505888]  [<ffffffff811ebc59>]
> sysfs_hash_and_remove+0xa9/0xb0
> Jun 19 11:16:56 spcnode2 kernel: [76564.512082]  [<ffffffff811ee356>]
> sysfs_remove_link+0x26/0x30
> Jun 19 11:16:56 spcnode2 kernel: [76564.517959]  [<ffffffff812fb960>]
> del_gendisk+0x100/0x260
> Jun 19 11:16:56 spcnode2 kernel: [76564.523448]  [<ffffffffa0623868>]
> rbd_dev_release+0x108/0x110 [rbd]
> Jun 19 11:16:56 spcnode2 kernel: [76564.529861]  [<ffffffff813f1407>]
> device_release+0x27/0xa0
> Jun 19 11:16:56 spcnode2 kernel: [76564.535432]  [<ffffffff8130cfdc>]
> kobject_release+0x4c/0xa0
> Jun 19 11:16:56 spcnode2 kernel: [76564.541163]  [<ffffffff8130cf90>]
> ? kobject_del+0x40/0x40
> Jun 19 11:16:56 spcnode2 kernel: [76564.546694]  [<ffffffff8130e686>]
> kref_put+0x36/0x70
> Jun 19 11:16:56 spcnode2 kernel: [76564.551764]  [<ffffffff8130ce97>]
> kobject_put+0x27/0x60
> Jun 19 11:16:56 spcnode2 kernel: [76564.557126]  [<ffffffff8131d33c>]
> ? _kstrtoull+0x2c/0x90
> Jun 19 11:16:56 spcnode2 kernel: [76564.562523]  [<ffffffff813f1167>]
> put_device+0x17/0x20
> Jun 19 11:16:56 spcnode2 kernel: [76564.567808]  [<ffffffff813f225e>]
> device_unregister+0x1e/0x30
> Jun 19 11:16:56 spcnode2 kernel: [76564.573647]  [<ffffffffa06211ea>]
> rbd_remove+0x15a/0x160 [rbd]
> Jun 19 11:16:56 spcnode2 kernel: [76564.579594]  [<ffffffff813f3c47>]
> bus_attr_store+0x27/0x30
> Jun 19 11:16:56 spcnode2 kernel: [76564.585113]  [<ffffffff811ebebf>]
> sysfs_write_file+0xef/0x170
> Jun 19 11:16:56 spcnode2 kernel: [76564.590907]  [<ffffffff81177f23>]
> vfs_write+0xb3/0x180
> Jun 19 11:16:56 spcnode2 kernel: [76564.596158]  [<ffffffff8117824a>]
> sys_write+0x4a/0x90
> Jun 19 11:16:56 spcnode2 kernel: [76564.601258]  [<ffffffff81665c42>]
> system_call_fastpath+0x16/0x1b
> Jun 19 11:16:56 spcnode2 kernel: [76564.607321] ---[ end trace
> ace27f1cbf93eeaa ]---
> Jun 19 11:16:57 spcnode2 kernel: [76564.612447] BUG: unable to handle
> kernel NULL pointer dereference at 0000000000000079
> Jun 19 11:16:57 spcnode2 kernel: [76564.620374] IP:
> [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110
> Jun 19 11:16:57 spcnode2 kernel: [76564.626475] PGD 404514067 PUD
> 5f89cc067 PMD 0
> Jun 19 11:16:57 spcnode2 kernel: [76564.630958] Oops: 0000 [#2] SMP
> Jun 19 11:16:57 spcnode2 kernel: [76564.634254] CPU 5
> Jun 19 11:16:57 spcnode2 kernel: [76564.636113] Modules linked in: rbd
> libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
> xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack
> iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4
> ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables
> kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd
> fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm
> ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
> libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q
> garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma
> usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate
> libcrc32c
> Jun 19 11:16:57 spcnode2 kernel: [76564.701251]
> Jun 19 11:16:57 spcnode2 kernel: [76564.702740] Pid: 6924, comm: bash
> Tainted: G      D W    3.2.0-25-generic #40-Ubuntu Penguin Computing
> Relion 1702/X8DTT
> Jun 19 11:16:57 spcnode2 kernel: [76564.713752] RIP:
> 0010:[<ffffffff811ed770>]  [<ffffffff811ed770>]
> sysfs_find_dirent+0x10/0x110
> Jun 19 11:16:57 spcnode2 kernel: [76564.722319] RSP:
> 0018:ffff8805f8f9bc58  EFLAGS: 00010246
> Jun 19 11:16:57 spcnode2 kernel: [76564.727719] RAX: ffff8806186edbc0
> RBX: 0000000000000000 RCX: 00000000000988e6
> Jun 19 11:16:57 spcnode2 kernel: [76564.734892] RDX: ffffffff81a0158d
> RSI: 0000000000000000 RDI: 0000000000000000
> Jun 19 11:16:57 spcnode2 kernel: [76564.742083] RBP: ffff8805f8f9bc78
> R08: ffffea00303f6580 R09: ffffffff8130cfe9
> Jun 19 11:16:57 spcnode2 kernel: [76564.749221] R10: ffff880c0fe5de28
> R11: 0000000000000000 R12: 0000000000000000
> Jun 19 11:16:57 spcnode2 kernel: [76564.756437] R13: ffffffff81a0158d
> R14: ffff880bf45a5a50 R15: ffff880c0fd1de18
> Jun 19 11:16:57 spcnode2 kernel: [76564.763630] FS:
> 00007fe308eb7700(0000) GS:ffff880c3fc20000(0000)
> knlGS:0000000000000000
> Jun 19 11:16:57 spcnode2 kernel: [76564.771717] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Jun 19 11:16:57 spcnode2 kernel: [76564.777549] CR2: 0000000000000079
> CR3: 00000005f89cd000 CR4: 00000000000006e0
> Jun 19 11:16:57 spcnode2 kernel: [76564.784738] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Jun 19 11:16:57 spcnode2 kernel: [76564.791877] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jun 19 11:16:57 spcnode2 kernel: [76564.798991] Process bash (pid:
> 6924, threadinfo ffff8805f8f9a000, task ffff8806186edbc0)
> Jun 19 11:16:57 spcnode2 kernel: [76564.807295] Stack:
> Jun 19 11:16:57 spcnode2 kernel: [76564.809302]  0000000000000000
> 0000000000000000 ffffffff81a0158d ffff880bf45a5a50
> Jun 19 11:16:57 spcnode2 kernel: [76564.816832]  ffff8805f8f9bca8
> ffffffff811ed9bc ffff8805f8f9bcd8 ffffffff81c34b00
> Jun 19 11:16:57 spcnode2 kernel: [76564.824341]  ffff880605b36878
> 0000000000000000 ffff8805f8f9bce8 ffffffff811efa15
> Jun 19 11:16:57 spcnode2 kernel: [76564.831894] Call Trace:
> Jun 19 11:16:57 spcnode2 kernel: [76564.834337]  [<ffffffff811ed9bc>]
> sysfs_get_dirent+0x3c/0x80
> Jun 19 11:16:57 spcnode2 kernel: [76564.840041]  [<ffffffff811efa15>]
> sysfs_remove_group+0x35/0x100
> Jun 19 11:16:57 spcnode2 kernel: [76564.846029]  [<ffffffff810fee24>]
> blk_trace_remove_sysfs+0x14/0x20
> Jun 19 11:16:57 spcnode2 kernel: [76564.852195]  [<ffffffff812f50d9>]
> blk_unregister_queue+0x59/0x80
> Jun 19 11:16:57 spcnode2 kernel: [76564.858270]  [<ffffffff812fb97b>]
> del_gendisk+0x11b/0x260
> Jun 19 11:16:57 spcnode2 kernel: [76564.863661]  [<ffffffffa0623868>]
> rbd_dev_release+0x108/0x110 [rbd]
> Jun 19 11:16:57 spcnode2 kernel: [76564.869962]  [<ffffffff813f1407>]
> device_release+0x27/0xa0
> Jun 19 11:16:57 spcnode2 kernel: [76564.875448]  [<ffffffff8130cfdc>]
> kobject_release+0x4c/0xa0
> Jun 19 11:16:57 spcnode2 kernel: [76564.881061]  [<ffffffff8130cf90>]
> ? kobject_del+0x40/0x40
> Jun 19 11:16:57 spcnode2 kernel: [76564.886502]  [<ffffffff8130e686>]
> kref_put+0x36/0x70
> Jun 19 11:16:57 spcnode2 kernel: [76564.891521]  [<ffffffff8130ce97>]
> kobject_put+0x27/0x60
> Jun 19 11:16:57 spcnode2 kernel: [76564.896739]  [<ffffffff8131d33c>]
> ? _kstrtoull+0x2c/0x90
> Jun 19 11:16:57 spcnode2 kernel: [76564.902043]  [<ffffffff813f1167>]
> put_device+0x17/0x20
> Jun 19 11:16:57 spcnode2 kernel: [76564.907226]  [<ffffffff813f225e>]
> device_unregister+0x1e/0x30
> Jun 19 11:16:57 spcnode2 kernel: [76564.913057]  [<ffffffffa06211ea>]
> rbd_remove+0x15a/0x160 [rbd]
> Jun 19 11:16:57 spcnode2 kernel: [76564.918881]  [<ffffffff813f3c47>]
> bus_attr_store+0x27/0x30
> Jun 19 11:16:57 spcnode2 kernel: [76564.924436]  [<ffffffff811ebebf>]
> sysfs_write_file+0xef/0x170
> Jun 19 11:16:57 spcnode2 kernel: [76564.930174]  [<ffffffff81177f23>]
> vfs_write+0xb3/0x180
> Jun 19 11:16:57 spcnode2 kernel: [76564.935450]  [<ffffffff8117824a>]
> sys_write+0x4a/0x90
> Jun 19 11:16:57 spcnode2 kernel: [76564.940497]  [<ffffffff81665c42>]
> system_call_fastpath+0x16/0x1b
> Jun 19 11:16:57 spcnode2 kernel: [76564.946488] Code: 41 5c 41 5d 41
> 5e 41 5f 5d c3 90 4c 89 f7 e8 68 df 46 00 eb c3 0f 0b 0f 1f 40 00 55
> 48 89 e5 41 56 41 55 41 54 53 66 66 66 66 90 <80> 7f 79 00 4c 8b 67 70
> 49 89 d6 48 89 f3 0f 95 c0 48 85 f6 0f
> Jun 19 11:16:57 spcnode2 kernel: [76564.966571] RIP
> [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110
> Jun 19 11:16:57 spcnode2 kernel: [76564.972826]  RSP <ffff8805f8f9bc58>
> Jun 19 11:16:57 spcnode2 kernel: [76564.976331] CR2: 0000000000000079
> Jun 19 11:16:57 spcnode2 kernel: [76564.979725] ---[ end trace
> ace27f1cbf93eeab ]---
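
A side note on reading the oops above, offered as a minimal sketch
rather than a diagnosis.  The bytes marked <80> 7f 79 00 in the Code:
line decode as "cmpb $0x0,0x79(%rdi)", and RDI is 0 in the register
dump, so the faulting address should be 0 + 0x79, which matches CR2.
The snippet below redoes that arithmetic.  The helper name
decode_cmpb_disp8 is made up for illustration, and it handles only this
one instruction encoding.

```python
# Values copied from the register dump in the oops.
RDI = 0x0000000000000000  # base register used by the faulting instruction
CR2 = 0x0000000000000079  # faulting address reported by the CPU

# Bytes at the faulting RIP: the "<80> 7f 79 00" in the Code: line.
insn = bytes([0x80, 0x7F, 0x79, 0x00])


def decode_cmpb_disp8(insn):
    """Decode only the form seen here: 80 /7 ib with a disp8 operand."""
    opcode, modrm, disp8, imm8 = insn
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert opcode == 0x80 and reg == 7, "not CMP r/m8, imm8"
    assert mod == 0b01 and rm != 0b100, "expected disp8 form without SIB"
    # An 8-bit displacement is sign-extended before it is added.
    disp = disp8 - 0x100 if disp8 & 0x80 else disp8
    return rm, disp, imm8


rm, disp, imm = decode_cmpb_disp8(insn)

# Register number 7 without a REX prefix is RDI.
fault_addr = (RDI + disp) & 0xFFFFFFFFFFFFFFFF

print(f"cmpb ${imm:#x}, {disp:#x}(reg{rm}) -> fault at {fault_addr:#018x}")
assert fault_addr == CR2
```

Under the x86-64 calling convention RDI carries the first argument, so
this suggests sysfs_find_dirent() was handed a NULL pointer.  That
assumes RDI was not modified in the first 0x10 bytes of the function,
which the oops alone does not prove.
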
> 
> 
> Had to do a hard reset on the machine afterwards.
> 
> The machine mounting the RBD is running Ubuntu 12.04, and is not
> hosting any OSDs or MONs.
> root@spcnode2:~# uname -a
> Linux spcnode2 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> root@spcnode2:~# ceph --version
> ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372)
> 
> - Travis


Thread overview: 4+ messages
2012-06-19 18:32 kernel crash from RBD in Ubuntu 12.04 Travis Rhoden
2012-06-19 18:45 ` Alex Elder [this message]
2012-06-19 18:50   ` Travis Rhoden
2012-06-19 23:33     ` Dan Mick
