From: "Richard W.M. Jones" <rjones@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>,
Jeff Moyer <jmoyer@redhat.com>,
msnitzer@redhat.com, Li Zefan <lizefan@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
cgroups@vger.kernel.org,
"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: __blkg_lookup oops with 4.2-rcX
Date: Fri, 4 Sep 2015 11:46:02 +0100
Message-ID: <20150904104602.GN29283@redhat.com>
In-Reply-To: <20150902153255.GH22326@mtj.duckdns.org>
On Wed, Sep 02, 2015 at 11:32:55AM -0400, Tejun Heo wrote:
> Hello,
>
> On Wed, Sep 02, 2015 at 10:53:07AM -0400, Tejun Heo wrote:
> > On Sun, Aug 30, 2015 at 08:30:41AM -0400, Josh Boyer wrote:
> > I think the offending commit is 776687bce42b ("block, blk-mq: draining
> > can't be skipped even if bypass_depth was non-zero"). It looks like
> > the patch makes shutdown path travel data structure which is already
> > destroyed. Will post the fix soon.
>
> Hmm... I can't reproduce it here or see how such oops would happen.
>
> * Is the problem reproducible on v4.2? If so, can you please describe
> the steps to reproduce? How is cgroup set up?
We have a test suite which does a lot of filesystem and device
operations, and this triggers the oops at random: not reliably, and not
in the same place every time, but still fairly frequently.
So unfortunately I don't have steps that reproduce it reliably.
However, I'm going to work on that now to see if I can create a
sequence of operations that triggers it some or all of the time.
> * Can you please run gdb or addr2line on it and report which line is
> causing the oops?
Below is another stack trace that I just collected. It came from a
test that hotplugs devices in a virtual machine. The kernel this
time is 4.2.0-0.rc3.git4.1.fc24.x86_64 (which is a bit old; I am also
going to upgrade to the newest kernel soon).
The addr2line output for this one is:
$ addr2line -e /usr/lib/debug/lib/modules/4.2.0-0.rc3.git4.1.fc24.x86_64/vmlinux ffffffff814107a0
/usr/src/debug/kernel-4.1.fc24/linux-4.2.0-0.rc3.git4.1.fc24.x86_64/block/blk-throttle.c:1642
1636  /*
1637   * Drain each tg while doing post-order walk on the blkg tree, so
1638   * that all bios are propagated to td->service_queue. It'd be
1639   * better to walk service_queue tree directly but blkg walk is
1640   * easier.
1641   */
1642  blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
1643          tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
1644
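A fault address as small as 0xbb8 usually means that a structure member
was read through a NULL base pointer, so the reported address is just the
member's offset. Here is a minimal userspace sketch of that arithmetic.
The structure fake_queue, its padding and the helper member_fault_addr()
are hypothetical and chosen only to reproduce the number; this is not the
real layout of any kernel structure, and it does not say which member in
line 1642 is involved.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: the padding is sized only so that some_member
 * lands at offset 0xbb8. This is NOT a real kernel structure.
 */
struct fake_queue {
	unsigned char pad[0xbb8];
	void *some_member;
};

/*
 * Address the CPU would try to read for base->some_member, computed
 * without dereferencing anything.
 */
static uintptr_t member_fault_addr(uintptr_t base)
{
	return base + offsetof(struct fake_queue, some_member);
}
```

With a base of 0 the helper returns 0xbb8, which is the shape of the
address in the oops below. Finding the real member would need the
structure offsets from the matching debuginfo.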
Rich.
[ 6.784689] BUG: unable to handle kernel NULL pointer dereference at 0000000000000bb8
[ 6.787605] IP: [<ffffffff814107a0>] blk_throtl_drain+0x80/0x220
[ 6.789797] PGD 0
[ 6.790598] Oops: 0000 [#1] SMP
[ 6.791848] Modules linked in: kvm_intel kvm snd_pcsp snd_pcm snd_timer snd ghash_clmulni_intel soundcore joydev ata_generic serio_raw pata_acpi libcrc32c crc8 crc_itu_t crc_ccitt virtio_pci virtio_mmio virtio_input virtio_balloon virtio_scsi sym53c8xx scsi_transport_spi megaraid_sas megaraid_mbox megaraid_mm megaraid ideapad_laptop rfkill sparse_keymap video virtio_net virtio_gpu ttm drm_kms_helper drm virtio_console virtio_rng virtio_blk virtio_ring virtio crc32 crct10dif_pclmul crc32c_intel crc32_pclmul
[ 6.809710] CPU: 0 PID: 27 Comm: kworker/0:1 Not tainted 4.2.0-0.rc3.git4.1.fc24.x86_64 #1
[ 6.812650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[ 6.816068] Workqueue: events_freezable virtscsi_handle_event [virtio_scsi]
[ 6.818588] task: ffff88001dfb3a00 ti: ffff88001d090000 task.ti: ffff88001d090000
[ 6.821252] RIP: 0010:[<ffffffff814107a0>] [<ffffffff814107a0>] blk_throtl_drain+0x80/0x220
[ 6.824302] RSP: 0000:ffff88001d0939d8 EFLAGS: 00010046
[ 6.826213] RAX: 0000000000000000 RBX: ffff88001b8f6698 RCX: 00000000000000e0
[ 6.828743] RDX: 31e18f88fc458000 RSI: 0000000000000000 RDI: 0000000000000000
[ 6.831292] RBP: ffff88001d093a08 R08: 0000000000000000 R09: 0000000000000000
[ 6.833835] R10: ffff88001dfb3a00 R11: ffffffff81e58200 R12: ffff88001ba67200
[ 6.836380] R13: ffff88001b8f6698 R14: ffff88001b9ee1f0 R15: ffff88001b9ee0d0
[ 6.838920] FS: 0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000
[ 6.841781] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6.843838] CR2: 0000000000000bb8 CR3: 00000000180c4000 CR4: 00000000000006f0
[ 6.846383] Stack:
[ 6.847132] ffffffff81410756 ffff88001b9ee1f0 ffff88001d093a08 ffff88001b8f6698
[ 6.849950] ffffffff81ef5320 0000000000000000 ffff88001d093a28 ffffffff8140d5fd
[ 6.852746] ffff88001b8f6698 ffff88001b8f6698 ffff88001d093a58 ffffffff813e7839
[ 6.855562] Call Trace:
[ 6.856473] [<ffffffff81410756>] ? blk_throtl_drain+0x36/0x220
[ 6.858581] [<ffffffff8140d5fd>] blkcg_drain_queue+0x2d/0x60
[ 6.860639] [<ffffffff813e7839>] __blk_drain_queue+0xc9/0x1a0
[ 6.862741] [<ffffffff813e9218>] ? blk_queue_bypass_start+0x68/0xb0
[ 6.865029] [<ffffffff813e9222>] blk_queue_bypass_start+0x72/0xb0
[ 6.867236] [<ffffffff8140b539>] blkcg_deactivate_policy+0x39/0x100
[ 6.869513] [<ffffffff814173e0>] cfq_exit_queue+0xd0/0xf0
[ 6.871481] [<ffffffff813e5081>] elevator_exit+0x31/0x50
[ 6.873423] [<ffffffff813ef91e>] blk_release_queue+0x4e/0xc0
[ 6.875495] [<ffffffff814204aa>] kobject_release+0x7a/0x190
[ 6.877524] [<ffffffff8142035f>] kobject_put+0x2f/0x60
[ 6.879413] [<ffffffff813e7765>] blk_put_queue+0x15/0x20
[ 6.881351] [<ffffffff815bf324>] scsi_device_dev_release_usercontext+0xc4/0x120
[ 6.884010] [<ffffffff815bf260>] ? scsi_device_dev_release+0x20/0x20
[ 6.886297] [<ffffffff810cad3c>] execute_in_process_context+0x9c/0xb0
[ 6.888636] [<ffffffff815bf25c>] scsi_device_dev_release+0x1c/0x20
[ 6.890897] [<ffffffff81573706>] device_release+0x36/0xa0
[ 6.892867] [<ffffffff814204aa>] kobject_release+0x7a/0x190
[ 6.894901] [<ffffffff8142035f>] kobject_put+0x2f/0x60
[ 6.896772] [<ffffffff81573a47>] put_device+0x17/0x20
[ 6.898617] [<ffffffff815b050f>] scsi_device_put+0x2f/0x40
[ 6.900614] [<ffffffffa0155f61>] virtscsi_handle_event+0x101/0x1a0 [virtio_scsi]
[ 6.903284] [<ffffffff810cb3b2>] process_one_work+0x232/0x840
[ 6.905380] [<ffffffff810cb31b>] ? process_one_work+0x19b/0x840
[ 6.907522] [<ffffffff8112553d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 6.909893] [<ffffffff810cba95>] ? worker_thread+0xd5/0x450
[ 6.911921] [<ffffffff810cba0e>] worker_thread+0x4e/0x450
[ 6.913902] [<ffffffff810cb9c0>] ? process_one_work+0x840/0x840
[ 6.916066] [<ffffffff810cb9c0>] ? process_one_work+0x840/0x840
[ 6.918232] [<ffffffff810d2594>] kthread+0x104/0x120
[ 6.920059] [<ffffffff810d2490>] ? kthread_create_on_node+0x250/0x250
[ 6.922396] [<ffffffff8187105f>] ret_from_fork+0x3f/0x70
[ 6.924339] [<ffffffff810d2490>] ? kthread_create_on_node+0x250/0x250
[ 6.926663] Code: 04 24 56 07 41 81 e8 20 72 cf ff e8 9b 4d d1 ff 85 c0 74 0d 80 3d 64 04 b5 00 00 0f 84 19 01 00 00 49 8b 84 24 d0 00 00 00 31 ff <48> 8b 80 b8 0b 00 00 48 8b 70 28 e8 60 04 d5 ff 48 85 c0 48 89
[ 6.936207] RIP [<ffffffff814107a0>] blk_throtl_drain+0x80/0x220
[ 6.938432] RSP <ffff88001d0939d8>
[ 6.939692] CR2: 0000000000000bb8
[ 6.940915] ---[ end trace f1acb54c2a225dd4 ]---
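The byte marked <48> on the Code: line is where the faulting instruction
starts. As a cross-check, the sketch below pulls out the bytes from that
marker and decodes the one instruction form that appears here: REX.W,
opcode 0x8b (a 64-bit load), a ModRM byte with a 32-bit displacement and
no SIB byte. The helper names oops_faulting_bytes() and
decode_mov_disp32() are made up for this example, and the decoder
deliberately handles nothing else; it is not a general disassembler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the byte written as <xx> in an oops "Code:" string and copy up to
 * max bytes, starting at that byte, into out. Returns the number of
 * bytes copied, or 0 if there is no marker.
 */
static size_t oops_faulting_bytes(const char *code, uint8_t *out, size_t max)
{
	const char *p;
	size_t n = 0;

	assert(code != NULL && out != NULL);
	p = strchr(code, '<');
	if (!p)
		return 0;
	p++;				/* skip '<' */
	while (n < max && *p) {
		char *end;
		unsigned long v = strtoul(p, &end, 16);

		if (end == p)
			break;		/* not a hex byte */
		out[n++] = (uint8_t)v;
		p = end;
		while (*p == '>' || *p == ' ')
			p++;
	}
	return n;
}

/*
 * Decode only: 48 8b <ModRM mod=10, rm!=100> <disp32, little-endian>.
 * On success stores the base register number (0 = RAX) and the
 * displacement, and returns 0. Returns -1 for anything else.
 */
static int decode_mov_disp32(const uint8_t *b, size_t n,
			     unsigned *base_reg, uint32_t *disp)
{
	if (n < 7 || b[0] != 0x48 || b[1] != 0x8b)
		return -1;
	if ((b[2] >> 6) != 2 || (b[2] & 7) == 4)
		return -1;
	*base_reg = b[2] & 7;
	*disp = (uint32_t)b[3] | (uint32_t)b[4] << 8 |
		(uint32_t)b[5] << 16 | (uint32_t)b[6] << 24;
	return 0;
}
```

Fed the bytes "<48> 8b 80 b8 0b 00 00" from the trace, this gives base
register 0 (RAX) and displacement 0xbb8. RAX is 0 in the register dump,
so the load address is 0 + 0xbb8, which matches CR2.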
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v