From: keith.busch@intel.com (Busch, Keith)
Subject: [PATCH 8/9] nvme: remove dead controllers from a work item
Date: Thu, 22 Oct 2015 18:10:15 +0000
Message-ID: <20151022181014.GE21840@localhost.localdomain>
In-Reply-To: <1445515421-4940-9-git-send-email-hch@lst.de>

On Thu, Oct 22, 2015 at 02:03:40PM +0200, Christoph Hellwig wrote:
> Compared to the kthread this gives us multiple call prevention for free.

My apologies in advance: I usually try to be more helpful debugging
new issues, but I'm running low on time today. I might be able to dig
into this more tomorrow, but if you could, please verify what happens
if an admin command times out during initialization. The reset_work
gets queued twice, on two different workqueues, and that causes trouble.
I get one of several crashes or warnings after applying the whole series,
but this is the most frequent one I'm seeing so far:

[  348.845441] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[  348.846615] IP: [<ffffffffa02c1a48>] nvme_dev_shutdown+0x303/0x3e8 [nvme]
[  348.847615] PGD b8749067 PUD b8748067 PMD 0
[  348.848328] Oops: 0002 [#1] SMP
[  348.848826] Modules linked in: nvme bnep rfcomm bluetooth rfkill nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc dm_round_robin loop dm_multipath parport_pc parport evdev psmouse pcspkr serio_raw processor button ext4 crc16 jbd2 mbcache sg sr_mod sd_mod cdrom ata_generic floppy e1000 ata_piix libata scsi_mod
[  348.849428] CPU: 0 PID: 3415 Comm: kworker/0:0 Not tainted 4.3.0-rc4+ #4
[  348.849428] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[  348.849428] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
[  348.849428] task: ffff88013afe6ec0 ti: ffff88013afe8000 task.ti: ffff88013afe8000
[  348.849428] RIP: 0010:[<ffffffffa02c1a48>]  [<ffffffffa02c1a48>] nvme_dev_shutdown+0x303/0x3e8 [nvme]
[  348.849428] RSP: 0018:ffff88013afebcd8  EFLAGS: 00010202
[  348.849428] RAX: 0000000000464001 RBX: ffff8800369c2800 RCX: 0000005138ced189
[  348.849428] RDX: 0000000000000000 RSI: ffffffff81409bf0 RDI: ffff88013ab53f10
[  348.849428] RBP: ffff88013afe6ec0 R08: ffff88013afe8000 R09: 0000000000000000
[  348.849428] R10: 0000000000000000 R11: ffff88013fc142c0 R12: 0000000000000000
[  348.849428] R13: ffff880036f0c540 R14: ffff88013afebcf8 R15: 0000000000000000
[  348.849428] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[  348.849428] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  348.849428] CR2: 0000000000000014 CR3: 00000000b8cb2000 CR4: 00000000000006f0
[  348.849428] Stack:
[  348.849428]  ffff8800b8e556b8 0000000000000000 ffff8800b8e55640 0000000000000082
[  348.849428]  ffff88013afe6ec0 ffff88013afebd10 ffff880100000000 0000000000000000
[  348.849428]  ffff88013afebd18 ffff88013afebd18 0000000000000000 0000000000000000
[  348.849428] Call Trace:
[  348.849428]  [<ffffffffa02c1bd6>] ? nvme_remove+0x6f/0xc0 [nvme]
[  348.849428]  [<ffffffff8121b67a>] ? pci_device_remove+0x43/0x8e
[  348.849428]  [<ffffffff812bb882>] ? __device_release_driver+0x91/0x109
[  348.849428]  [<ffffffff812bb90b>] ? device_release_driver+0x11/0x1a
[  348.849428]  [<ffffffff812165e2>] ? pci_stop_bus_device+0x62/0x85
[  348.849428]  [<ffffffff81216798>] ? pci_stop_and_remove_bus_device+0x9/0x12
[  348.849428]  [<ffffffff812167b9>] ? pci_stop_and_remove_bus_device_locked+0x18/0x21
[  348.849428]  [<ffffffffa02bfd44>] ? nvme_remove_dead_ctrl_work+0x1e/0x28 [nvme]
[  348.849428]  [<ffffffff810560b6>] ? process_one_work+0x177/0x27b
[  348.849428]  [<ffffffff81056394>] ? worker_thread+0x1b4/0x28b
[  348.849428]  [<ffffffff810561e0>] ? process_scheduled_works+0x26/0x26
[  348.849428]  [<ffffffff8105a4b6>] ? kthread+0x99/0xa1
[  348.849428]  [<ffffffff8105a41d>] ? kthread_parkme+0x16/0x16
[  348.849428]  [<ffffffff813c9d6f>] ? ret_from_fork+0x3f/0x70
[  348.849428]  [<ffffffff8105a41d>] ? kthread_parkme+0x16/0x16
[  348.849428] Code: 48 89 44 24 18 48 8b 44 24 18 4c 89 ef e8 ce 88 d9 e0 8b 83 2c 01 00 00 48 8b 93 38 01 00 00 80 e4 3f 80 cc 40 89 83 2c 01 00 00 <89> 42 14 0f b6 2d fe 46 00 00 48 8b 05 a7 85 34 e1 65 4c 8b 24
[  348.849428] RIP  [<ffffffffa02c1a48>] nvme_dev_shutdown+0x303/0x3e8 [nvme]
[  348.849428]  RSP <ffff88013afebcd8>
[  348.849428] CR2: 0000000000000014
[  348.849428] ---[ end trace f11db05b8d65bab0 ]---
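
For reference, a rough reading of the faulting instruction, offered as a
sketch and not a confirmed diagnosis. The marked bytes in the Code: line,
<89> 42 14, decode to "mov %eax,0x14(%rdx)", and RDX is 0, which matches
CR2 = 0x14. The preceding "and $0x3f,%ah; or $0x40,%ah" looks like a
normal-shutdown request being set in the Controller Configuration value,
and 0x14 is the CC offset in the NVMe register layout. So the store appears
to go through a NULL register base, possibly because the mapping was
already torn down by the other queued reset_work. The struct and helper
names below are hypothetical and are not taken from the driver; only the
register offsets and the shutdown field bits come from the NVMe
specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading NVMe controller registers (offsets per the NVMe specification).
 * The struct name is illustrative only. */
struct nvme_regs {
	uint64_t cap;	/* 0x00 Controller Capabilities */
	uint32_t vs;	/* 0x08 Version */
	uint32_t intms;	/* 0x0c Interrupt Mask Set */
	uint32_t intmc;	/* 0x10 Interrupt Mask Clear */
	uint32_t cc;	/* 0x14 Controller Configuration */
	uint32_t rsvd;	/* 0x18 Reserved */
	uint32_t csts;	/* 0x1c Controller Status */
};

/* CC sits at 0x14, the same displacement as in the faulting store. */
static_assert(offsetof(struct nvme_regs, cc) == 0x14, "CC must be at 0x14");

#define CC_SHN_NORMAL	(1u << 14)	/* CC.SHN = 01b: normal shutdown */
#define CC_SHN_MASK	(3u << 14)	/* CC.SHN occupies bits 15:14 */

/* Same bit manipulation as "and $0x3f,%ah; or $0x40,%ah" in the oops. */
static inline uint32_t cc_request_normal_shutdown(uint32_t cc)
{
	return (cc & ~CC_SHN_MASK) | CC_SHN_NORMAL;
}

/* Address that a store to CC would touch for a given register base. */
static inline uintptr_t cc_store_address(uintptr_t bar)
{
	return bar + offsetof(struct nvme_regs, cc);
}
```

With a register base of 0, cc_store_address() gives 0x14, the faulting
address in CR2, and the value in RAX (0x464001) has bits 15:14 set to 01b,
as cc_request_normal_shutdown() would leave them.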