* NVMe drive kernel fail after hotplug kernel 4.16.12
[not found] ` <AM6PR03MB4038A5FDD819AE34DDDE54A4F87E0@AM6PR03MB4038.eurprd03.prod.outlook.com>
@ 2018-06-13 15:11 ` Keith Busch
[not found] ` <AM6PR03MB40389E7C34537DF6CCDF0904F8400@AM6PR03MB4038.eurprd03.prod.outlook.com>
0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-06-13 15:11 UTC (permalink / raw)
On Wed, Jun 13, 2018 at 01:31:37AM -0700, Albert Schlegel wrote:
> Hi,
>
>
> I have a problem with an NVMe drive and hotplug: the kernel (nvme driver) fails after the drive is hotplugged. I posted the detailed problem here:
> https://www.linuxquestions.org/questions/showthread.php?p=5866497#post5866497
>
>
> Is there a solution for this, or is it possible to fix this?
Could you see if this commit fixes your issue?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/nvme/host/pci.c?id=1d39e6928cbd0eb737c51545210b5186d5551ba1
* NVMe drive kernel fail after hotplug kernel 4.16.12
[not found] ` <AM6PR03MB40389E7C34537DF6CCDF0904F8400@AM6PR03MB4038.eurprd03.prod.outlook.com>
@ 2018-07-05 19:52 ` Busch, Keith
2018-12-03 10:50 ` AW: " Albert Schlegel
0 siblings, 1 reply; 5+ messages in thread
From: Busch, Keith @ 2018-07-05 19:52 UTC (permalink / raw)
Super, thanks for the confirmation!
________________________________________
From: Albert Schlegel [mailto:albi.schlegel@hotmail.com]
Sent: Wednesday, July 4, 2018 11:58 PM
To: Busch, Keith <keith.busch at intel.com>
Cc: linux-nvme at lists.infradead.org
Subject: AW: NVMe drive kernel fail after hotplug kernel 4.16.12
Yes, this solves the problem.
Thanks for the patch!
* AW: NVMe drive kernel fail after hotplug kernel 4.16.12
2018-07-05 19:52 ` Busch, Keith
@ 2018-12-03 10:50 ` Albert Schlegel
2018-12-03 14:27 ` Keith Busch
0 siblings, 1 reply; 5+ messages in thread
From: Albert Schlegel @ 2018-12-03 10:50 UTC (permalink / raw)
Hi Keith,
Unfortunately, your commit does not fix our problem. Now we get a NULL pointer dereference.
We tested kernel 4.20-rc3 and got the following output after a power failure of the NVMe device:
[ 324.913779] nvme nvme0: failed to set APST feature (-19)
[ 324.973652] pci 0000:03:00.0: [1987:5008] type 00 class 0x010802
[ 324.973678] pci 0000:03:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[ 324.973707] pci 0000:03:00.0: Max Payload Size set to 256 (was 128, max 256)
[ 324.974013] pci 0000:03:00.0: BAR 0: assigned [mem 0xf7000000-0xf7003fff 64bit]
[ 324.974021] pci 0000:05:00.0: PCI bridge to [bus 06]
[ 324.974131] nvme nvme0: pci function 0000:03:00.0
[ 324.974146] nvme 0000:03:00.0: enabling device (0000 -> 0002)
[ 325.081780] nvme nvme0: missing or invalid SUBNQN field.
[ 325.086371] nvme nvme0: allocated 64 MiB host memory buffer.
[ 357.462342] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
[ 357.530344] nvme 0000:03:00.0: Refused to change power state, currently in D3
[ 357.530404] nvme nvme0: Removing after probe failure status: -19
[ 357.562355] BUG: unable to handle kernel NULL pointer dereference at 00000000
[ 357.562358] *pdpt = 0000000000000000 *pde = f000eef30000ee01
[ 357.562360] Oops: 0000 [#1] SMP PTI
[ 357.562362] CPU: 0 PID: 301 Comm: kworker/u8:3 Not tainted 4.20.0-rc3 #1
[ 357.562363] Hardware name: System manufacturer System Product Name/Q170M-C, BIOS 3805 05/10/2018
[ 357.562367] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]
[ 357.562370] EIP: sbitmap_any_bit_set+0xe/0x40
[ 357.562371] Code: 39 56 08 77 df 83 c4 04 5b 5e 5f 5d c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 8b 48 08 85 c9 74 31 55 89 e5 53 8b 58 0c <8b> 03 85 c0 75 1c 31 c0 eb 0c 89 c2 c1 e2 06 8b 14 13 85 d2 75 0c
[ 357.562374] EAX: f68d108c EBX: 00000000 ECX: 00000001 EDX: f69c3e64
[ 357.562374] ESI: 00000001 EDI: f6a90d20 EBP: f69c3e5c ESP: f69c3e58
[ 357.562376] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[ 357.562380] print_req_error: I/O error, dev nvme0n1, sector 0
[ 357.562382] CR0: 80050033 CR2: 00000000 CR3: 019b6000 CR4: 003406f0
[ 357.562383] Call Trace:
[ 357.562402] blk_mq_run_hw_queue+0xa8/0x100
[ 357.562404] blk_mq_run_hw_queues+0x46/0x60
[ 357.562406] blk_mq_unquiesce_queue+0x23/0x30
[ 357.562409] nvme_kill_queues+0x23/0x50 [nvme_core]
[ 357.562412] nvme_remove_namespaces+0x85/0x90 [nvme_core]
[ 357.562414] nvme_remove+0x72/0x130 [nvme]
[ 357.562416] pci_device_remove+0x38/0xc0
[ 357.562419] device_release_driver_internal+0x141/0x1f0
[ 357.562421] device_release_driver+0x11/0x20
[ 357.562422] nvme_remove_dead_ctrl_work+0x1a/0x30 [nvme]
[ 357.562425] process_one_work+0x130/0x310
[ 357.562427] worker_thread+0x39/0x330
[ 357.562429] kthread+0xe2/0x110
[ 357.562431] ? process_scheduled_works+0x30/0x30
[ 357.562433] ? kthread_create_worker+0x30/0x30
[ 357.562435] ret_from_fork+0x2e/0x38
[ 357.562437] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc loop snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 snd_hda_intel snd_hda_codec snd_hda_core drm_kms_helper snd_hwdep snd_pcm eeepc_wmi snd_timer drm asus_wmi snd parport_pc pcc_cpufreq sparse_keymap rfkill evdev xhci_pci rtc_cmos parport iTCO_wdt iTCO_vendor_support xhci_hcd soundcore i2c_i801 i2c_algo_bit nvme mei_me nvme_core mei tpm_tis psmouse i2c_core tpm_tis_core pcspkr serio_raw mxm_wmi button wmi_bmof tpm rng_core video acpi_pad wmi ext4 crc16 mbcache jbd2 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq crc32c_generic crc32c_intel libcrc32c nbd uhci_hcd ehci_hcd usbcore usb_common sg sd_mod ahci libahci thermal e1000e ptp pps_core libata scsi_mod fan
[ 357.562466] CR2: 0000000000000000
[ 357.562467] ---[ end trace df26c057a341ee3b ]---
Do you have another idea of what could cause this problem?
Thanks!
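
The oops above can be read a little further by hand. The following is a minimal Python sketch, not part of the original report, that pulls the register values out of the dump and decodes the instruction marked with <..> on the Code: line. The function names are hypothetical, the Code: line is shortened to the bytes around the fault, and the decoder only handles the single form "mov r32, [r32]" (opcode 0x8b, ModRM mod=00) that appears here.

```python
import re

# Excerpt of the oops from the message above. The Code: line is shortened
# to the bytes around the faulting instruction; timestamps are dropped.
OOPS = """\
EIP: sbitmap_any_bit_set+0xe/0x40
Code: 8b 48 08 85 c9 74 31 55 89 e5 53 8b 58 0c <8b> 03 85 c0 75 1c
EAX: f68d108c EBX: 00000000 ECX: 00000001 EDX: f69c3e64
CR0: 80050033 CR2: 00000000 CR3: 019b6000 CR4: 003406f0
"""

# 32-bit register numbering used by the x86 ModRM byte.
REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def parse_registers(text):
    """Return {name: value} for every 'NAME: 8-hex-digits' pair in the dump."""
    pairs = re.findall(r"\b([A-Z]{2}[A-Z0-9]):\s*([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_instruction(text):
    """Return the bytes from the '<xx>' marker onward as a list of ints."""
    code = re.search(r"^Code: (.*)$", text, re.M).group(1)
    tail = code[code.index("<"):].replace("<", "").replace(">", "")
    return [int(b, 16) for b in tail.split()]


def decode_mov_load(insn):
    """Decode only 'mov r32, [r32]' (opcode 0x8b, ModRM mod=00).

    Returns (dest, base) register names, or None for anything else.
    """
    if len(insn) < 2 or insn[0] != 0x8B:
        return None
    modrm = insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):  # SIB and disp32 forms are not handled
        return None
    return REGS[reg], REGS[rm]


regs = parse_registers(OOPS)
dest, base = decode_mov_load(faulting_instruction(OOPS))
print(f"faulting insn: mov {dest.lower()}, [{base.lower()}]")
print(f"{base} = {regs[base]:#010x}, CR2 = {regs['CR2']:#010x}")
```

The marked bytes decode to a load through EBX, and EBX is zero in the register dump, which matches the fault address of 0 in CR2. The preceding bytes 8b 58 0c load EBX from offset 0xc of the object EAX points to, so the NULL value appears to come from a pointer field inside that object; which field that is depends on the structure layout of the running kernel and is not established by the log alone.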
* NVMe drive kernel fail after hotplug kernel 4.16.12
2018-12-03 10:50 ` AW: " Albert Schlegel
@ 2018-12-03 14:27 ` Keith Busch
2018-12-17 15:33 ` AW: " Albert Schlegel
0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-12-03 14:27 UTC (permalink / raw)
On Mon, Dec 03, 2018 at 02:50:44AM -0800, Albert Schlegel wrote:
> Hi Keith,
>
> Unfortunately, your commit does not fix our problem. Now we get a NULL pointer dereference.
> We tested kernel 4.20-rc3 and got the following output after a power failure of the NVMe device:
You're talking about a different problem. Alex reported this regression
after 4.20-rc2, and Igor wrote the fix(*) committed in 4.20-rc4. Try
that one.
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=751a0cc0cd3a0d51e6aaf6fd3b8bd31f4ecfaf3e
* AW: NVMe drive kernel fail after hotplug kernel 4.16.12
2018-12-03 14:27 ` Keith Busch
@ 2018-12-17 15:33 ` Albert Schlegel
0 siblings, 0 replies; 5+ messages in thread
From: Albert Schlegel @ 2018-12-17 15:33 UTC (permalink / raw)
Hi Keith,
the suggested fix solves our problem.
Many thanks!