From: keith.busch@intel.com (Keith Busch)
Subject: kernel BUG at drivers/block/nvme-core.c:732!
Date: Wed, 9 Dec 2015 22:43:28 +0000
Message-ID: <20151209224328.GA4617@localhost.localdomain>
In-Reply-To: <3AAC62DB-A747-48D7-9D9F-F96433D71FFE@roche.com>
On Wed, Dec 09, 2015 at 02:14:37PM -0800, Seufert, Tim wrote:
> Computer: i7-6700k CPU, Supermicro X11SS-Q motherboard, and a Samsung 950 Pro NVME SSD
> Linux version: CentOS 6.7 with ElRepo kernel-ml 4.3.0
>
> What led up to the event: This is a very new system and I had just put it together and copied over a KVM guest (the guest OS is also CentOS 6.7). At the time the "kernel BUG" occurred, the guest was midway through updating itself, so yum/rpm was generating plenty of I/O. Since its disk image file was located on the host's 950 Pro, this was generating NVME traffic. The BUG resulted in the guest hanging forever (couldn't open new terminals, make SSH connections, or do anything else that required disk I/O), but oddly enough the host did not hang even though its root FS was on the same NVME SSD partition containing the guest image. I had to reboot the host to recover.
>
> I have since replaced the host OS installation with a fresh install of CentOS 7, but am still running kernel-ml 4.3.0. So far I have not seen a repetition of this BUG.
The BUG_ON below means the driver detected the SGL list it was provided
is not PRP'able. In the past, this has meant that the virtual address
page offset does not match the DMA address offset.
I've not seen this repeat on x86 architectures before. If you can find
a test case that reproduces this, we should be able to figure out what
is making this happen.
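For anyone following along, here is a rough userspace sketch of what "PRP'able" means. It is not the driver's code, and the names (struct seg, segs_prp_able, offsets_match) are made up for illustration. It assumes the usual NVMe PRP rule: only the first segment may start at a nonzero page offset, and only the last may end off a page boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter-gather segment. */
struct seg {
    uint64_t dma_addr; /* bus address where the segment starts */
    uint32_t len;      /* segment length in bytes */
};

/*
 * Return true if the segments could be described with PRP entries:
 * every segment after the first must start on a page boundary, and
 * every segment before the last must end on one.
 */
bool segs_prp_able(const struct seg *segs, size_t n, uint32_t page_size)
{
    /* page_size must be a nonzero power of two for the mask to work */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t mask = (uint64_t)page_size - 1;

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && (segs[i].dma_addr & mask) != 0)
            return false; /* interior segment starts mid-page */
        if (i + 1 < n && ((segs[i].dma_addr + segs[i].len) & mask) != 0)
            return false; /* non-final segment ends mid-page */
    }
    return true;
}

/*
 * The mismatch mentioned above: the in-page offset of the CPU virtual
 * address differs from the in-page offset of the DMA address.
 */
bool offsets_match(uint64_t virt_addr, uint64_t dma_addr, uint32_t page_size)
{
    uint64_t mask = (uint64_t)page_size - 1;
    return (virt_addr & mask) == (dma_addr & mask);
}
```

If a mapping layer returns a DMA address whose in-page offset differs from the virtual address's, a list that looked fine on the CPU side can fail the first check after mapping.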
> A side question: Is the advice to not enable discard in Intel's NVME driver reference guide (https://downloadmirror.intel.com/23929/eng/Intel_Linux_NVMe_Driver_Reference_Guide_330602-002.pdf) still considered valid? It claims "You want to allow the SSD manage blocks and its activity between the NVM (non-volatile memory) and host with more advanced and consistent approaches in the SSD Controller" but it's not clear to me how the SSD controller can have a more advanced and consistent approach if it isn't ever notified when blocks are okay to throw away.
Not sure what the guide is talking about. I'll check with the author.
> Dec 1 19:08:52 verra kernel: ------------[ cut here ]------------
> Dec 1 19:08:52 verra kernel: kernel BUG at drivers/block/nvme-core.c:732!