From: keith.busch@intel.com (Keith Busch)
Subject: NVMe driver with kernel panic
Date: Mon, 21 Aug 2017 16:04:37 -0400
Message-ID: <20170821200436.GF21397@localhost.localdomain>
In-Reply-To: <CADcj3=5oXJzsBO7eNPNbGM+W9qKwbVBmmokY9h7-qP8butZLTg@mail.gmail.com>
On Mon, Aug 21, 2017 at 03:23:09PM -0400, Felipe Arturo Polanco wrote:
> Hello,
>
> We have been having kernel panics in our servers while using NVMe disks.
> Our setup consist of two Intel P4500 in Software Raid1 with mdadm.
> We are running KVM on top of them.
>
> The message we see in ring buffer is the following:
>
> [531622.412922] ------------[ cut here ]------------
> [531622.413254] kernel BUG at drivers/nvme/host/pci.c:467!
> [531622.413468] invalid opcode: 0000 [#1] SMP
>
> Online we found a workaround to avoid using the explicit BUG_ON() and
> instead we got that changed to WARN_ONCE() to not crash the server but
> we are not entirely sure if this is a fix at all as it may cause other
> issues.
Hi,
The WARN isn't really a work-around to the BUG, but it should make it
easier to determine what's broken. You'll get IO errors instead of a
kernel panic.
> We were told by a developer that this issue is caused by wrong block
> size being reported by the hardware, 4KB expected and got 512 bytes
> instead.
This should mean that the driver got a scatter list that isn't usable
under the queue constraints it registered with for PRP alignment. It's a
memory alignment problem rather than a block size problem.
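To illustrate the constraint, here is a rough sketch, not the driver's
actual code. It assumes a 4096-byte controller page and models a scatter
list as (address, length) pairs; the name prp_compatible is made up for
this example. With PRPs, every segment except the first has to start on a
page boundary, and every segment except the last has to end on one:

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # assumption: 4 KiB controller memory page size


def prp_compatible(segments: List[Tuple[int, int]],
                   page_size: int = PAGE_SIZE) -> bool:
    """Return True if a scatter list could be described with PRP entries.

    segments is a list of (dma_address, length_in_bytes) pairs.
    This is a hypothetical helper for illustration only.
    """
    mask = page_size - 1
    last = len(segments) - 1
    for i, (addr, length) in enumerate(segments):
        if length <= 0:
            return False
        # Only the first segment may start at an offset inside a page.
        if i != 0 and (addr & mask):
            return False
        # Only the last segment may end before a page boundary.
        if i != last and ((addr + length) & mask):
            return False
    return True


# Two 512-byte buffers from different pages merged into one request:
# the first segment ends mid-page, so the list is not usable.
merged_small = [(0x1000, 512), (0x5000, 512)]

# Two full, page-aligned pages are fine.
aligned_pages = [(0x1000, 4096), (0x5000, 4096)]
```

In this model merged_small is rejected even though each buffer is a valid
512-byte block on its own, which is why this looks like a block size
problem from the outside.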
> Has anyone seen this before or has applied a patch that fixed this?
>
> We are running VzLinux7 based on RHEL 7.3, kernel 3.10.0-514.26.1.vz7.33.22
The stacking drivers like MD RAID may have been able to submit incorrectly
merged IO in that release. Do you know if this is successful in RHEL 7.4? I
think all the issues with merging were fixed there.