* [Bug 112241] New: Under heavy load FC TARGET going to Oops
From: bugzilla-daemon @ 2016-02-10 7:32 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=112241
Bug ID: 112241
Summary: Under heavy load FC TARGET going to Oops
Product: SCSI Drivers
Version: 2.5
Kernel Version: 4.3.3
Hardware: Intel
OS: Linux
Tree: Fedora
Status: NEW
Severity: high
Priority: P1
Component: QLOGIC QLA2XXX
Assignee: scsi_drivers-qla2xxx@kernel-bugs.osdl.org
Reporter: anthony.bloodoff@gmail.com
Regression: No
Created attachment 203261
--> https://bugzilla.kernel.org/attachment.cgi?id=203261&action=edit
Kernel stacktrace
The storage server runs Fedora Linux with a QLogic Corp. ISP2532-based 8Gb
Fibre Channel to PCI Express HBA. It exports an Adaptec RAID6 array, with
bcache on an Intel SSD, to VMware 5.
Under heavy load (for example, migrating a VM off the storage) the system
oopses.
The stack trace is attached.
--
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 112241] Under heavy load FC TARGET going to Oops
From: bugzilla-daemon @ 2016-02-29 2:24 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=112241
--- Comment #1 from Anthony <anthony.bloodoff@gmail.com> ---
Created attachment 206351
--> https://bugzilla.kernel.org/attachment.cgi?id=206351&action=edit
Screenshot with call trace for kernel 4.5.0
* [Bug 112241] Under heavy load FC TARGET going to Oops
From: bugzilla-daemon @ 2016-02-29 2:26 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=112241
Anthony <anthony.bloodoff@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Kernel Version|4.3.3 |4.5.0
--- Comment #2 from Anthony <anthony.bloodoff@gmail.com> ---
With kernel 4.5.0 on the target, the system hangs after clients connect to the target.
* Re: [Bug 112241] Under heavy load FC TARGET going to Oops
From: Nicholas A. Bellinger @ 2016-03-01 5:16 UTC
To: bugzilla-daemon; +Cc: linux-scsi, target-devel
Hi Anthony,
On Mon, 2016-02-29 at 02:26 +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=112241
>
> Anthony <anthony.bloodoff@gmail.com> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> Kernel Version|4.3.3 |4.5.0
>
> --- Comment #2 from Anthony <anthony.bloodoff@gmail.com> ---
> With kernel 4.5.0 on the target, the system hangs after clients connect to the target.
>
So there are two things going on here.
First, the BUG_ON your ESX <-> LIO FC setup triggered has been addressed
recently in v4.5-rc4 and later kernels with the following series:
http://www.spinics.net/lists/target-devel/msg11822.html
Note these patches will be making it back to earlier stable kernels over
the next few weeks.
However, this specific bug is the end result of a larger ESX v5.5u2+
host-side issue, where AtomicTestAndSet (ATS) heartbeat is enabled (by
default) for all VMFS5 mounts:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2113956
Other folks have been hitting this recently; here's some extra
background:
http://permalink.gmane.org/gmane.linux.scsi.target.devel/11574
http://www.spinics.net/lists/target-devel/msg12124.html
Note this affects all targets w/ VAAI ATS (including EMC, IBM, 3PAR,
SolidFire, etc.) and the current solution for ESX v5.5u2+ is to either:
- Explicitly disable ATS heartbeat usage on all VMFS5 mounts as
described in the VMware KB article, or:
- Explicitly disable all ATS logic completely in LIO using
emulate_caw=0 on all backends connected to ESX v5.5u2+ hosts
with VMFS5.
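For the second option, a minimal sketch of how emulate_caw could be
cleared through configfs. It assumes the usual LIO layout
/sys/kernel/config/target/core/<hba>/<device>/attrib/emulate_caw; the
function name disable_emulate_caw and the dry_run flag are hypothetical
helpers for illustration, not part of any tool mentioned in this thread.
It touches every backend it finds, so restricting it to the backends
actually exported to ESX v5.5u2+ hosts is left to the reader.

```python
from pathlib import Path

# Assumed configfs mount point for LIO backstores.
CONFIGFS_CORE = Path("/sys/kernel/config/target/core")


def disable_emulate_caw(core_root=CONFIGFS_CORE, dry_run=True):
    """Set attrib/emulate_caw to 0 for every backend under core_root.

    Returns the attribute files that were changed or, in dry-run
    mode, the ones that would be changed.
    """
    changed = []
    # Assumed layout: <core_root>/<hba>/<device>/attrib/emulate_caw
    for attr in sorted(Path(core_root).glob("*/*/attrib/emulate_caw")):
        if attr.read_text().strip() == "0":
            continue  # ATS emulation is already off for this backend
        if not dry_run:
            attr.write_text("0\n")
        changed.append(attr)
    return changed


if __name__ == "__main__":
    # Dry run by default: only report what would be modified.
    for path in disable_emulate_caw(dry_run=True):
        print("would disable:", path)
```

Call it with dry_run=False (as root) to apply the change. A configfs
write by itself does not survive a reboot, so the setting also has to be
saved with whatever tool manages the target configuration.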
You can google for 'esx ats heartbeat bug' to see the gory details.
Thanks for reporting!
--nab
* [Bug 112241] Under heavy load FC TARGET going to Oops
From: bugzilla-daemon @ 2016-03-01 5:16 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=112241
--- Comment #3 from nab <nab@linux-iscsi.org> ---