linux-block.vger.kernel.org archive mirror
* linux-next scsi-mq hang in suspend-resume
From: Tomi Sarvela @ 2017-07-12 14:51 UTC (permalink / raw)
  To: linux-block; +Cc: axboe

Hello there,

I've been running Intel GFX CI testing for the Linux DRM-Tip i915
driver, and a couple of weeks ago we took linux-next for a ride to see
what kind of integration problems might pop up when pulling 4.13-rc1.
Latest results can be seen at

https://intel-gfx-ci.01.org/CI/next-issues.html
https://intel-gfx-ci.01.org/CI/next-all.html

The purple blocks are hangs, starting from 20170628 (20170627 was 
untestable due to locking changes which were reverted). Traces were 
pointing to ext4 but bisecting between good 20170626 and bad 20170628 
pointed to:

commit 5c279bd9e40624f4ab6e688671026d6005b066fa
Date:   Fri Jun 16 10:27:55 2017 +0200

    scsi: default to scsi-mq
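
The narrowing step described above is an ordinary bisection. As a minimal sketch, assuming an ordered list of candidates (daily tags or commits) and a hypothetical `is_bad` callback that builds and tests one candidate, the first bad candidate can be located as follows. The commit labels and the callback are made up for illustration; this is not the tooling used in the report.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the first bad candidate.

    Assumes candidates[0] is known good, candidates[-1] is known bad,
    and that once a candidate is bad every later one is bad as well.
    """
    if len(candidates) < 2:
        raise ValueError("need a known-good and a known-bad endpoint")
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return candidates[bad]


if __name__ == "__main__":
    # Hypothetical commit labels between the good and bad linux-next days.
    commits = ["good-20170626", "c1", "c2", "c3", "c4", "bad-20170628"]
    broken_from = commits.index("c3")  # pretend c3 introduced the hang
    print(first_bad(commits, lambda c: commits.index(c) >= broken_from))
```

Each round halves the remaining range, so the number of build-and-test cycles grows only logarithmically with the number of commits between the two endpoints.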

Reproduction is 100% or close to it when running two i-g-t tests as a
testlist. I'm assuming that this combination generates the right amount
or pattern of activity on the device. The testlist consists of the
following lines:

igt@gem_exec_gttfill@basic
igt@gem_exec_suspend@basic-s3

The kernel option scsi_mod.use_blk_mq=0 hides the issue on the test
hosts. The option was rolled out to the test hosts and 20170712 was
re-tested, which is why today looks so much greener.
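
For readers who want to confirm which mode a host is actually running in, the sketch below reads the parameter back from sysfs. It assumes the parameter is exposed at /sys/module/scsi_mod/parameters/use_blk_mq and that a boolean module parameter reads back as Y or N. The function name and the default path are illustrative and not part of the report.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the scsi_mod parameter; adjust if your kernel differs.
DEFAULT_PARAM = "/sys/module/scsi_mod/parameters/use_blk_mq"


def scsi_mq_enabled(param_path: str = DEFAULT_PARAM) -> Optional[bool]:
    """Return True if scsi-mq is on, False if off, None if it cannot be determined."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # Parameter file missing or unreadable (module absent, other kernel version).
        return None
    if value in ("Y", "y", "1"):
        return True
    if value in ("N", "n", "0"):
        return False
    return None


if __name__ == "__main__":
    labels = {True: "scsi-mq enabled", False: "scsi-mq disabled", None: "unknown"}
    print(labels[scsi_mq_enabled()])
```

Returning None instead of raising keeps the check usable on hosts where the parameter does not exist, so a caller can tell "off" apart from "not applicable".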

More information including traces and reproduction instructions at
https://bugzilla.kernel.org/show_bug.cgi?id=196223

I can run patchsets through the farm, if needed. In addition, daily 
linux-next tags are automatically tested and results published.

Best regards,

Tomi Sarvela
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

* Re: linux-next scsi-mq hang in suspend-resume
From: Evangelos Foutras @ 2017-07-17 15:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

(Hopefully I got the In-Reply-To header right and won't mess up the thread.)

On 17/07/17 10:53, Christoph Hellwig wrote:
> I still haven't gotten hold of an i915 machine where I could
> run the actual test suite.

At the risk of posting an unproductive "me too" reply, I also got bit by
the dead disk on resume from S3 when Arch Linux enabled MQ by default in
the 4.12 kernel (CONFIG_SCSI_MQ_DEFAULT=y). The configuration change was
later reverted due to this issue.

For me the hang occurs pretty reliably (tested about 5-6 times) on an
Intel laptop and an AMD desktop, both with HDDs and ext4 on top of LUKS.
It feels as if the disk stops responding to commands. The machine itself
wakes up from sleep but even a simple `ls` will hang and do nothing.

> But I did some audit of the code, and it seems blk-mq is lacking
> support for the RQF_PM flag.  While I can't directly see how
> this would cause the hang you're seeing, it's at least easy to test.
>
> Can you apply the patch below and test with the use_blk_mq=0 parameter?

I think the patch needs to be tested with scsi_mod.use_blk_mq=1 (which I
will try to do and report back).

