From: Douglas Gilbert <dgilbert@interlog.com>
To: SCSI development list <linux-scsi@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
"Elliott, Robert (Server Storage)" <elliott@hp.com>
Subject: Re: lk 3.17-rc4 blk_mq large write problems
Date: Mon, 22 Sep 2014 19:14:41 -0400
Message-ID: <5420AD61.4030600@interlog.com>
In-Reply-To: <540FCB96.8000606@interlog.com>

On 14-09-09 11:55 PM, Douglas Gilbert wrote:
> A few days ago I was trying to create a large file
> (say 16 GB) of zeros on an ext4 file system:
> dd if=/dev/zero bs=64k count=256k of=zero_16g.bin
>
> After about 5 seconds there was a NULL de-reference that
> crashed the machine (shown below). This was with a clean
> version of lk 3.17-rc4 (from kernel.org) where the target
> was a SATA SSD directly connected to an LSI 9300-4i SAS-3
> HBA (mpt3sas). Significantly (IMO) the kernel boot line
> contained:
> scsi_mod.use_blk_mq=Y
>
> In all cases changing that to "N" fixed the problem. I tried
> many things, including a SAS SSD but the problem persisted when
> use_blk_mq=Y. It doesn't always oops as shown in the first
> case below. There were also:
> - immediate reboots
> - lock-ups without any oops on the console
> - different oopses of a somewhat stranger nature
> (hard to catch as logging everything on a real
> serial port is fiddly) like double bus errors
>
> Rob Elliott has been unable to replicate this problem.
>
> Today I switched to another machine running Debian 7 (the
> first machine was Ubuntu 14.04 based); both x86_64.
> Built the same kernel on the second machine, this time
> with an LSI 9212-4i4e SAS-2 HBA (mpt2sas) and a SAS SSD
> directly connected. Roughly speaking it was the same
> test case:
> # <create 1 partition on say /dev/sdb>
> # mkfs.ext4 /dev/sdb1
> # mount /dev/sdb1 /mnt/spare
> # cd /mnt/spare
> # dd if=/dev/zero bs=64k count=256k of=zero_16g.bin
> # cd
> # umount /mnt/spare
>
> Usually the dd or the umount would crash. Then after a
> crash, following a power cycle, the mount would crash.
> Changing to scsi_mod.use_blk_mq=N restored sanity.
>
> Tried some other SAS controllers: couldn't get an MR-9240-4i
> (MegaRaid) to work at all on my newer box (doesn't like
> PCIe 3?). Got an ARC-1882I working and it did not have
> problems with the big dd (perhaps the arcmsr driver still
> uses the host_lock to serialize commands).
>
> So it could be common, bad code in the mpt2sas and mpt3sas
> drivers. Or it could be somewhere else. Perhaps there is
> more than one problem.
>
> Testers out there are encouraged to run the above test case.
> The SATA and SAS SSDs that I used can consume writes in the
> 300 to 600 MB/sec range.
>
> Part of the strangeness of this first attached oops is that
> blk_mq_timeout_check() appears twice. The second one (typically
> from the umount) is a blown stack.

Using the block/for-linus tree that I built today, the
freeze-during-boot-up problem has gone away, as reported
earlier.

That allows me to retest the problem reported in this
thread with the same disk (INTEL SSDSA2M080) and the
same configuration. I just ran four cycles of the test
sequence shown above, plus a shutdown. No problems seen.

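For anyone who wants to repeat the retest, here is a minimal sketch of
that cycle as a script. It is not the exact procedure used above: PART,
MNT, CYCLES and DRY_RUN are hypothetical knobs, the dd writes to a full
path instead of doing a cd into the mount point, and the sysfs path for
use_blk_mq is an assumption that is simply skipped if it is absent.
By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the mkfs/mount/dd/umount cycle from the quoted report.
# PART, MNT, CYCLES and DRY_RUN are illustrative names, not from the thread.
PART="${PART:-/dev/sdb1}"    # partition under test; mkfs DESTROYS its contents
MNT="${MNT:-/mnt/spare}"     # mount point, must already exist
CYCLES="${CYCLES:-4}"        # number of test cycles
DRY_RUN="${DRY_RUN:-1}"      # 1: print commands only, 0: run them (as root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# One pass of the test sequence; stops at the first failing step.
one_cycle() {
    run mkfs.ext4 "$PART" &&
    run mount "$PART" "$MNT" &&
    run dd if=/dev/zero bs=64k count=256k of="$MNT/zero_16g.bin" &&
    run umount "$MNT"
}

# Assumed sysfs location of the scsi_mod parameter; skipped if not readable.
p=/sys/module/scsi_mod/parameters/use_blk_mq
if [ -r "$p" ]; then
    echo "use_blk_mq=$(cat "$p")"
fi

i=1
while [ "$i" -le "$CYCLES" ]; do
    echo "cycle $i of $CYCLES"
    one_cycle || { echo "cycle $i failed" >&2; exit 1; }
    i=$((i + 1))
done
```

Running it with DRY_RUN=0 as root performs the real writes. A hard crash
or lock-up will of course not be reported by the script itself, so a
serial console is still needed to catch any oops.
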
Doug Gilbert
Thread overview: 13+ messages
2014-09-10 3:55 lk 3.17-rc4 blk_mq large write problems Douglas Gilbert
2014-09-10 15:41 ` Christoph Hellwig
2014-09-10 16:47 ` Jens Axboe
2014-09-10 18:09 ` Christoph Hellwig
2014-09-10 18:26 ` Jens Axboe
2014-09-10 18:40 ` Christoph Hellwig
2014-09-10 18:42 ` Jens Axboe
2014-09-11 0:58 ` Douglas Gilbert
2014-09-11 2:00 ` Jens Axboe
2014-09-11 3:48 ` Elliott, Robert (Server Storage)
2014-09-11 5:37 ` Douglas Gilbert
2014-09-17 17:04 ` Christoph Hellwig
2014-09-22 23:14 ` Douglas Gilbert [this message]