linux-scsi.vger.kernel.org archive mirror
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>,
	linux-block@vger.kernel.org, Andrew Jones <drjones@redhat.com>,
	Bart Van Assche <bart.vanassche@wdc.com>,
	linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Christoph Hellwig <hch@lst.de>,
	"James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
	stable <stable@vger.kernel.org>,
	"jianchao . wang" <jianchao.w.wang@oracle.com>
Subject: Re: [PATCH V2] SCSI: fix queue cleanup race before queue initialization is done
Date: Wed, 21 Nov 2018 17:02:13 -0500
Message-ID: <20181121220213.GK26006@thunk.org>
In-Reply-To: <4e24ace9-c83f-5311-5419-18f4a0fb5148@kernel.dk>

On Wed, Nov 21, 2018 at 02:47:35PM -0700, Jens Axboe wrote:
> > Thanks applied, this bug was elusive but ever present in recent
> > testing that we did internally, it's been a huge pain in the butt.
> > The symptoms were usually a crash in blk_mq_get_driver_tag() with
> > hctx->tags == NULL, or a crash inside deadline request insert off
> > requeue.
> 
> I'm still hitting some weird crashes even with this applied, like
> this one:

FYI, there are a number of Ubuntu users running 4.19, 4.19.1, and
4.19.2 who have been reporting file system corruption problems.
They have a mix of configurations, but one thing which seems to be a
common factor is that they all have CONFIG_SCSI_MQ_DEFAULT enabled.
(Which happens not to be how I am running my laptop, and I've noticed
no problems.)

	https://bugzilla.kernel.org/show_bug.cgi?id=201685

One user in particular reported that 4.19 worked fine, and 4.19.1 had
fs corruptions (and there are no storage-related changes between 4.19
and 4.19.1) --- but the one difference between those two kernels was
that his 4.19 build had SCSI_MQ_DEFAULT disabled, and his 4.19.1 build
had SCSI_MQ_DEFAULT enabled.  This same user tried 4.19.3, and
after two hours of heavy I/O, he's not seen a repeat, and
interestingly, 4.19.3 has the backport mentioned on this thread.
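
For anyone comparing builds the way this user did, here is a minimal
sketch of how the setting could be read out of a saved kernel config.
The function name scsi_mq_default and the idea of feeding it the text
of a .config file are illustrative assumptions, not anything from the
bug report.  Note that this only reports the build-time default; the
scsi_mod.use_blk_mq boot parameter can override it at run time.

```python
def scsi_mq_default(config_text):
    """Report CONFIG_SCSI_MQ_DEFAULT from the text of a kernel .config.

    Returns True if the option is set to y, False if it is explicitly
    not set, and None if the option does not appear at all.
    (Hypothetical helper, written for illustration only.)
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == "CONFIG_SCSI_MQ_DEFAULT=y":
            return True
        if line == "# CONFIG_SCSI_MQ_DEFAULT is not set":
            return False
    return None


def scsi_mq_default_from_file(path):
    """Same check, reading the config from a path chosen by the caller."""
    with open(path) as f:
        return scsi_mq_default(f.read())
```

Running this against the config of each kernel under test would show
whether two builds differ in this option before drawing conclusions
from the version numbers alone.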

The weird thing is that it looked like the problem that was fixed by
this commit would only show up at queue setup and teardown time.  Is
that correct?  Is it possible that the bug fixed here would manifest
as data corruptions on disk?  Or would it only manifest as kernel
BUG_ON's and/or crashes?

One more thing.  I tried building a 4.20-rc2 based kernel with
CONFIG_SCSI_MQ_DEFAULT=y, and I tried running gce-xfstests (which uses
virtio-scsi) and I saw no failures.  So I don't have a clean repro of
Kernel Bugzilla #201685, and at the moment I'm too chicken to enable
CONFIG_SCSI_MQ_DEFAULT on my primary development laptop...

Any thoughts/suggestions appreciated.

						- Ted


Thread overview: 16+ messages
2018-11-14  8:25 [PATCH V2] SCSI: fix queue cleanup race before queue initialization is done Ming Lei
2018-11-14 15:02 ` Bart Van Assche
2018-11-15  0:48   ` Ming Lei
2018-11-14 15:20 ` Jens Axboe
2018-11-15  1:02   ` Ming Lei
2018-11-21 21:47   ` Jens Axboe
2018-11-21 22:02     ` Theodore Y. Ts'o [this message]
2018-11-22  3:43       ` Theodore Y. Ts'o
2018-11-22  1:00     ` Ming Lei
2018-11-22  1:42       ` Jens Axboe
2018-11-22  2:00         ` Ming Lei
2018-11-22  2:14           ` Jens Axboe
2018-11-22  2:47             ` Ming Lei
2019-03-29 20:21         ` James Smart
2019-03-29 23:22           ` Ming Lei
2019-03-31  3:11           ` Ming Lei
