From: Ming Lei <ming.lei@redhat.com>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Jens Axboe <axboe@kernel.dk>,
    "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH] blk-mq: fix corruption with direct issue
Date: Fri, 7 Dec 2018 17:30:17 +0800
Message-ID: <20181207093016.GE29027@ming.t460p>
In-Reply-To: <20181207034437.GB22188@ming.t460p>

On Fri, Dec 07, 2018 at 11:44:39AM +0800, Ming Lei wrote:
> On Thu, Dec 06, 2018 at 09:46:42PM -0500, Theodore Y. Ts'o wrote:
> > On Wed, Dec 05, 2018 at 11:03:01AM +0800, Ming Lei wrote:
> > >
> > > But at that time there wasn't an io scheduler for MQ, so in theory the
> > > issue should have been there since v4.11, specifically since 945ffb60c11d
> > > ("mq-deadline: add blk-mq adaptation of the deadline IO scheduler").
> >
> > Hi Ming,
> >
> > How serious were you about this (theoretically) having been an
> > issue since 4.11? Can you talk about how it might get triggered, and
> > how we can test for it? The reason why I ask is because we're trying
> > to track down a mysterious file system corruption problem on a 4.14.x
> > stable kernel. The symptoms are *very* eerily similar to kernel
> > bugzilla #201685.
>
> Hi Theodore,
>
> It is just a theoretical analysis.
>
> blk_mq_try_issue_directly() is called in two branches of blk_mq_make_request(),
> both of which are taken only for real MQ disks.
>
> IO merge can be done with none or with a real io scheduler, so in theory the
> risk might exist since v4.1, but IO merge on the sw queue did not work for
> quite a long time, until it was fixed by ab42f35d9cb5ac49b5a2.
>
> As Jens mentioned in bugzilla, there are several conditions required
> for triggering the issue:
>
> - MQ device
>
> - queue busy can be triggered. It is hard to trigger on NVMe PCI,
> but may be possible on NVMe FC. However, it can be quite easy to
> trigger on SCSI devices. We know there are some MQ SCSI HBAs,
> e.g. qlogic FC and megaraid_sas.
>
> - IO merge is enabled.
>
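
The two static conditions above (MQ device, IO merge enabled) can be
checked from sysfs; whether the queue actually becomes busy depends on the
workload and cannot be read off this way. A rough sketch, assuming the
usual sysfs layout (an mq/ directory for blk-mq devices, queue/nomerges,
queue/scheduler). The function name and the optional sysfs-root argument
are made up for illustration:

```shell
# Hypothetical helper: report the static trigger conditions for one disk.
#   $1 = device name (e.g. sdb)
#   $2 = sysfs root, defaults to /sys (overridable so it can be dry-run)
check_direct_issue_conditions() {
    dev=$1
    sysfs=${2:-/sys}
    q="$sysfs/block/$dev"

    # blk-mq devices expose an mq/ directory with one entry per hw queue.
    if [ -d "$q/mq" ]; then
        echo "mq: yes"
    else
        echo "mq: no"
    fi

    # nomerges: 0 = all merges enabled, 1 = only simple one-hit merges,
    # 2 = no merging at all.
    nomerges=$(cat "$q/queue/nomerges" 2>/dev/null || echo unknown)
    echo "nomerges: $nomerges"

    # The active scheduler is the entry shown in [brackets].
    sched=$(cat "$q/queue/scheduler" 2>/dev/null || echo unknown)
    echo "scheduler: $sched"
}
```

Calling it as "check_direct_issue_conditions sdb" prints one line per
condition for that disk.
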
> I have set up scsi_debug in the following way:
>
> modprobe scsi_debug dev_size_mb=4096 clustering=1 \
> max_luns=1 submit_queues=2 max_queue=2
>
> - submit_queues=2 may set this disk as MQ
> - max_queue=2 may trigger the queue busy condition easily
>
> and ran some write IO on ext4 over the disk (fio, kernel building, ...) for
> some time, but still could not trigger the data corruption even once.
>
> I should have created more LUNs, so that the queue becomes busy more
> easily; I will do that soon.

Actually I should have used SDEBUG_OPT_HOST_BUSY to simulate the queue busy
condition.
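
A rough sketch of such a setup, assuming SDEBUG_OPT_HOST_BUSY corresponds
to bit 0x8000 of the scsi_debug opts parameter and that every_nth controls
how often the busy return is injected; both should be checked against
drivers/scsi/scsi_debug.c in the tree under test. The helper name and the
default interval are made up for illustration, and the helper only prints
the command rather than running it:

```shell
# Hypothetical dry-run helper: print a modprobe line with host-busy injection.
#   $1 = every_nth value (arbitrary illustrative default: 100)
scsi_debug_host_busy_cmd() {
    every_nth=${1:-100}
    # opts=0x8000 is ASSUMED to be SDEBUG_OPT_HOST_BUSY; verify in scsi_debug.c.
    # The remaining parameters are copied from the modprobe line quoted above.
    printf 'modprobe scsi_debug dev_size_mb=4096 clustering=1 max_luns=1 submit_queues=2 max_queue=2 opts=0x8000 every_nth=%s\n' "$every_nth"
}
```

Running "scsi_debug_host_busy_cmd 50" prints the command with every_nth=50
so it can be reviewed before being run as root.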

Thanks,
Ming