From: Vladislav Bolkhovitin <vst@vlnb.net>
To: Tejun Heo <tj@kernel.org>
Cc: Bryan Mesich <bryan.mesich@ndsu.edu>,
scst-devel@lists.sourceforge.net,
Jens Axboe <jens.axboe@oracle.com>,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
dm-devel@redhat.com
Subject: RAID/block regression starting from 2.6.32, bisected
Date: Wed, 28 Jul 2010 22:16:19 +0400
Message-ID: <4C5073F3.1060406@vlnb.net>
In-Reply-To: <20100727220110.GF31152@atlantis.cc.ndsu.nodak.edu>
Hello,
With recent kernels we are seeing a problem in our setup, which uses the SCST BLOCKIO backend: some BIOs are finished, i.e. the finish callback is called for them, with error -EIO. This happens quite often, much more often than one would expect from actual IO errors. (The BLOCKIO backend just converts all incoming SCSI commands to the corresponding block requests.)
After some investigation, we figured out that, most likely, raid5.c::make_request() for some reason sometimes calls bio_endio() on bios that are not BIO_UPTODATE.
We bisected it to commit:
commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
Author: Tejun Heo <tj@kernel.org>
Date: Fri Jul 3 17:48:16 2009 +0900
block: use the same failfast bits for bio and request
bio and request use the same set of failfast bits. This patch makes
the following changes to simplify things.
* enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_*
bits coincide with __REQ_FAILFAST_* bits.
* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
but the matching is useless anyway. init_request_from_bio() is
responsible for setting FAILFAST bits on FS requests and non-FS
requests never use BIO_RW_AHEAD. Drop the code and comment from
blk_rq_bio_prep().
* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
simplify FAILFAST flags handling in init_request_from_bio().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
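To make the bit layout described in the commit message easier to follow, here is a minimal sketch in C. It is not the kernel code: every name carries a hypothetical SK_ prefix, and the numeric bit positions are assumptions chosen only so that the two sets of failfast bits coincide. It shows why a single mask is enough to carry the failfast bits from a bio to a request once the positions match.

```c
#include <assert.h>

/*
 * Illustrative sketch only: the SK_ names are hypothetical and the bit
 * positions are assumed, not copied from any kernel tree.
 */
enum sk_bio_rw_bits {
	SK_BIO_RW = 0,
	SK_BIO_RW_FAILFAST_DEV = 1,
	SK_BIO_RW_FAILFAST_TRANSPORT = 2,
	SK_BIO_RW_FAILFAST_DRIVER = 3,
	SK_BIO_RW_AHEAD = 4,
};

enum sk_req_bits {
	SK_REQ_RW = 0,
	SK_REQ_FAILFAST_DEV = 1,
	SK_REQ_FAILFAST_TRANSPORT = 2,
	SK_REQ_FAILFAST_DRIVER = 3,
};

/* OR of all failfast bits, as the third bullet of the commit describes. */
#define SK_REQ_FAILFAST_MASK \
	((1UL << SK_REQ_FAILFAST_DEV) | \
	 (1UL << SK_REQ_FAILFAST_TRANSPORT) | \
	 (1UL << SK_REQ_FAILFAST_DRIVER))

/* Because the positions coincide, one mask copies every failfast bit. */
static unsigned long sk_failfast_from_bio(unsigned long bio_rw)
{
	return bio_rw & SK_REQ_FAILFAST_MASK;
}

/* Sanity check that the two assumed layouts really line up. */
static void sk_check_layout(void)
{
	assert((int)SK_BIO_RW_FAILFAST_DEV == (int)SK_REQ_FAILFAST_DEV);
	assert((int)SK_BIO_RW_FAILFAST_TRANSPORT == (int)SK_REQ_FAILFAST_TRANSPORT);
	assert((int)SK_BIO_RW_FAILFAST_DRIVER == (int)SK_REQ_FAILFAST_DRIVER);
}
```

In this sketch, any constant that still encodes one of these bits by its old numeric position would silently alias a different flag after such a reordering. That is a general hazard of this kind of change, not a confirmed diagnosis of the problem reported here.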
After looking at it, I can't see how it could lead to the effect we are experiencing. Could anybody comment on this, please? Is it a known problem?
The error can only be reproduced when running RAID 5. The general layout is:
Disks --> RAID5 --> LVM --> BLOCKIO VDISK
The problem is easy to reproduce by forcing the RAID 5 array to re-sync its members, e.g. fail out one member, add it back into the array, and then generate some IO using dd. In fact, just writing to the partition table on the exported block device is usually enough to provoke the error.
You can find the complete thread on this topic at http://sourceforge.net/mailarchive/forum.php?thread_name=20100727220110.GF31152%40atlantis.cc.ndsu.nodak.edu&forum_name=scst-devel
If any additional information is needed we would be glad to provide it.
Thanks,
Vlad