* [RFC] A SCSI fault injection framework using SystemTap.
From: K.Tanaka @ 2008-01-15 3:04 UTC (permalink / raw)
To: linux-scsi; +Cc: linux-kernel, linux-raid, dm-devel
I would like to introduce a SCSI fault injection framework using SystemTap.
The kernel currently has a fault-injection framework and a faulty mode for md,
both of which can be used to test error handling. However, they can only
produce a fixed type of error stochastically. To simulate more realistic
SCSI disk faults, I have created a new, flexible fault injection
framework using SystemTap.
The new fault injection framework has the following features:
1) The framework is flexible: because it consists of SystemTap scripts, the
fault conditions can be changed easily without modifying the kernel.
For example, it can simulate device faults that result in SCSI command
timeouts, and media faults that can be corrected by writing data to the
failed sector.
2) The framework generates "pseudo" faults in the SCSI mid-layer.
It can be applied to any upper-layer application or driver that uses the SCSI mid-layer.
3) The framework rewrites the status code and sense data of a SCSI command and
passes them to the upper layer, so the real error handling routine of the upper
layer for I/O requests can be tested.
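As a rough illustration of feature 3), the sketch below builds the kind of
status code and sense data that an upper layer would see for a medium error.
It is written in Python rather than SystemTap and is not taken from the tool
itself; the function names are hypothetical. The constants and the
fixed-format sense layout follow the SCSI standards.

```python
import struct

# Values defined by the SCSI standards (SAM/SPC).
CHECK_CONDITION = 0x02             # SCSI status byte
SENSE_KEY_MEDIUM_ERROR = 0x03      # sense key
ASC_UNRECOVERED_READ_ERROR = 0x11  # additional sense code


def build_fixed_sense(sense_key, asc, ascq=0x00, lba=None):
    """Return 18 bytes of fixed-format sense data (response code 0x70)."""
    sense = bytearray(18)
    sense[0] = 0x70              # current error, fixed format
    sense[2] = sense_key & 0x0F  # sense key lives in the low nibble
    sense[7] = 10                # additional sense length: 18 - 8
    sense[12] = asc
    sense[13] = ascq
    if lba is not None:
        sense[0] |= 0x80         # VALID bit: INFORMATION field holds the LBA
        struct.pack_into(">I", sense, 3, lba)
    return bytes(sense)


def inject_medium_error(lba):
    """Hypothetical helper: the (status, sense) pair for a failed read."""
    sense = build_fixed_sense(
        SENSE_KEY_MEDIUM_ERROR, ASC_UNRECOVERED_READ_ERROR, lba=lba
    )
    return CHECK_CONDITION, sense
```

A caller would substitute this pair for the values returned by the real
command, for example `status, sense = inject_medium_error(0x1234)`.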
I have tested software RAID (md/dm-mirror) using this framework
and found some bugs.
e.g.
- The kernel thread for md RAID1 can deadlock when the error handler for
md RAID1 contends with write access to the md RAID1 array.
- dm-mirror's redundancy doesn't work. A read error from a disk belonging to
the array is passed directly to userspace, without reading from
the other mirror.
(It turns out that this is a known issue, but the patch has not been merged.
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-raid1-handle-read-failures.patch)
There are also some other bugs in the error handling routines in
multiple-fault situations. I will report the details of these bugs later.
The new framework has been tested on Fedora 8 (i386) running kernel 2.6.23.12.
I am currently cleaning up the tool set for release and plan to post it in the near future.
If you are interested, please take a look at it once it is posted.
If you have any comments, please let me know.
--
------------------------------------------------------------------------
Kenichi TANAKA | Open Source Software Platform Development Division
| Computers Software Operations Unit, NEC Corporation
| k-tanaka@ce.jp.nec.com
* Re: [RFC] A SCSI fault injection framework using SystemTap.
From: Matthew Wilcox @ 2008-01-15 3:31 UTC (permalink / raw)
To: K.Tanaka; +Cc: linux-raid, dm-devel, linux-kernel, linux-scsi
On Tue, Jan 15, 2008 at 12:04:09PM +0900, K.Tanaka wrote:
> I would like to introduce a SCSI fault injection framework using SystemTap.
>
> The kernel currently has a fault-injection framework and a faulty mode for md,
> both of which can be used to test error handling. However, they can only
> produce a fixed type of error stochastically. To simulate more realistic
> SCSI disk faults, I have created a new, flexible fault injection
> framework using SystemTap.
How does it compare to using scsi_debug, which I believe can do all of
the above and more?
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [RFC] A SCSI fault injection framework using SystemTap.
From: K.Tanaka @ 2008-01-15 9:54 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-scsi, linux-kernel, linux-raid, dm-devel
Matthew Wilcox wrote:
> On Tue, Jan 15, 2008 at 12:04:09PM +0900, K.Tanaka wrote:
>> I would like to introduce a SCSI fault injection framework using SystemTap.
>>
>> The kernel currently has a fault-injection framework and a faulty mode for md,
>> both of which can be used to test error handling. However, they can only
>> produce a fixed type of error stochastically. To simulate more realistic
>> SCSI disk faults, I have created a new, flexible fault injection
>> framework using SystemTap.
>
> How does it compare to using scsi_debug, which I believe can do all of
> the above and more?
>
Sorry for the insufficient explanation.
The new framework is intended to be used from a userspace testing tool
(such as a shell script). For ease of use, the framework lets the user
specify the inode number of a target file on the device into which faults are injected.
When the target file is accessed through the page cache, a fault is injected.
The user can also specify a logical block address as the target position
of a fault injection.
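To make the logical-block-address case concrete, here is a minimal sketch of
the matching step: deciding whether a given SCSI command touches the target
LBA. It is Python, not SystemTap, and does not come from the tool; the
function names are hypothetical, and it only handles READ(10)/WRITE(10),
whose CDB layout is defined by the SCSI block command set.

```python
import struct

READ_10 = 0x28   # READ(10) opcode
WRITE_10 = 0x2A  # WRITE(10) opcode


def cdb10_range(cdb):
    """Return (lba, blocks) for a READ(10)/WRITE(10) CDB, else None."""
    if len(cdb) < 10 or cdb[0] not in (READ_10, WRITE_10):
        return None
    (lba,) = struct.unpack_from(">I", cdb, 2)     # bytes 2-5, big-endian
    (blocks,) = struct.unpack_from(">H", cdb, 7)  # bytes 7-8, big-endian
    return lba, blocks


def should_inject(cdb, target_lba):
    """True if the command's block range covers target_lba."""
    rng = cdb10_range(cdb)
    if rng is None:
        return False
    lba, blocks = rng
    return lba <= target_lba < lba + blocks
```

For example, a READ(10) of 8 blocks starting at LBA 0x1000 matches a target of
0x1003 but not 0x1008. The inode case would additionally need a mapping from
file pages to block addresses, which this sketch does not attempt.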
--
---------------------------------------------------------
Kenichi TANAKA | Open Source Software Platform Development Division
| Computers Software Operations Unit, NEC Corporation
| k-tanaka@ce.jp.nec.com
* Re: [RFC] A SCSI fault injection framework using SystemTap.
From: Alasdair G Kergon @ 2008-01-15 11:17 UTC (permalink / raw)
To: device-mapper development; +Cc: linux-raid, linux-kernel, linux-scsi
On Tue, Jan 15, 2008 at 12:04:09PM +0900, K.Tanaka wrote:
> - dm-mirror's redundancy doesn't work. A read error from a disk belonging to
> the array is passed directly to userspace, without reading from
> the other mirror.
> (It turns out that this is a known issue, but the patch has not been merged.
> http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-raid1-handle-read-failures.patch)
It's in the queue for 2.6.25.
Alasdair
--
agk@redhat.com
* Re: [RFC] A SCSI fault injection framework using SystemTap.
From: K.Tanaka @ 2008-01-22 10:26 UTC (permalink / raw)
To: linux-scsi; +Cc: linux-raid, dm-devel
> The new framework has been tested on Fedora 8 (i386) running kernel 2.6.23.12.
> I am currently cleaning up the tool set for release and plan to post it in the near future.
It is ready now. The SCSI fault injection tool is available from the following site.
https://sourceforge.net/projects/scsifaultinjtst/
If you have any comments, please let me know.
Additionally, the deadlock problem has also been reproduced on md RAID10. I think
it has the same cause as the RAID1 deadlock reported earlier, because
raid10.c is based on raid1.c.
> e.g.
> - The kernel thread for md RAID1 can deadlock when the error handler for
> md RAID1 contends with write access to the md RAID1 array.
I've reproduced the deadlock on RAID10 using this tool with a small shell script that
automatically injects a fault repeatedly. So far, however, I have not come up with a
good idea for a patch to fix this problem.
--
------------------------------------------------------------------------
Kenichi TANAKA | Open Source Software Platform Development Division
| Computers Software Operations Unit, NEC Corporation
| k-tanaka@ce.jp.nec.com