* Bug#604049: linux-image-2.6.32-5-amd64: data corruption with promise stex driver and use of device-mapper layers (lvm/dm-crypt/..)
[not found] <20101119192614.1895.78157.reportbug@niassan.19.ros.03046.com>
@ 2010-11-20 5:33 ` Ben Hutchings
2010-11-22 18:33 ` Ed Lin - PTU
From: Ben Hutchings @ 2010-11-20 5:33 UTC (permalink / raw)
To: Ed Lin - PTU, Jens Axboe, dm-devel; +Cc: Markus Schulz, 604049
On Fri, 2010-11-19 at 20:26 +0100, Markus Schulz wrote:
> Package: linux-2.6
> Version: 2.6.32-27
> Severity: critical
> Tags: d-i upstream
> Justification: causes serious data loss
>
> any use of the stex.ko promise hw-raid controller driver with a
> device-mapper layer produces data corruption (or filesystem corruption
> like you can see in my dmesg).
[...]
> i've asked Ed Lin (Maintainer of stex.c from promise) on lkml and got the following answer:
>
> > We found similar problem during test.
>
> > The stex driver sets sg_tablesize as 32 (for st_yel it's 38) in the probe
> > entry. It seems that this value was overridden by the system if using
> > dm/lvm, for unknown reason. The driver received requests with more
> > sg items than registered. Sg item number could be as high as 64. This
> > is completely unexpected. The firmware could not handle such
> > requests, and error occurred.
[..]
I have little idea how this stuff is supposed to work, but it looks like
dm_dispatch_request() calls blk_insert_cloned_request() which calls
blk_rq_check_limits() which checks the request against the maximum
number of segments initialised from sg_tablesize.
We can perhaps mitigate the data loss by checking the number of segments
again in scsi_dispatch_cmd(), but it won't really solve the problem.
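
For illustration only, the kind of re-check suggested above can be modelled in plain userspace C. This is a sketch under stated assumptions, not kernel code and not a proposed patch: the names model_limits, model_request and model_check_segments are hypothetical, and the figures 32 and 64 are taken from Ed Lin's description quoted earlier.

```c
/*
 * Userspace model of a "too many segments" check.
 * Nothing here is a real kernel structure or function; all names
 * are hypothetical and exist only to illustrate the idea.
 */
#include <assert.h>
#include <errno.h>

struct model_limits {
    unsigned short max_segments;   /* limit the driver registered, e.g. 32 */
};

struct model_request {
    unsigned short nr_segments;    /* sg entries the request would need */
};

/* Return 0 if the request fits the limit, -EIO if it has too many segments. */
int model_check_segments(const struct model_limits *lim,
                         const struct model_request *rq)
{
    if (rq->nr_segments > lim->max_segments)
        return -EIO;
    return 0;
}
```

With a limit of 32, a request carrying 64 segments would be rejected by this model instead of being handed on. As noted above, a check like this only limits the damage; it does not explain why the oversized request was built in the first place.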
Ben.
--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
* Bug#604049: linux-image-2.6.32-5-amd64: data corruption with promise stex driver and use of device-mapper layers (lvm/dm-crypt/..)
From: Ed Lin - PTU @ 2010-11-22 18:33 UTC (permalink / raw)
To: Ben Hutchings, Jens Axboe, dm-devel; +Cc: Markus Schulz, 604049
>-----Original Message-----
>From: Ben Hutchings [mailto:ben@decadent.org.uk]
>Sent: November 19, 2010 21:34
>To: Ed Lin - PTU; Jens Axboe; dm-devel@redhat.com
>Cc: Markus Schulz; 604049@bugs.debian.org
>Subject: Re: Bug#604049: linux-image-2.6.32-5-amd64: data
>corruption with promise stex driver and use of device-mapper
>layers (lvm/dm-crypt/..)
>
>
>On Fri, 2010-11-19 at 20:26 +0100, Markus Schulz wrote:
>> Package: linux-2.6
>> Version: 2.6.32-27
>> Severity: critical
>> Tags: d-i upstream
>> Justification: causes serious data loss
>>
>> any use of the stex.ko promise hw-raid controller driver with a
>> device-mapper layer produces data corruption (or filesystem corruption
>> like you can see in my dmesg).
>[...]
>> i've asked Ed Lin (Maintainer of stex.c from promise) on lkml and got the following answer:
>>
>> > We found similar problem during test.
>>
>> > The stex driver sets sg_tablesize as 32 (for st_yel it's 38) in the probe
>> > entry. It seems that this value was overridden by the system if using
>> > dm/lvm, for unknown reason. The driver received requests with more
>> > sg items than registered. Sg item number could be as high as 64. This
>> > is completely unexpected. The firmware could not handle such
>> > requests, and error occurred.
>[..]
>
>I have little idea how this stuff is supposed to work, but it looks like
>dm_dispatch_request() calls blk_insert_cloned_request() which calls
>blk_rq_check_limits() which checks the request against the maximum
>number of segments initialised from sg_tablesize.
>
>We can perhaps mitigate the data loss by checking the number of segments
>again in scsi_dispatch_cmd(), but it won't really solve the problem.
>
I believe the cause of the problem has been found. An explanation
and a possible solution were posted to the linux-scsi mailing list
and dm-devel@redhat.com.
The link is http://marc.info/?l=linux-scsi&m=129021716922966&w=2
So far no one has responded. Any comments are welcome.
Thanks,
Ed Lin