* Bitmap does not account for write errors prior to kicking?
@ 2014-01-15 15:31 Ethan Wilson
2014-01-15 18:14 ` Chris Murphy
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Ethan Wilson @ 2014-01-15 15:31 UTC (permalink / raw)
To: linux-raid
Hello list,
I have spotted a problem but I don't know if it's related to MD, libata,
SCSI, or the drive firmware. MD seems a bit more likely.
(kernel is 3.2.0-31-generic #50-Ubuntu SMP)
I have a raid5 array with a bitmap, like this
md1 : active raid5 sdd1[3] sdf1[5] sdc1[2] sdb1[1] sda1[0]
3867340800 blocks super 1.2 level 5, 1024k chunk, algorithm 2
[5/5] [UUUUU]
bitmap: 5/58 pages [20KB], 8192KB chunk
sdd1 failed a few days ago with an IDNF error logged in dmesg (see
bottom of email), which was repeated 8 times and eventually, after about
90 seconds, triggered kicking of the drive out of the array.
After some SMART tests, the sdd disk appeared good to me, so I re-added
it to the array.
There was a brief resync due to the bitmap, then all seemed good.
However, running a check afterwards, I found 464 mismatches between
sectors 1206556120 and 1207194368 (numbers as coming out of sync_completed)
This is very near to the position of the IDNF error. Also I am quite
confident that there were no mismatches before such episode.
Since the disk was kicked at the 8th IDNF error in dmesg and not before
(not at the 1st or 2nd error... why?), and MD didn't complain before the
8th error, I am wondering if something went wrong, such as some bitmaps
bits might have cleared when in fact there were write errors pending
(e.g. cleared after write errors were logged in dmesg, but prior to
kicking the drive; such time frame was 88 seconds).
One more thing that puzzles me is that the position of the mismatches
found by check is not equal, in fact a bit below in term of LBA
position, to what I would expect from the IDNF, even if it's very near:
Consider that partition 1 in sdd starts at 32768 sectors (16MB) from the
start of the disk, then there is an 1.2 MD superblock which is 1MB afair.
The IDNF error is at sector 1207453816 according to dmesg, and this is
computed from the start of the disk, so I would have expected the
mismatch to be a value 17MB lower than that, to account for partition
offset and MD superblock, so it would be at sector 1207419000 , but in
fact the mismatches are found between 1206556120 and 1207194368 as I
said so they are at least 109MB above the reported IDNF error. 109MB are
still a small distance considering the disk is 1TB so I am pretty sure
such mismatches are related to the IDNF failure / drive kicking and not
purely random.
But maybe the 109MB distance is due to that I don't know how to read
those numbers.
This is how I found the mismatches offset: i periodically dumped the
value of mismatch_cnt and that of sync_completed , while the array was
performing the check:
the file is like:
.....
0,1205239464 / 1933670400
0,1205239464 / 1933670400
0,1205239464 / 1933670400
0,1205239464 / 1933670400
0,1205797472 / 1933670400
0,1205797472 / 1933670400
0,1205797472 / 1933670400
0,1205797472 / 1933670400
0,1206556120 / 1933670400
0,1206556120 / 1933670400
0,1206556120 / 1933670400
0,1206556120 / 1933670400
0,1206556120 / 1933670400
0,1206556120 / 1933670400
464,1206556120 / 1933670400
464,1207194368 / 1933670400
464,1207194368 / 1933670400
464,1207194368 / 1933670400
464,1207194368 / 1933670400
464,1207194368 / 1933670400
464,1207849472 / 1933670400
464,1207849472 / 1933670400
464,1207849472 / 1933670400
464,1207849472 / 1933670400
464,1207849472 / 1933670400
.....
so you see it jumps from 0 to 464 mismatches between 1206556120 and
1207194368.
Here is the relevant part of the dmesg:
---------------------------------------------------------------------------
[7432327.406259] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0
action 0x0
[7432327.406333] ata4.00: irq_stat 0x40000008
[7432327.406374] ata4.00: failed command: WRITE FPDMA QUEUED
[7432327.406423] ata4.00: cmd 61/00:70:50:61:f6/04:00:47:00:00/40 tag 14
ncq 524288 out
[7432327.406425] res 41/10:a0:50:61:f6/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432327.406563] ata4.00: status: { DRDY ERR }
[7432327.406602] ata4.00: error: { IDNF }
[7432327.408627] ata4.00: configured for UDMA/133
[7432327.408779] ata4: EH complete
[7432343.618770] ata4.00: exception Emask 0x0 SAct 0x40000000 SErr 0x0
action 0x0
[7432343.618844] ata4.00: irq_stat 0x40000008
[7432343.618881] ata4.00: failed command: WRITE FPDMA QUEUED
[7432343.618929] ata4.00: cmd 61/a8:f0:58:55:f7/02:00:47:00:00/40 tag 30
ncq 348160 out
[7432343.618931] res 41/10:48:58:55:f7/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432343.619071] ata4.00: status: { DRDY ERR }
[7432343.619110] ata4.00: error: { IDNF }
[7432343.621130] ata4.00: configured for UDMA/133
[7432343.621149] ata4: EH complete
[7432355.448978] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0
action 0x0
[7432355.449052] ata4.00: irq_stat 0x40000008
[7432355.449094] ata4.00: failed command: WRITE FPDMA QUEUED
[7432355.449142] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23
ncq 524288 out
[7432355.449144] res 41/10:c0:78:48:f8/00:01:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432355.449283] ata4.00: status: { DRDY ERR }
[7432355.449322] ata4.00: error: { IDNF }
[7432355.451395] ata4.00: configured for UDMA/133
[7432355.451552] ata4: EH complete
[7432362.463288] ata4.00: exception Emask 0x0 SAct 0x7fffff81 SErr 0x0
action 0x0
[7432362.463360] ata4.00: irq_stat 0x40000008
[7432362.463400] ata4.00: failed command: WRITE FPDMA QUEUED
[7432362.463448] ata4.00: cmd 61/00:38:78:48:f8/04:00:47:00:00/40 tag 7
ncq 524288 out
[7432362.463450] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432362.463589] ata4.00: status: { DRDY ERR }
[7432362.463628] ata4.00: error: { IDNF }
[7432362.466467] ata4.00: configured for UDMA/133
[7432362.466589] ata4: EH complete
[7432374.895106] ata4.00: exception Emask 0x0 SAct 0x1800000 SErr 0x0
action 0x0
[7432374.895157] ata4.00: irq_stat 0x40000008
[7432374.895197] ata4.00: failed command: WRITE FPDMA QUEUED
[7432374.895244] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23
ncq 524288 out
[7432374.895246] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432374.895380] ata4.00: status: { DRDY ERR }
[7432374.895416] ata4.00: error: { IDNF }
[7432374.897471] ata4.00: configured for UDMA/133
[7432374.897498] ata4: EH complete
[7432381.899309] ata4.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
[7432381.899357] ata4.00: irq_stat 0x40000008
[7432381.899398] ata4.00: failed command: WRITE FPDMA QUEUED
[7432381.899446] ata4.00: cmd 61/00:08:78:48:f8/04:00:47:00:00/40 tag 1
ncq 524288 out
[7432381.899448] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432381.899585] ata4.00: status: { DRDY ERR }
[7432381.899624] ata4.00: error: { IDNF }
[7432381.901567] ata4.00: configured for UDMA/133
[7432381.901587] ata4: EH complete
[7432388.901393] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[7432388.901442] ata4.00: irq_stat 0x40000008
[7432388.901482] ata4.00: failed command: WRITE FPDMA QUEUED
[7432388.901526] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0
ncq 524288 out
[7432388.901528] res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432388.901660] ata4.00: status: { DRDY ERR }
[7432388.901696] ata4.00: error: { IDNF }
[7432388.903708] ata4.00: configured for UDMA/133
[7432388.903728] ata4: EH complete
[7432395.903549] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[7432395.903598] ata4.00: irq_stat 0x40000008
[7432395.903634] ata4.00: failed command: WRITE FPDMA QUEUED
[7432395.903682] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0
ncq 524288 out
[7432395.903684] res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask
0x481 (invalid argument) <F>
[7432395.905818] ata4.00: status: { DRDY ERR }
[7432395.905859] ata4.00: error: { IDNF }
[7432395.907981] ata4.00: configured for UDMA/133
[7432395.908045] sd 3:0:0:0: [sdd] Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[7432395.908050] sd 3:0:0:0: [sdd] Sense Key : Aborted Command
[current] [descriptor]
[7432395.908057] Descriptor sense data with sense descriptors (in hex):
[7432395.908061] 72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[7432395.908074] 47 f8 48 78
[7432395.908079] sd 3:0:0:0: [sdd] Add. Sense: Recorded entity not found
[7432395.908086] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 47 f8 48 78 00
04 00 00
[7432395.908099] end_request: I/O error, dev sdd, sector 1207453816
[7432395.908231] ata4: EH complete
[7432402.906933] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[7432402.906982] ata4.00: irq_stat 0x40000001
[7432402.907021] ata4.00: failed command: FLUSH CACHE EXT
[7432402.907068] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[7432402.907070] res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask
0x1 (device error)
[7432402.907177] ata4.00: status: { DRDY ERR }
[7432402.907216] ata4.00: error: { ABRT }
[7432402.909334] ata4.00: configured for UDMA/133
[7432402.909349] ata4: EH complete
[7432409.907953] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[7432409.908002] ata4.00: irq_stat 0x40000001
[7432409.908041] ata4.00: failed command: FLUSH CACHE EXT
[7432409.908088] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[7432409.908090] res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask
0x1 (device error)
[7432409.908202] ata4.00: status: { DRDY ERR }
[7432409.908266] ata4.00: error: { ABRT }
[7432409.910976] ata4.00: configured for UDMA/133
[7432409.910998] ata4: EH complete
[7432414.078061] md/raid:md1: Disk failure on sdd1, disabling device.
[7432414.078062] md/raid:md1: Operation continuing on 4 devices.
[7432414.096961] RAID conf printout:
[7432414.096964] --- level:5 rd:5 wd:4
[7432414.096965] disk 0, o:1, dev:sda1
[7432414.096967] disk 1, o:1, dev:sdb1
[7432414.096968] disk 2, o:1, dev:sdc1
[7432414.096969] disk 3, o:0, dev:sdd1
[7432414.096970] disk 4, o:1, dev:sdf1
[7432414.097043] RAID conf printout:
[7432414.097049] --- level:5 rd:5 wd:4
[7432414.097054] disk 0, o:1, dev:sda1
[7432414.097057] disk 1, o:1, dev:sdb1
[7432414.097060] disk 2, o:1, dev:sdc1
[7432414.097063] disk 4, o:1, dev:sdf1
---------------------------------------------------------------------------
What do you think about this problem?
Thanks
Ethan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bitmap does not account for write errors prior to kicking?
2014-01-15 15:31 Bitmap does not account for write errors prior to kicking? Ethan Wilson
@ 2014-01-15 18:14 ` Chris Murphy
2014-01-15 18:35 ` Piergiorgio Sartor
2014-01-15 22:23 ` NeilBrown
2 siblings, 0 replies; 4+ messages in thread
From: Chris Murphy @ 2014-01-15 18:14 UTC (permalink / raw)
To: linux-raid
On Jan 15, 2014, at 8:31 AM, Ethan Wilson <ethan.wilson@shiftmail.org> wrote:
> Hello list,
> I have spotted a problem but I don't know if it's related to MD, libata, SCSI, or the drive firmware. MD seems a bit more likely.
In all cases I've found googling, IDNF is a drive problem. If reported to the manufacturer and it's in warranty they replace it, so I'd try that and if they don't report back with their explanation how this is normal behavior.
Chris Murphy
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bitmap does not account for write errors prior to kicking?
2014-01-15 15:31 Bitmap does not account for write errors prior to kicking? Ethan Wilson
2014-01-15 18:14 ` Chris Murphy
@ 2014-01-15 18:35 ` Piergiorgio Sartor
2014-01-15 22:23 ` NeilBrown
2 siblings, 0 replies; 4+ messages in thread
From: Piergiorgio Sartor @ 2014-01-15 18:35 UTC (permalink / raw)
To: Ethan Wilson; +Cc: linux-raid
On Wed, Jan 15, 2014 at 04:31:41PM +0100, Ethan Wilson wrote:
[...]
> What do you think about this problem?
Hi Ethan,
I noticed something similar, but I
could not really pin point it.
I've a test RAID-6 made of USB drives.
These drives, being USB, are prone to
leave the array.
So, sometimes, one drive is kicked out.
Having the bitmap active, is no problem
to re-add it back.
Checking the filesystem later, it happens,
sometimes, to find errors.
Now that you point out, it seems a similar
situation. As you wrote is seems the bitmap
is clear before the actual data is written.
Unfortunately, it is difficult (for me) to
reproduce or debug.
bye,
pg
> Thanks
> Ethan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
piergiorgio
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bitmap does not account for write errors prior to kicking?
2014-01-15 15:31 Bitmap does not account for write errors prior to kicking? Ethan Wilson
2014-01-15 18:14 ` Chris Murphy
2014-01-15 18:35 ` Piergiorgio Sartor
@ 2014-01-15 22:23 ` NeilBrown
2 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2014-01-15 22:23 UTC (permalink / raw)
To: Ethan Wilson; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 12260 bytes --]
On Wed, 15 Jan 2014 16:31:41 +0100 Ethan Wilson <ethan.wilson@shiftmail.org>
wrote:
> Hello list,
> I have spotted a problem but I don't know if it's related to MD, libata,
> SCSI, or the drive firmware. MD seems a bit more likely.
Indeed you have spotted a problem - thanks!
We avoid clearing the bit if any device was failed or missing when the
request started, but if the write fails to any device we don't note the fact
in a way that will stop the bit from being cleared.
This patch should fix it.
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index cc055da02e2a..287892477365 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2110,6 +2110,7 @@ static void raid5_end_write_request(struct bio *bi, int error)
set_bit(R5_MadeGoodRepl, &sh->dev[i].flags);
} else {
if (!uptodate) {
+ set_bit(STRIPE_DEGRADED, &sh->state);
set_bit(WriteErrorSeen, &rdev->flags);
set_bit(R5_WriteError, &sh->dev[i].flags);
if (!test_and_set_bit(WantReplacement, &rdev->flags))
I'll send up upstream in due course.
Thanks for your very thorough report!
NeilBrown
>
> (kernel is 3.2.0-31-generic #50-Ubuntu SMP)
>
> I have a raid5 array with a bitmap, like this
>
> md1 : active raid5 sdd1[3] sdf1[5] sdc1[2] sdb1[1] sda1[0]
> 3867340800 blocks super 1.2 level 5, 1024k chunk, algorithm 2
> [5/5] [UUUUU]
> bitmap: 5/58 pages [20KB], 8192KB chunk
>
> sdd1 failed a few days ago with an IDNF error logged in dmesg (see
> bottom of email), which was repeated 8 times and eventually, after about
> 90 seconds, triggered kicking of the drive out of the array.
>
> After some SMART tests, the sdd disk appeared good to me, so I re-added
> it to the array.
> There was a brief resync due to the bitmap, then all seemed good.
>
> However, running a check afterwards, I found 464 mismatches between
> sectors 1206556120 and 1207194368 (numbers as coming out of sync_completed)
> This is very near to the position of the IDNF error. Also I am quite
> confident that there were no mismatches before such episode.
>
> Since the disk was kicked at the 8th IDNF error in dmesg and not before
> (not at the 1st or 2nd error... why?), and MD didn't complain before the
> 8th error, I am wondering if something went wrong, such as some bitmaps
> bits might have cleared when in fact there were write errors pending
> (e.g. cleared after write errors were logged in dmesg, but prior to
> kicking the drive; such time frame was 88 seconds).
>
> One more thing that puzzles me is that the position of the mismatches
> found by check is not equal, in fact a bit below in term of LBA
> position, to what I would expect from the IDNF, even if it's very near:
> Consider that partition 1 in sdd starts at 32768 sectors (16MB) from the
> start of the disk, then there is an 1.2 MD superblock which is 1MB afair.
> The IDNF error is at sector 1207453816 according to dmesg, and this is
> computed from the start of the disk, so I would have expected the
> mismatch to be a value 17MB lower than that, to account for partition
> offset and MD superblock, so it would be at sector 1207419000 , but in
> fact the mismatches are found between 1206556120 and 1207194368 as I
> said so they are at least 109MB above the reported IDNF error. 109MB are
> still a small distance considering the disk is 1TB so I am pretty sure
> such mismatches are related to the IDNF failure / drive kicking and not
> purely random.
>
> But maybe the 109MB distance is due to that I don't know how to read
> those numbers.
> This is how I found the mismatches offset: i periodically dumped the
> value of mismatch_cnt and that of sync_completed , while the array was
> performing the check:
> the file is like:
> .....
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 464,1206556120 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> .....
>
> so you see it jumps from 0 to 464 mismatches between 1206556120 and
> 1207194368.
>
>
> Here is the relevant part of the dmesg:
> ---------------------------------------------------------------------------
> [7432327.406259] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0
> action 0x0
> [7432327.406333] ata4.00: irq_stat 0x40000008
> [7432327.406374] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432327.406423] ata4.00: cmd 61/00:70:50:61:f6/04:00:47:00:00/40 tag 14
> ncq 524288 out
> [7432327.406425] res 41/10:a0:50:61:f6/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432327.406563] ata4.00: status: { DRDY ERR }
> [7432327.406602] ata4.00: error: { IDNF }
> [7432327.408627] ata4.00: configured for UDMA/133
> [7432327.408779] ata4: EH complete
> [7432343.618770] ata4.00: exception Emask 0x0 SAct 0x40000000 SErr 0x0
> action 0x0
> [7432343.618844] ata4.00: irq_stat 0x40000008
> [7432343.618881] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432343.618929] ata4.00: cmd 61/a8:f0:58:55:f7/02:00:47:00:00/40 tag 30
> ncq 348160 out
> [7432343.618931] res 41/10:48:58:55:f7/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432343.619071] ata4.00: status: { DRDY ERR }
> [7432343.619110] ata4.00: error: { IDNF }
> [7432343.621130] ata4.00: configured for UDMA/133
> [7432343.621149] ata4: EH complete
> [7432355.448978] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0
> action 0x0
> [7432355.449052] ata4.00: irq_stat 0x40000008
> [7432355.449094] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432355.449142] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23
> ncq 524288 out
> [7432355.449144] res 41/10:c0:78:48:f8/00:01:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432355.449283] ata4.00: status: { DRDY ERR }
> [7432355.449322] ata4.00: error: { IDNF }
> [7432355.451395] ata4.00: configured for UDMA/133
> [7432355.451552] ata4: EH complete
> [7432362.463288] ata4.00: exception Emask 0x0 SAct 0x7fffff81 SErr 0x0
> action 0x0
> [7432362.463360] ata4.00: irq_stat 0x40000008
> [7432362.463400] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432362.463448] ata4.00: cmd 61/00:38:78:48:f8/04:00:47:00:00/40 tag 7
> ncq 524288 out
> [7432362.463450] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432362.463589] ata4.00: status: { DRDY ERR }
> [7432362.463628] ata4.00: error: { IDNF }
> [7432362.466467] ata4.00: configured for UDMA/133
> [7432362.466589] ata4: EH complete
> [7432374.895106] ata4.00: exception Emask 0x0 SAct 0x1800000 SErr 0x0
> action 0x0
> [7432374.895157] ata4.00: irq_stat 0x40000008
> [7432374.895197] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432374.895244] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23
> ncq 524288 out
> [7432374.895246] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432374.895380] ata4.00: status: { DRDY ERR }
> [7432374.895416] ata4.00: error: { IDNF }
> [7432374.897471] ata4.00: configured for UDMA/133
> [7432374.897498] ata4: EH complete
> [7432381.899309] ata4.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
> [7432381.899357] ata4.00: irq_stat 0x40000008
> [7432381.899398] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432381.899446] ata4.00: cmd 61/00:08:78:48:f8/04:00:47:00:00/40 tag 1
> ncq 524288 out
> [7432381.899448] res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432381.899585] ata4.00: status: { DRDY ERR }
> [7432381.899624] ata4.00: error: { IDNF }
> [7432381.901567] ata4.00: configured for UDMA/133
> [7432381.901587] ata4: EH complete
> [7432388.901393] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
> [7432388.901442] ata4.00: irq_stat 0x40000008
> [7432388.901482] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432388.901526] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0
> ncq 524288 out
> [7432388.901528] res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432388.901660] ata4.00: status: { DRDY ERR }
> [7432388.901696] ata4.00: error: { IDNF }
> [7432388.903708] ata4.00: configured for UDMA/133
> [7432388.903728] ata4: EH complete
> [7432395.903549] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
> [7432395.903598] ata4.00: irq_stat 0x40000008
> [7432395.903634] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432395.903682] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0
> ncq 524288 out
> [7432395.903684] res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask
> 0x481 (invalid argument) <F>
> [7432395.905818] ata4.00: status: { DRDY ERR }
> [7432395.905859] ata4.00: error: { IDNF }
> [7432395.907981] ata4.00: configured for UDMA/133
> [7432395.908045] sd 3:0:0:0: [sdd] Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [7432395.908050] sd 3:0:0:0: [sdd] Sense Key : Aborted Command
> [current] [descriptor]
> [7432395.908057] Descriptor sense data with sense descriptors (in hex):
> [7432395.908061] 72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> [7432395.908074] 47 f8 48 78
> [7432395.908079] sd 3:0:0:0: [sdd] Add. Sense: Recorded entity not found
> [7432395.908086] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 47 f8 48 78 00
> 04 00 00
> [7432395.908099] end_request: I/O error, dev sdd, sector 1207453816
> [7432395.908231] ata4: EH complete
> [7432402.906933] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [7432402.906982] ata4.00: irq_stat 0x40000001
> [7432402.907021] ata4.00: failed command: FLUSH CACHE EXT
> [7432402.907068] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> [7432402.907070] res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask
> 0x1 (device error)
> [7432402.907177] ata4.00: status: { DRDY ERR }
> [7432402.907216] ata4.00: error: { ABRT }
> [7432402.909334] ata4.00: configured for UDMA/133
> [7432402.909349] ata4: EH complete
> [7432409.907953] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [7432409.908002] ata4.00: irq_stat 0x40000001
> [7432409.908041] ata4.00: failed command: FLUSH CACHE EXT
> [7432409.908088] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> [7432409.908090] res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask
> 0x1 (device error)
> [7432409.908202] ata4.00: status: { DRDY ERR }
> [7432409.908266] ata4.00: error: { ABRT }
> [7432409.910976] ata4.00: configured for UDMA/133
> [7432409.910998] ata4: EH complete
> [7432414.078061] md/raid:md1: Disk failure on sdd1, disabling device.
> [7432414.078062] md/raid:md1: Operation continuing on 4 devices.
> [7432414.096961] RAID conf printout:
> [7432414.096964] --- level:5 rd:5 wd:4
> [7432414.096965] disk 0, o:1, dev:sda1
> [7432414.096967] disk 1, o:1, dev:sdb1
> [7432414.096968] disk 2, o:1, dev:sdc1
> [7432414.096969] disk 3, o:0, dev:sdd1
> [7432414.096970] disk 4, o:1, dev:sdf1
> [7432414.097043] RAID conf printout:
> [7432414.097049] --- level:5 rd:5 wd:4
> [7432414.097054] disk 0, o:1, dev:sda1
> [7432414.097057] disk 1, o:1, dev:sdb1
> [7432414.097060] disk 2, o:1, dev:sdc1
> [7432414.097063] disk 4, o:1, dev:sdf1
> ---------------------------------------------------------------------------
>
> What do you think about this problem?
>
> Thanks
> Ethan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-01-15 22:23 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-15 15:31 Bitmap does not account for write errors prior to kicking? Ethan Wilson
2014-01-15 18:14 ` Chris Murphy
2014-01-15 18:35 ` Piergiorgio Sartor
2014-01-15 22:23 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).