All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Ethan Wilson <ethan.wilson@shiftmail.org>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Bitmap does not account for write errors prior to kicking?
Date: Thu, 16 Jan 2014 09:23:28 +1100	[thread overview]
Message-ID: <20140116092328.72c28219@notabene.brown> (raw)
In-Reply-To: <52D6A9DD.7030007@shiftmail.org>

[-- Attachment #1: Type: text/plain, Size: 12260 bytes --]

On Wed, 15 Jan 2014 16:31:41 +0100 Ethan Wilson <ethan.wilson@shiftmail.org>
wrote:

> Hello list,
> I have spotted a problem but I don't know if it's related to MD, libata, 
> SCSI, or the drive firmware. MD seems a bit more likely.

Indeed you have spotted a problem - thanks!

We avoid clearing the bit if any device was failed or missing when the
request started, but if the write fails to any device we don't note the fact
in a way that will stop the bit from being cleared.

This patch should fix it.

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index cc055da02e2a..287892477365 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2110,6 +2110,7 @@ static void raid5_end_write_request(struct bio *bi, int error)
 			set_bit(R5_MadeGoodRepl, &sh->dev[i].flags);
 	} else {
 		if (!uptodate) {
+			set_bit(STRIPE_DEGRADED, &sh->state);
 			set_bit(WriteErrorSeen, &rdev->flags);
 			set_bit(R5_WriteError, &sh->dev[i].flags);
 			if (!test_and_set_bit(WantReplacement, &rdev->flags))

I'll send up upstream in due course.

Thanks for your very thorough report!

NeilBrown


> 
> (kernel is 3.2.0-31-generic #50-Ubuntu SMP)
> 
> I have a raid5 array with a bitmap, like this
> 
> md1 : active raid5 sdd1[3] sdf1[5] sdc1[2] sdb1[1] sda1[0]
>        3867340800 blocks super 1.2 level 5, 1024k chunk, algorithm 2 
> [5/5] [UUUUU]
>        bitmap: 5/58 pages [20KB], 8192KB chunk
> 
> sdd1 failed a few days ago with an IDNF error logged in dmesg (see 
> bottom of email), which was repeated 8 times and eventually, after about 
> 90 seconds, triggered kicking of the drive out of the array.
> 
> After some SMART tests, the sdd disk appeared good to me, so I re-added 
> it to the array.
> There was a brief resync due to the bitmap, then all seemed good.
> 
> However, running a check afterwards, I found 464 mismatches between 
> sectors 1206556120 and 1207194368  (numbers as coming out of sync_completed)
> This is very near to the position of the IDNF error. Also I am quite 
> confident that there were no mismatches before such episode.
> 
> Since the disk was kicked at the 8th IDNF error in dmesg and not before 
> (not at the 1st or 2nd error... why?), and MD didn't complain before the 
> 8th error, I am wondering if something went wrong, such as some bitmaps 
> bits might have cleared when in fact there were write errors pending 
> (e.g. cleared after write errors were logged in dmesg, but prior to 
> kicking the drive; such time frame was 88 seconds).
> 
> One more thing that puzzles me is that the position of the mismatches 
> found by check is not equal, in fact a bit below in term of LBA 
> position, to what I would expect from the IDNF, even if it's very near:
> Consider that partition 1 in sdd starts at 32768 sectors (16MB) from the 
> start of the disk, then there is an 1.2 MD superblock which is 1MB afair.
> The IDNF error is at sector 1207453816 according to dmesg, and this is 
> computed from the start of the disk, so I would have expected the 
> mismatch to be a value 17MB lower than that, to account for partition 
> offset and MD superblock, so it would be at sector 1207419000 , but in 
> fact the mismatches are found between 1206556120 and 1207194368 as I 
> said so they are at least 109MB above the reported IDNF error. 109MB are 
> still a small distance considering the disk is 1TB so I am pretty sure 
> such mismatches are related to the IDNF failure / drive kicking and not 
> purely random.
> 
> But maybe the 109MB distance is due to that I don't know how to read 
> those numbers.
> This is how I found the mismatches offset: i periodically dumped the 
> value of mismatch_cnt and that of sync_completed , while the array was 
> performing the check:
> the file is like:
> .....
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205239464 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1205797472 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 0,1206556120 / 1933670400
> 464,1206556120 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207194368 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> 464,1207849472 / 1933670400
> .....
> 
> so you see it jumps from 0 to 464 mismatches between 1206556120 and 
> 1207194368.
> 
> 
> Here is the relevant part of the dmesg:
> ---------------------------------------------------------------------------
> [7432327.406259] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 
> action 0x0
> [7432327.406333] ata4.00: irq_stat 0x40000008
> [7432327.406374] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432327.406423] ata4.00: cmd 61/00:70:50:61:f6/04:00:47:00:00/40 tag 14 
> ncq 524288 out
> [7432327.406425]          res 41/10:a0:50:61:f6/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432327.406563] ata4.00: status: { DRDY ERR }
> [7432327.406602] ata4.00: error: { IDNF }
> [7432327.408627] ata4.00: configured for UDMA/133
> [7432327.408779] ata4: EH complete
> [7432343.618770] ata4.00: exception Emask 0x0 SAct 0x40000000 SErr 0x0 
> action 0x0
> [7432343.618844] ata4.00: irq_stat 0x40000008
> [7432343.618881] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432343.618929] ata4.00: cmd 61/a8:f0:58:55:f7/02:00:47:00:00/40 tag 30 
> ncq 348160 out
> [7432343.618931]          res 41/10:48:58:55:f7/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432343.619071] ata4.00: status: { DRDY ERR }
> [7432343.619110] ata4.00: error: { IDNF }
> [7432343.621130] ata4.00: configured for UDMA/133
> [7432343.621149] ata4: EH complete
> [7432355.448978] ata4.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 
> action 0x0
> [7432355.449052] ata4.00: irq_stat 0x40000008
> [7432355.449094] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432355.449142] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23 
> ncq 524288 out
> [7432355.449144]          res 41/10:c0:78:48:f8/00:01:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432355.449283] ata4.00: status: { DRDY ERR }
> [7432355.449322] ata4.00: error: { IDNF }
> [7432355.451395] ata4.00: configured for UDMA/133
> [7432355.451552] ata4: EH complete
> [7432362.463288] ata4.00: exception Emask 0x0 SAct 0x7fffff81 SErr 0x0 
> action 0x0
> [7432362.463360] ata4.00: irq_stat 0x40000008
> [7432362.463400] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432362.463448] ata4.00: cmd 61/00:38:78:48:f8/04:00:47:00:00/40 tag 7 
> ncq 524288 out
> [7432362.463450]          res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432362.463589] ata4.00: status: { DRDY ERR }
> [7432362.463628] ata4.00: error: { IDNF }
> [7432362.466467] ata4.00: configured for UDMA/133
> [7432362.466589] ata4: EH complete
> [7432374.895106] ata4.00: exception Emask 0x0 SAct 0x1800000 SErr 0x0 
> action 0x0
> [7432374.895157] ata4.00: irq_stat 0x40000008
> [7432374.895197] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432374.895244] ata4.00: cmd 61/00:b8:78:48:f8/04:00:47:00:00/40 tag 23 
> ncq 524288 out
> [7432374.895246]          res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432374.895380] ata4.00: status: { DRDY ERR }
> [7432374.895416] ata4.00: error: { IDNF }
> [7432374.897471] ata4.00: configured for UDMA/133
> [7432374.897498] ata4: EH complete
> [7432381.899309] ata4.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
> [7432381.899357] ata4.00: irq_stat 0x40000008
> [7432381.899398] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432381.899446] ata4.00: cmd 61/00:08:78:48:f8/04:00:47:00:00/40 tag 1 
> ncq 524288 out
> [7432381.899448]          res 41/10:01:78:48:f8/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432381.899585] ata4.00: status: { DRDY ERR }
> [7432381.899624] ata4.00: error: { IDNF }
> [7432381.901567] ata4.00: configured for UDMA/133
> [7432381.901587] ata4: EH complete
> [7432388.901393] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
> [7432388.901442] ata4.00: irq_stat 0x40000008
> [7432388.901482] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432388.901526] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0 
> ncq 524288 out
> [7432388.901528]          res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432388.901660] ata4.00: status: { DRDY ERR }
> [7432388.901696] ata4.00: error: { IDNF }
> [7432388.903708] ata4.00: configured for UDMA/133
> [7432388.903728] ata4: EH complete
> [7432395.903549] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
> [7432395.903598] ata4.00: irq_stat 0x40000008
> [7432395.903634] ata4.00: failed command: WRITE FPDMA QUEUED
> [7432395.903682] ata4.00: cmd 61/00:00:78:48:f8/04:00:47:00:00/40 tag 0 
> ncq 524288 out
> [7432395.903684]          res 41/10:02:78:48:f8/00:00:47:00:00/40 Emask 
> 0x481 (invalid argument) <F>
> [7432395.905818] ata4.00: status: { DRDY ERR }
> [7432395.905859] ata4.00: error: { IDNF }
> [7432395.907981] ata4.00: configured for UDMA/133
> [7432395.908045] sd 3:0:0:0: [sdd]  Result: hostbyte=DID_OK 
> driverbyte=DRIVER_SENSE
> [7432395.908050] sd 3:0:0:0: [sdd]  Sense Key : Aborted Command 
> [current] [descriptor]
> [7432395.908057] Descriptor sense data with sense descriptors (in hex):
> [7432395.908061]         72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> [7432395.908074]         47 f8 48 78
> [7432395.908079] sd 3:0:0:0: [sdd]  Add. Sense: Recorded entity not found
> [7432395.908086] sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 47 f8 48 78 00 
> 04 00 00
> [7432395.908099] end_request: I/O error, dev sdd, sector 1207453816
> [7432395.908231] ata4: EH complete
> [7432402.906933] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [7432402.906982] ata4.00: irq_stat 0x40000001
> [7432402.907021] ata4.00: failed command: FLUSH CACHE EXT
> [7432402.907068] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> [7432402.907070]          res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask 
> 0x1 (device error)
> [7432402.907177] ata4.00: status: { DRDY ERR }
> [7432402.907216] ata4.00: error: { ABRT }
> [7432402.909334] ata4.00: configured for UDMA/133
> [7432402.909349] ata4: EH complete
> [7432409.907953] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [7432409.908002] ata4.00: irq_stat 0x40000001
> [7432409.908041] ata4.00: failed command: FLUSH CACHE EXT
> [7432409.908088] ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> [7432409.908090]          res 51/04:00:38:df:f7/00:00:00:00:00/a7 Emask 
> 0x1 (device error)
> [7432409.908202] ata4.00: status: { DRDY ERR }
> [7432409.908266] ata4.00: error: { ABRT }
> [7432409.910976] ata4.00: configured for UDMA/133
> [7432409.910998] ata4: EH complete
> [7432414.078061] md/raid:md1: Disk failure on sdd1, disabling device.
> [7432414.078062] md/raid:md1: Operation continuing on 4 devices.
> [7432414.096961] RAID conf printout:
> [7432414.096964]  --- level:5 rd:5 wd:4
> [7432414.096965]  disk 0, o:1, dev:sda1
> [7432414.096967]  disk 1, o:1, dev:sdb1
> [7432414.096968]  disk 2, o:1, dev:sdc1
> [7432414.096969]  disk 3, o:0, dev:sdd1
> [7432414.096970]  disk 4, o:1, dev:sdf1
> [7432414.097043] RAID conf printout:
> [7432414.097049]  --- level:5 rd:5 wd:4
> [7432414.097054]  disk 0, o:1, dev:sda1
> [7432414.097057]  disk 1, o:1, dev:sdb1
> [7432414.097060]  disk 2, o:1, dev:sdc1
> [7432414.097063]  disk 4, o:1, dev:sdf1
> ---------------------------------------------------------------------------
> 
> What do you think about this problem?
> 
> Thanks
> Ethan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

      parent reply	other threads:[~2014-01-15 22:23 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-15 15:31 Bitmap does not account for write errors prior to kicking? Ethan Wilson
2014-01-15 18:14 ` Chris Murphy
2014-01-15 18:35 ` Piergiorgio Sartor
2014-01-15 22:23 ` NeilBrown [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140116092328.72c28219@notabene.brown \
    --to=neilb@suse.de \
    --cc=ethan.wilson@shiftmail.org \
    --cc=linux-raid@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.