* Promise SATAII150 TX4 or raidreconf broken
       [not found] ` <4313E433.1080602@pobox.com>
@ 2005-09-10 8:23 ` Eyal Lebedinsky
       [not found]   ` <43223709.8090002@eyal.emu.id.au>
  1 sibling, 0 replies; 5+ messages in thread
From: Eyal Lebedinsky @ 2005-09-10 8:23 UTC
  To: linux-raid

I am putting together a four-disk array. I want to know if it can be
extended, so I am building a 3-disk array and then growing it to 4.

I spent the last week attempting to build and test the array and I
have a problem: the array is trashed by raidreconf.

I need to know if this is a hardware problem (TX4?), a raidreconf
problem or a kernel issue.

It is now becoming urgent for me to sort this out; any hints will be
appreciated. If this is a TX4 issue, which SATA controllers (4-way)
are known to be supported and good on Linux?

I have a test script that does this:
	build a 3-disk raid-5
	mkfs.ext3
	copy data in		200+GB
	fsck			OK
	raidreconf 3->4 disks
	fsck			failed

The disks are 320GB SATA "WDC WD3200JD-00K Rev: 08.0".
Kernel 2.6.13 vanilla.
Controller is Promise SATAII150 TX4.

The test takes about 16h to complete.

The rebuild messages:
====================
Sat Sep 10 01:19:07 EST 2005 mdbuild: checking the file system
====================================
/dev/md0: 4136/610560 files (2.7% non-contiguous), 55952642/156285568 blocks
Sat Sep 10 01:25:51 EST 2005 mdbuild: reconfiguring RAID
====================================
Parsing /etc/raidtab.old
Parsing /etc/raidtab.new
Old raid-disk 0 has 1220981 chunks, 312571136 blocks
Old raid-disk 1 has 1220981 chunks, 312571136 blocks
Old raid-disk 2 has 1220981 chunks, 312571136 blocks
New raid-disk 0 has 1220981 chunks, 312571136 blocks
New raid-disk 1 has 1220981 chunks, 312571136 blocks
New raid-disk 2 has 1220981 chunks, 312571136 blocks
New raid-disk 3 has 1220981 chunks, 312571136 blocks
Using 256 Kbyte blocks to move from 256 Kbyte chunks to 256 Kbyte chunks.
Detected 1035336 KB of physical memory in system
A maximum of 1181 outstanding requests is allowed
Working with device /dev/md0
Size of old array: 1875427344 blocks, Size of new array: 2500569792 blocks
---------------------------------------------------
I will grow your old device /dev/md0 of 2441962 blocks
to a new device /dev/md0 of 3662943 blocks
using a block-size of 256 KB
Is this what you want? (yes/no): yes
Converting 2441962 block device to 3662943 block device
Allocated free block map for 3 disks
4 unique disks detected.
Working (/) [02441962/02441962] [############################################]
Source drained, flushing sink.
Reconfiguration succeeded, will update superblocks...
Maximum friend-freeing depth: 8
Total wishes hooked: 2441962
Maximum wishes hooked: 1181
Total gifts hooked: 2441962
Maximum gifts hooked: 991
Congratulations, your array has been reconfigured,
and no errors seem to have occured.
Updating superblocks...
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/sda, 312571224kB, raid superblock at 312571136kB
disk 1: /dev/sdb, 312571224kB, raid superblock at 312571136kB
disk 2: /dev/sdc, 312571224kB, raid superblock at 312571136kB
disk 3: /dev/sdd, 312571224kB, raid superblock at 312571136kB
Array is updated with kernel.
Disks re-inserted in array... Hold on while starting the array...
Sat Sep 10 10:30:19 EST 2005 mdbuild: checking the file system
====================================
/dev/md0: Inode 129 is in use, but has dtime set. FIXED.
/dev/md0: Inode 129 has imagic flag set.

/dev/md0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
	(i.e., without -a or -p options)

Inspecting the fs shows real corruption. It does not even look like
full bad blocks but specific entries are bad. Some directories are
completely missing and I (naturally) get errors reading the fs
(mounted with errors).

/data3/mythtv/tv_grab_au:
========================
total 2968465682
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:56 08092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:56 09092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:59 10092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:57 11092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:58 12092005
?--xrws--T  31794 3359396242 982138100  1048034695 Oct  9  1972 13092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:59 14092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  9 04:52 15092005
srw-----w-  26765 936675348  473714967  2355711621 Oct  5  1976 16092005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:54 26082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:54 27082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:55 28082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:55 29082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 24 04:57 30082005
-rw-r--r--      1 mythtv     mythtv         362153 Nov  5  2004 guide.xml

The original has:
================
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:56 09092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:57 10092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:57 11092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:58 12092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep 10 04:58 13092005 <<<<<
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  8 04:59 14092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep  9 04:52 15092005
drwxr-sr-x      2 mythtv     mythtv           8192 Sep 10 05:00 16092005 <<<<<
drwxr-sr-x      2 mythtv     mythtv           8192 Sep 10 05:00 17092005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:54 26082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:54 27082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:55 28082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 26 04:55 29082005
drwxr-sr-x      2 mythtv     mythtv           8192 Aug 24 04:57 30082005
-rw-r--r--      1 mythtv     mythtv         362153 Nov  5  2004 guide.xml

-- 
Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
	attach .zip as .dat
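The sizes in the raidreconf log of the first message are at least internally consistent, which can be cross-checked with a little arithmetic. The sketch below is only an illustration: the variable names are made up, the 4 KB ext3 block size is an assumption, and the superblock placement rule (round the disk size down to 64 KB, then step back 64 KB) is my understanding of the 0.90 md superblock format, not something stated in the thread.

```python
# Figures copied from the raidreconf log; names are illustrative only.
CHUNK_KB = 256
CHUNKS_PER_DISK = 1220981
DISK_KB = 312571224      # raw size raidreconf reports for each disk
FS_BLOCK_KB = 4          # assumption: ext3 was created with 4 KB blocks
MD_RESERVED_KB = 64      # assumption: 0.90 superblock reserved area


def raid5_data_chunks(disks, chunks_per_disk):
    """RAID-5 spends one disk's worth of chunks on parity."""
    return (disks - 1) * chunks_per_disk


# Usable space per disk, in KB, as implied by the chunk count.
usable_kb_per_disk = CHUNKS_PER_DISK * CHUNK_KB

# Where a 0.90 superblock would sit on a disk of this size (assumption above).
superblock_kb = (DISK_KB // MD_RESERVED_KB) * MD_RESERVED_KB - MD_RESERVED_KB

# Array sizes in 256 KB chunks before and after the grow.
old_chunks = raid5_data_chunks(3, CHUNKS_PER_DISK)
new_chunks = raid5_data_chunks(4, CHUNKS_PER_DISK)

# Size of the 3-disk array in file-system blocks.
fs_blocks = old_chunks * CHUNK_KB // FS_BLOCK_KB

print("usable KB per disk:", usable_kb_per_disk)
print("superblock offset KB:", superblock_kb)
print("array chunks old/new:", old_chunks, new_chunks)
print("fs blocks on old array:", fs_blocks)
```

Under these assumptions the results line up with the "312571136 blocks" per disk, the 2441962 and 3662943 block device sizes, and the 156285568 blocks fsck reports, so the geometry raidreconf worked with does not look wrong in itself.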
[parent not found: <43223709.8090002@eyal.emu.id.au>]
[parent not found: <x3kfysdq846.fsf@Psilocybe.Update.UU.SE>]
* Re: Promise SATAII150 TX4 or raidreconf broken
       [not found]     ` <x3kfysdq846.fsf@Psilocybe.Update.UU.SE>
@ 2005-09-11 15:50       ` Eyal Lebedinsky
  2005-09-12 22:50       ` Promise SATAII150 TX4 or raidreconf broken - answer Eyal Lebedinsky
  1 sibling, 0 replies; 5+ messages in thread
From: Eyal Lebedinsky @ 2005-09-11 15:50 UTC
  To: Thorild Selen; +Cc: linux-ide, linux-raid list

Thorild Selen wrote:
> If you search a bit in the linux-ide and linux-kernel mailing list
> archives, you will find that several people before have had problems
> with SATA150-TX4 and SATAII150-TX4 (see for example posts by Jim
> Ramsay, Joerg Sommrey and me).

Following up on this information I did more testing.

I verified that creating a 2-disk raid-5 and extending it to 3 disks
always works. 3-disk to 4-disk ends up corrupted. BTW I had to change
the check in raidreconf for a minimum of raid5 disks from 3 to 2. It
worked just fine.

I then moved the first two disks to the motherboard (sd[cd] left on
the TX4). The situation remained the same (but I did get better
performance).

I am now less inclined to blame the TX4 and lean more towards
raidreconf.

I need to create a final test where I hit the disks concurrently
without raidreconf to see how they fare... I did some tests and so
far failed to provoke any i/o error.

-- 
Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
	attach .zip as .dat
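A test that hits several disks concurrently without raidreconf, as described in the message above, might look roughly like the sketch below. It is not the script used in the thread. All names are hypothetical, scratch files stand in for the four disks, and the read-back may well be served from the page cache, so a real hardware test would have to bypass the cache and use far more data.

```python
import hashlib
import os
import tempfile
import threading

BLOCK_SIZE = 4096


def pattern(target_index, block):
    """Deterministic block contents derived from the target and block number."""
    seed = hashlib.sha256(f"{target_index}:{block}".encode()).digest()
    return (seed * (BLOCK_SIZE // len(seed)))[:BLOCK_SIZE]


def hammer(path, target_index, blocks, errors):
    """Write every block, force it out, then read it back and compare."""
    with open(path, "r+b") as f:
        for b in range(blocks):
            f.seek(b * BLOCK_SIZE)
            f.write(pattern(target_index, b))
        f.flush()
        os.fsync(f.fileno())
        for b in range(blocks):
            f.seek(b * BLOCK_SIZE)
            if f.read(BLOCK_SIZE) != pattern(target_index, b):
                errors.append((path, b))


def run_concurrent_test(paths, blocks):
    """Run one writer/verifier thread per target at the same time."""
    errors = []
    threads = [
        threading.Thread(target=hammer, args=(path, i, blocks, errors))
        for i, path in enumerate(paths)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def demo(targets=4, blocks=64):
    """Exercise the test on scratch files standing in for the disks."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(targets):
            path = os.path.join(tmp, f"disk{i}.img")
            open(path, "wb").close()
            paths.append(path)
        return run_concurrent_test(paths, blocks)


print(len(demo()), "mismatching blocks")
```

Because each target gets its own pattern, a block that turned up on the wrong disk would be reported as a mismatch as well, not only a block that was damaged in place.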
* Re: Promise SATAII150 TX4 or raidreconf broken - answer
       [not found]     ` <x3kfysdq846.fsf@Psilocybe.Update.UU.SE>
  2005-09-11 15:50       ` Eyal Lebedinsky
@ 2005-09-12 22:50       ` Eyal Lebedinsky
  2005-09-18 8:56         ` Tyler
  1 sibling, 1 reply; 5+ messages in thread
From: Eyal Lebedinsky @ 2005-09-12 22:50 UTC
  To: linux-raid list; +Cc: linux-ide

Executive summary: it is not the TX4. It is not really raidreconf.
You must specify the parity-algorithm in raidtab because the
raidreconf default is not what one expects.

I have now investigated the corrupted 3->4 disk raidreconf and I
can see that there is a pattern to the problem. A similar pattern
is seen with a 4->5 run.

I wrote known values to the raid before the reconf and checked
after. The process is
	create /dev/md0
	write to it
	raidreconf it
	read it and see which blocks show up where

What I see is that the 2nd pair of each 6 blocks is swapped. Here
is the error list for a test with 1 cyl (31 blocks) per disk:

bad block 2 says it is 3
bad block 3 says it is 2
bad block 8 says it is 9
bad block 9 says it is 8
bad block 14 says it is 15
bad block 15 says it is 14
bad block 20 says it is 21
bad block 21 says it is 20
bad block 26 says it is 27
bad block 27 says it is 26
bad block 32 says it is 33
bad block 33 says it is 32
bad block 38 says it is 39
bad block 39 says it is 38
bad block 44 says it is 45
bad block 45 says it is 44
bad block 50 says it is 51
bad block 51 says it is 50
bad block 56 says it is 57
bad block 57 says it is 56
20 errors in 62 blocks

At this point I decided that I must take the TX4 out of the equation.
This is just too regular for a hardware problem. I created four
partitions on one disk and repeated the test. It failed just the same.

I was now reasonably convinced that it is raidreconf that gives me
grief. Nevertheless, the pattern is just too regular. Maybe the program
does not agree with md on the parity algorithm? The default is said
to be left-symmetric (see man mdadm; man raidtab does not say), so I
specified this explicitly in the raidtab and it started working.

Good, but I needed to understand this.

Looking at the raidtools code (where raidreconf is built), I think
that it does not default to left-symmetric. It looks to me like the
config struct is malloced and zeroed (with memset) meaning the .layout
member is set to left-asymmetric (see top of parser.c) and I do not
see that it is ever set to any other default (left-symmetric would
be numeric 2).

-- 
Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
	attach .zip as .dat
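The swap pattern reported in this message is what a left-symmetric versus left-asymmetric disagreement would produce on a 3-disk RAID-5, and a small model makes that visible. The sketch below is not raidreconf or md code: the function names are invented, and the block-to-disk formulas are the usual definitions of the two layouts as I understand them from the md driver (0 for left-asymmetric, 2 for left-symmetric). It stamps every data slot with the logical block number stored there under one layout, then looks the blocks up again under the other.

```python
LEFT_ASYMMETRIC = 0   # what a zero-filled layout field selects
LEFT_SYMMETRIC = 2    # the value the message gives for left-symmetric


def locate(block, disks, layout):
    """Map a logical data block to its (disk, stripe) slot."""
    data_disks = disks - 1
    stripe, index = divmod(block, data_disks)
    # "Left" layouts: parity starts on the last disk and moves down one
    # disk per stripe.
    parity = data_disks - stripe % disks
    if layout == LEFT_ASYMMETRIC:
        # Data fills the disks in order, stepping over the parity disk.
        disk = index + 1 if index >= parity else index
    elif layout == LEFT_SYMMETRIC:
        # Data starts on the disk after parity and wraps around.
        disk = (parity + 1 + index) % disks
    else:
        raise ValueError(f"layout {layout} is not modelled here")
    return disk, stripe


def mismatches(disks, blocks_per_disk, written_as, read_as):
    """List (block, found) pairs where a read returns another block's data."""
    total = (disks - 1) * blocks_per_disk
    # Stamp each data slot with the logical block written there.
    on_disk = {locate(b, disks, written_as): b for b in range(total)}
    report = []
    for b in range(total):
        found = on_disk[locate(b, disks, read_as)]
        if found != b:
            report.append((b, found))
    return report


report = mismatches(3, 31, LEFT_SYMMETRIC, LEFT_ASYMMETRIC)
for block, found in report:
    print(f"bad block {block} says it is {found}")
print(f"{len(report)} errors in {2 * 31} blocks")
```

On three disks the two layouts only differ in the stripe whose parity sits on the middle disk, which is every third stripe, so the second pair of every six blocks changes places. The result is the same whichever layout is taken as the written one, so the model does not say which program held which view. It only covers the 3-disk source array; what raidreconf does on the 4-disk side is not modelled.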
* Re: Promise SATAII150 TX4 or raidreconf broken - answer
  2005-09-12 22:50       ` Promise SATAII150 TX4 or raidreconf broken - answer Eyal Lebedinsky
@ 2005-09-18 8:56         ` Tyler
  2005-09-18 9:56           ` Eyal Lebedinsky
  0 siblings, 1 reply; 5+ messages in thread
From: Tyler @ 2005-09-18 8:56 UTC
  To: Eyal Lebedinsky; +Cc: linux-raid list, linux-ide

Eyal Lebedinsky wrote:

>Executive summary: it is not the TX4. It is not really raidreconf.
>You must specify the parity-algorithm in raidtab because the
>raidreconf default is not what one expects.

[trim]

Nice work Eyal :)

Now all we need is a patch for raid-reconf to fix default behaviour? :D

Regards,
Tyler.
* Re: Promise SATAII150 TX4 or raidreconf broken - answer
  2005-09-18 8:56         ` Tyler
@ 2005-09-18 9:56           ` Eyal Lebedinsky
  0 siblings, 0 replies; 5+ messages in thread
From: Eyal Lebedinsky @ 2005-09-18 9:56 UTC
  To: Tyler; +Cc: linux-raid list

Tyler wrote:
> Eyal Lebedinsky wrote:
>
>> Executive summary: it is not the TX4. It is not really raidreconf.
>> You must specify the parity-algorithm in raidtab because the
>> raidreconf default is not what one expects.
[trim]
> Nice work Eyal :)
>
> Now all we need is a patch for raid-reconf to fix default behaviour? :D
>
> Regards,
> Tyler.

Well, as simple as this looks, I have little understanding of what goes
on here. For example, looking at raidtools-1.00.3, I see this:

This is defined in three source files, as well as in common.h
	# define RAID5_ALGORITHM_LEFT_SYMMETRIC (2)

In prconv.c, readboth.c, rrc_raid5.c and sectors.c the algorithm is
hard set:
	static unsigned long raid5_compute_block (... whatever ...)
	{
		int algorithm = RAID5_ALGORITHM_LEFT_SYMMETRIC;
		...
		/* Select the parity disk based on the user selected algorithm */
		switch (algorithm) {

I cannot see how 'algorithm' can be altered at all here, so I do not
think that these programs respond to the raidtab algorithm at all.

In short, maybe raidreconf should not only default to left-symmetric
but refuse any other algorithm if it needs to cooperate with the other
programs here.

Better someone who knows this area does this. The actual setting of a
default is simple.

-- 
Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
	attach .zip as .dat
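The policy suggested in this last message (default to left-symmetric, refuse anything else) is easy to state in code. The sketch below only illustrates that policy and is not a patch for raidtools: the function name and the dictionary standing in for a parsed raidtab entry are hypothetical, and only the `parity-algorithm` keyword and the two layout names come from the thread.

```python
SUPPORTED_ALGORITHM = "left-symmetric"


def resolve_parity_algorithm(entry):
    """Pick the layout for a parsed raidtab-like entry (hypothetical dict).

    A missing keyword gets an explicit default instead of whatever a
    zero-filled struct member would happen to mean, and any other
    requested layout is rejected rather than silently ignored.
    """
    requested = entry.get("parity-algorithm")
    if requested is None:
        return SUPPORTED_ALGORITHM
    if requested != SUPPORTED_ALGORITHM:
        raise ValueError(
            f"parity-algorithm {requested!r} is not supported; "
            f"only {SUPPORTED_ALGORITHM!r} is handled"
        )
    return requested


print(resolve_parity_algorithm({"raid-level": "5"}))
print(resolve_parity_algorithm({"parity-algorithm": "left-symmetric"}))
```

Failing loudly on an unsupported layout would turn the silent corruption described in this thread into an error before any data is moved. Until something like that exists, the workaround from the thread stands: set `parity-algorithm left-symmetric` explicitly in raidtab.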
Thread overview: 5+ messages
[not found] <430E4AB0.2060600@eyal.emu.id.au>
[not found] ` <4313E433.1080602@pobox.com>
2005-09-10 8:23 ` Promise SATAII150 TX4 or raidreconf broken Eyal Lebedinsky
[not found] ` <43223709.8090002@eyal.emu.id.au>
[not found] ` <x3kfysdq846.fsf@Psilocybe.Update.UU.SE>
2005-09-11 15:50 ` Eyal Lebedinsky
2005-09-12 22:50 ` Promise SATAII150 TX4 or raidreconf broken - answer Eyal Lebedinsky
2005-09-18 8:56 ` Tyler
2005-09-18 9:56 ` Eyal Lebedinsky