From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tomasz Chmielewski
Subject: Re: resync starts over after each reboot (2.6.18.1)?
Date: Mon, 23 Oct 2006 18:43:01 +0200
Message-ID: <453CF115.4020003@wpkg.org>
References: <45349861.4010208@wpkg.org> <17716.41259.239292.413627@cse.unsw.edu.au> <4534A236.4080907@wpkg.org> <17724.19056.601329.815951@cse.unsw.edu.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <17724.19056.601329.815951@cse.unsw.edu.au>
Sender: linux-raid-owner@vger.kernel.org
To: Neil Brown
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Neil Brown wrote:
> On Tuesday October 17, mangoo@wpkg.org wrote:
>> Neil Brown wrote:
>>> On Tuesday October 17, mangoo@wpkg.org wrote:
>>>> I just set up a new Debian unstable box.
>>>> It's running a 2.6.18.1 kernel.
>>>>
>>>> I created a RAID-1 on two disks, with no data or filesystem on it.
>>>> As I'm still experimenting with the box, I reboot it quite frequently.
>>>>
>>>> I noticed that the RAID-1 resync starts from the very beginning after
>>>> each reboot. Is it normal?
>>> No.
>>> Has the resync finished when you shut down?
>> No, it was still running.
>>
>
> Ok, so obviously it should start after a reboot, but maybe not at the
> very beginning.
>
>>> How do you shut down?
>> I simply type "reboot", so it should shut down/reboot cleanly.
>>
>
> "should".
> Do you have kernel logs of the shutdown process? Do they mention md0
> at all?

Looks like it is stopped properly:

Deactivating swap...done.
Shutting down LVM Volume Groups...
  0 logical volume(s) in volume group "LVM2" now active
md: md0 stopped.
md: unbind<sda1>
md: export_rdev(sda1)
md: unbind<sdb1>
md: export_rdev(sdb1)
Stopping MD array md0...done (stopped).
Will now restart.
md: stopping all md devices.
Synchronizing SCSI cache for disk sdb:
Synchronizing SCSI cache for disk sda:

Now that it has synchronized completely for the first time, it reboots
correctly.

When I mark one of the disks as faulty, remove it, and re-add it:

# mdadm /dev/md0 -f /dev/sda1
# mdadm /dev/md0 -r /dev/sda1
# mdadm /dev/md0 -a /dev/sda1

it begins to rebuild:

cat /proc/mdstat
Personalities : [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      312568576 blocks [2/1] [_U]
      [>....................]  recovery =  1.2% (3911680/312568576) finish=103.7min speed=49558K/sec

unused devices: <none>

superthecus:~# reboot

When I reboot, it's not rebuilding anymore:

# cat /proc/mdstat
Personalities : [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1]
      312568576 blocks [2/1] [_U]

unused devices: <none>

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Oct 16 15:23:12 2006
     Raid Level : raid1
     Array Size : 312568576 (298.09 GiB 320.07 GB)
    Device Size : 312568576 (298.09 GiB 320.07 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct 23 18:31:45 2006
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : df1bff84:2c0da484:f2202985:660e4c4d
         Events : 0.6

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

# mdadm /dev/md0 -a /dev/sda1
md: bind<sda1>
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:1, o:1, dev:sda1
 disk 1, wo:0, o:1, dev:sdb1
mdadm: re-added /dev/sda1
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 312568576 blocks.

# cat /proc/mdstat
Personalities : [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      312568576 blocks [2/1] [_U]
      [>....................]  recovery =  0.0% (113728/312568576) finish=91.5min speed=56864K/sec

unused devices: <none>

# dmesg|grep dm
md: kicking non-fresh sda1 from array!

Why was it kicked? I had added it back to the array, but it hadn't
fully synced before the reboot. Curious. Is that normal?
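If I understand the message correctly, "non-fresh" just means that the
event counter in sda1's superblock was behind the one on sdb1 when the
array was assembled at boot, so md dropped it. I suppose the two
counters can be compared with something like:

# mdadm --examine /dev/sda1 | grep Events
# mdadm --examine /dev/sdb1 | grep Events

and I would expect sda1 to show the lower number (please correct me if
I'm reading that wrong).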
>>> Can you post the kernel logs from boot up to when the resync has
>>> started?
>> # dmesg|grep md
>>
> ....
>
> That looks OK except there is no
>
>    md: resuming recovery of md0 from checkpoint
>
> as I would expect....
> Are you creating the array with an internal bitmap?

Frankly, I don't know... It was created by the Debian installer, and
once it's synchronized, it works fine.

>> (it has so many raid levels, as I'm still experimenting with it).
>>
>
> Experimentation is good!! It helps you find my bugs :-)

:)


-- 
Tomasz Chmielewski
http://wpkg.org
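PS: Regarding the internal bitmap - if it helps, I suppose I can check
whether the members carry a write-intent bitmap with something like:

# mdadm --examine-bitmap /dev/sdb1

and, if there is none and mdadm/kernel are recent enough, add one to
the running array with:

# mdadm --grow /dev/md0 --bitmap=internal

(the --examine-bitmap and --grow --bitmap syntax is what I find in the
mdadm man page, so please correct me if I got it wrong).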