From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vyacheslav Dubeyko
Subject: Re: very large mount time after unxepected power down
Date: Tue, 30 Oct 2012 18:52:54 +0400
Message-ID: <1351608774.2026.6.camel@slavad-ubuntu>
References: <1351604965.2069.13.camel@slavad-ubuntu>
Sender: linux-nilfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Сергей Александров
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Tue, 2012-10-30 at 17:30 +0300, Сергей Александров wrote:
> --------------------------------------------------
> Александров Сергей Васильевич
>
>
> 2012/10/30 Vyacheslav Dubeyko :
> > Hi,
> >
> > On Tue, 2012-10-30 at 16:20 +0300, Сергей Александров wrote:
> >> Good time of the day!
> >>
> >> I've got a nilfs2 partition on a 1TB md RAID1 partition composed of two
> >> HDDs. Kernel 3.5.3, userspace utils v2.1.1. Gentoo Linux
> >> distribution.
> >> I have just updated the utils to 2.1.4, but there has been no failure since.
> >>
> >> After a power shutdown, mount takes several hours.
> >>
> >
> > What about RAID1 consistency?
> > Could you describe your RAID configuration in more detail?
>
> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md0 : active raid1 sdb1[0] sdc1[2]
>       976760400 blocks super 1.2 [2/2] [UU]
>
> So, the RAID is consistent. Reading speed from the md device is about 60MB/s
> according to iostat.
>
> >> The first time, I thought that it wouldn't mount at all and tried to
> >> use an fsck tool found somewhere on the internet (I don't really remember
> >> where). It reported that the superblock is ok.
> >
> > So, I am implementing the fsck tool for NILFS2. I guess that you took
> > the sources from the NILFS2 e-mail list.
> >
> >> Then I commented out the check in the source file, and the default number
> >> of blocks to check appeared to be too small. It failed to find the
> >> next superblock. I increased the number, but even multiplying it by 100
> >> didn't help.
> >
> > Sorry, I can't understand which sources you are talking about. Could you
> > describe in more detail what you commented out and where?
> >
> I forced test_latest_log to return a negative result and changed
> MAX_SCAN_SEGMENTS to 100000.
> That was not enough. It finished without finding the SB.
>
>
> The load from fsck was the same as from mount:
> about 60MB/s read from the md device and about 30% load on one core.
>
> >> So, probably the reserved SB is too far away and it takes too
> >> long to find it.
> >>
> >
> > If you are trying to find the second superblock, it is placed at the
> > beginning of the last 4 KB of the volume. Your device size is
> > 1000202649600 bytes.
> >
> >> Does anybody know how it can be sped up? I know a UPS is a solution,
> >> but I consider it to be a bug.
> >>
> >
> > Could you share more details about the situation during mount operations? I
> > mean: (1) NILFS2-related messages in the system log; (2) "ps ax" output;
> > (3) maybe "top" output can be useful also; (4) "mount" output before
> > trying to mount the NILFS2 volume.
> The latest occurrence:
>
> messages log:
> Oct 30 12:18:52 router kernel: [  159.674579] NILFS warning: mounting unchecked fs
> .....
> .....
> Oct 30 13:03:06 router kernel: [ 2810.304245] NILFS: recovery complete.
> Oct 30 13:03:06 router kernel: [ 2810.325240] segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
> Oct 30 13:03:07 router nilfs_cleanerd[15453]: start
> Oct 30 13:03:07 router nilfs_cleanerd[15453]: pause (clean check)
>

Could you share the contents of your /etc/nilfs_cleanerd.conf file?

Could you try to reproduce the issue with log_priority raised to the debug
level (I mean the option in nilfs_cleanerd.conf) and share the messages log
again?

> It took about 45 minutes.
> The previous time it took more than 4 hours.

You mean that your console returns control 45 minutes after you execute
mount. Am I correct?

With the best regards,
Vyacheslav Dubeyko.

> Both times the RAID was consistent.
>
> top showed one process eating about 27% of CPU (2 cores, AMD Athlon II
> X2 250 @3000MHz).
> Also, about 80% of memory is used for cache.
> Sorry, I have not saved the ps output...
>
> I can repeat the situation if it helps.
>
> --------------------------------------------------
> Aleksandrov Sergey Vasil'evich
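
For reference, the location of the second superblock mentioned in the quoted text can be worked out from the device size. Below is a minimal sketch in Python, assuming the second superblock starts at the beginning of the last full 4 KiB block of the device and that /proc/mdstat reports sizes in 1 KiB blocks. The function name is made up for illustration and is not taken from nilfs-utils or the kernel.

```python
# Device size quoted in the message, in bytes.
DEVICE_SIZE_BYTES = 1000202649600
# Size of md0 from the quoted /proc/mdstat output (assumed to be 1 KiB blocks).
MDSTAT_BLOCKS = 976760400
BLOCK_4K = 4096


def last_4k_block_offset(device_size: int) -> int:
    """Return the byte offset of the start of the last full 4 KiB block.

    Hypothetical helper: it only encodes the assumption that the second
    superblock sits at the beginning of the last 4 KB of the volume.
    """
    if device_size < BLOCK_4K:
        raise ValueError("device is smaller than one 4 KiB block")
    return (device_size // BLOCK_4K - 1) * BLOCK_4K


# The mdstat size and the quoted device size describe the same volume.
assert MDSTAT_BLOCKS * 1024 == DEVICE_SIZE_BYTES

offset = last_4k_block_offset(DEVICE_SIZE_BYTES)
print(offset)  # 1000202645504
```

Under these assumptions the offset follows directly from the device size, so locating the second superblock should not by itself require scanning segments from the start of the volume.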