From: devzero@web.de
To: Yan Zheng <zheng.yan@oracle.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs filesystem freeze
Date: Tue, 23 Dec 2008 01:26:16 +0100
Message-ID: <552676848@web.de>

thanks,

i tried it and ran my tests for some hours now - looks really good. no crashes, no freezes.

anyway, some "minor" glitches remain.

i looked at some "ls -la /btrfs" output via "watch ls -la..", and by chance i saw this one for a moment.

dr-xr-xr-x  1 root root     126 Dec 22 22:19 snap96
dr-xr-xr-x  1 root root     126 Dec 22 22:19 snap97
dr-xr-xr-x  1 root root     126 Dec 22 22:19 snap98
dr-xr-xr-x  1 root root     110 Dec 22 22:19 snap99
-rw-r--r--  1 root root 1048576 Dec 22 22:49 test.dat
-?????????  ? ?    ?          ?            ? test.tmp
-rw-r--r--  1 root root 7020046 Dec 22 22:48 testfsx
-rw-r--r--  1 root root       0 Dec 22 22:23 testfsx.fsxgood
-rw-r--r--  1 root root       0 Dec 22 22:23 testfsx.fsxlog

that file test.tmp looks weird.
it's constantly created by copying test.dat back and forth (i.e. cp test.dat test.tmp; md5sum test.tmp; rm test.dat; mv test.tmp test.dat in a loop, to check file consistency)

should i worry here ?
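
For reference, the loop described above can be sketched roughly as the following shell function. The name check_loop, its two arguments, and the stop-on-first-mismatch behaviour are assumptions made for illustration; the original script is not shown in this mail, and judging by the log further down it kept running after an error.

```shell
#!/bin/sh
# Sketch of the copy / checksum / rename loop (hypothetical helper,
# not the original test script).
# Usage: check_loop DIR COUNT
check_loop() {
    dir=$1
    count=$2
    # Checksum of the starting file; every later copy should match it.
    expected=$(md5sum < "$dir/test.dat" | cut -d' ' -f1)
    i=0
    while [ "$i" -lt "$count" ]; do
        cp "$dir/test.dat" "$dir/test.tmp" || return 1
        actual=$(md5sum < "$dir/test.tmp" | cut -d' ' -f1)
        # Same layout as the log lines in this mail: date, checksum, name.
        echo "$(date) $actual  test.tmp"
        if [ "$actual" != "$expected" ]; then
            echo "checksum mismatch in iteration $i" >&2
            return 1
        fi
        rm "$dir/test.dat" || return 1
        mv "$dir/test.tmp" "$dir/test.dat" || return 1
        # The mail mentions a sync between the mv and the next cp.
        sync
        i=$((i + 1))
    done
}
```

It could be started with something like `check_loop /btrfs 1000`, assuming test.dat already exists in that directory.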


furthermore, after several hours, i got this one (also once):

Tue Dec 23 00:40:33 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:42:59 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:43:17 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:44:28 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:49:43 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:50:37 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:50:57 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:56:44 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 00:59:33 CET 2008 59987747e2568fa81bb38603706eff07  test.tmp
Tue Dec 23 01:00:03 CET 2008 md5sum: test.tmp: Input/output error
cp: reading `test.dat': Input/output error
Tue Dec 23 01:01:44 CET 2008 ca6bbc6a8aa5ec080a2a10d727ecc563  test.tmp
Tue Dec 23 01:08:05 CET 2008 ca6bbc6a8aa5ec080a2a10d727ecc563  test.tmp

how can this happen?
"cp test.dat test.tmp" is _always_ something which happens after "mv test.tmp test.dat", and there's a "sync" in between.

looks like a race condition!?
the mv command returned, but btrfs did not complete the file move and the next command does not yet see the moved file!?


i don't have timing information for dmesg and cannot tell if there is any relation to those glitches, but here are some messages from dmesg:

device fsid 7c4ee06dc0149bc8-44e376a69f9aa08b devid 1 transid 9 /dev/sdb1
btrfs: use compression
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs csum failed ino 39502 off 524288 csum 1261748817 private 1813441608
btrfs: unlinked 1 orphans
btrfs: unlinked 1 orphans

so apparently some checksum was wrong as well!?
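
The repeated csum messages can be condensed by counting failures per inode and offset. This is only a minimal sketch that assumes the message layout shown in the dmesg excerpt above; the helper name summarize_csum_failures is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints
# "<inode> <offset> <number of failures>" for each distinct pair.
summarize_csum_failures() {
    awk '/btrfs csum failed/ {
        ino = ""; off = ""
        # Pick the values that follow the "ino" and "off" keywords.
        for (i = 1; i < NF; i++) {
            if ($i == "ino") ino = $(i + 1)
            if ($i == "off") off = $(i + 1)
        }
        count[ino " " off]++
    }
    END {
        for (key in count) print key, count[key]
    }' | sort
}
```

Run as `dmesg | summarize_csum_failures`. Something like `find /btrfs -inum 39502` may then help map the inode back to a file name, with the caveat that inode numbers are only unique within one subvolume, so with many snapshots it can match more than one file.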


ah, and some more (please forgive - this is all for making btrfs better!)

i ran a posix regression test suite (http://www.ntfs-3g.org/pjd-fstest.html), which also reported some problems.

i'm not sure if these are false positives, because the test suite does not officially support btrfs.

i'm posting it here for review.

linux-uqw0:/btrfs/pjd-fstest-20080917-RC # prove -r .
tests/chflags/00.....ok
tests/chflags/01.....ok
tests/chflags/02.....ok
tests/chflags/03.....ok
tests/chflags/04.....ok
tests/chflags/05.....ok
tests/chflags/06.....ok
tests/chflags/07.....ok
tests/chflags/08.....ok
tests/chflags/09.....ok
tests/chflags/10.....ok
tests/chflags/11.....ok
tests/chflags/12.....ok
tests/chflags/13.....ok
tests/chmod/00.......ok
tests/chmod/01.......ok
tests/chmod/02.......ok
tests/chmod/03.......ok
tests/chmod/04.......ok
tests/chmod/05.......ok
tests/chmod/06.......ok
tests/chmod/07.......ok
tests/chmod/08.......ok
tests/chmod/09.......ok
tests/chmod/10.......ok
tests/chmod/11.......ok
tests/chown/00.......ok
tests/chown/01.......ok
tests/chown/02.......ok
tests/chown/03.......ok
tests/chown/04.......ok
tests/chown/05.......ok
tests/chown/06.......ok
tests/chown/07.......ok
tests/chown/08.......ok
tests/chown/09.......ok
tests/chown/10.......ok
tests/link/00........FAILED tests 56, 63
        Failed 2/82 tests, 97.56% okay
tests/link/01........ok
tests/link/02........ok
tests/link/03........ok
tests/link/04........ok
tests/link/05........ok
tests/link/06........ok
tests/link/07........ok
tests/link/08........ok
tests/link/09........ok
tests/link/10........ok
tests/link/11........ok
tests/link/12........ok
tests/link/13........ok
tests/link/14........ok
tests/link/15........ok
tests/link/16........ok
tests/link/17........ok
tests/mkdir/00.......ok
tests/mkdir/01.......ok
tests/mkdir/02.......ok
tests/mkdir/03.......ok
tests/mkdir/04.......ok
tests/mkdir/05.......ok
tests/mkdir/06.......ok
tests/mkdir/07.......ok
tests/mkdir/08.......ok
tests/mkdir/09.......ok
tests/mkdir/10.......ok
tests/mkdir/11.......ok
tests/mkdir/12.......ok
tests/mkfifo/00......ok
tests/mkfifo/01......ok
tests/mkfifo/02......ok
tests/mkfifo/03......ok
tests/mkfifo/04......ok
tests/mkfifo/05......ok
tests/mkfifo/06......ok
tests/mkfifo/07......ok
tests/mkfifo/08......ok
tests/mkfifo/09......ok
tests/mkfifo/10......ok
tests/mkfifo/11......ok
tests/mkfifo/12......ok
tests/open/00........ok
tests/open/01........ok
tests/open/02........ok
tests/open/03........ok
tests/open/04........ok
tests/open/05........ok
tests/open/06........ok
tests/open/07........ok
tests/open/08........ok
tests/open/09........ok
tests/open/10........ok
tests/open/11........ok
tests/open/12........ok
tests/open/13........ok
tests/open/14........ok
tests/open/15........ok
tests/open/16........ok
tests/open/17........ok
tests/open/18........ok
tests/open/19........ok
tests/open/20........ok
tests/open/21........ok
tests/open/22........ok
tests/open/23........ok
tests/rename/00......ok
tests/rename/01......ok
tests/rename/02......ok
tests/rename/03......ok
tests/rename/04......ok
tests/rename/05......ok
tests/rename/06......ok
tests/rename/07......ok
tests/rename/08......ok
tests/rename/09......ok
tests/rename/10......ok
tests/rename/11......ok
tests/rename/12......ok
tests/rename/13......ok
tests/rename/14......ok
tests/rename/15......ok
tests/rename/16......ok
tests/rename/17......ok
tests/rename/18......ok
tests/rename/19......ok
tests/rename/20......ok
tests/rmdir/00.......ok
tests/rmdir/01.......ok
tests/rmdir/02.......ok
tests/rmdir/03.......ok
tests/rmdir/04.......ok
tests/rmdir/05.......ok
tests/rmdir/06.......ok
tests/rmdir/07.......ok
tests/rmdir/08.......ok
tests/rmdir/09.......ok
tests/rmdir/10.......ok
tests/rmdir/11.......ok
tests/rmdir/12.......ok
tests/rmdir/13.......ok
tests/rmdir/14.......ok
tests/rmdir/15.......ok
tests/symlink/00.....ok
tests/symlink/01.....ok
tests/symlink/02.....ok
tests/symlink/03.....ok
tests/symlink/04.....ok
tests/symlink/05.....ok
tests/symlink/06.....ok
tests/symlink/07.....ok
tests/symlink/08.....ok
tests/symlink/09.....ok
tests/symlink/10.....ok
tests/symlink/11.....ok
tests/symlink/12.....ok
tests/truncate/00....FAILED test 15
        Failed 1/21 tests, 95.24% okay
tests/truncate/01....ok
tests/truncate/02....ok
tests/truncate/03....ok
tests/truncate/04....ok
tests/truncate/05....ok
tests/truncate/06....ok
tests/truncate/07....ok
tests/truncate/08....ok
tests/truncate/09....ok
tests/truncate/10....ok
tests/truncate/11....ok
tests/truncate/12....ok
tests/truncate/13....ok
tests/truncate/14....ok
tests/unlink/00......ok
tests/unlink/01......ok
tests/unlink/02......ok
tests/unlink/03......ok
tests/unlink/04......ok
tests/unlink/05......ok
tests/unlink/06......ok
tests/unlink/07......ok
tests/unlink/08......ok
tests/unlink/09......ok
tests/unlink/10......ok
tests/unlink/11......ok
tests/unlink/12......ok
tests/unlink/13......ok
tests/xacl/00........FAILED test 2
        Failed 1/42 tests, 97.62% okay
tests/xacl/01........FAILED tests 2, 22
        Failed 2/32 tests, 93.75% okay
tests/xacl/02........FAILED tests 2, 41
        Failed 2/80 tests, 97.50% okay
tests/xacl/03........FAILED tests 2, 31, 35, 38, 43, 47
        Failed 6/57 tests, 89.47% okay
tests/xacl/04........FAILED tests 2, 52
        Failed 2/53 tests, 96.23% okay
tests/xacl/05........ok
tests/xacl/06........FAILED tests 16, 26-27, 30, 33-34, 36-38, 40-42
        Failed 12/42 tests, 71.43% okay
Failed Test         Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
tests/link/00.t                   82    2  56 63
tests/truncate/00.t               21    1  15
tests/xacl/00.t                   42    1  2
tests/xacl/01.t                   32    2  2 22
tests/xacl/02.t                   80    2  2 41
tests/xacl/03.t                   57    6  2 31 35 38 43 47
tests/xacl/04.t                   53    2  2 52
tests/xacl/06.t                   42   12  16 26-27 30 33-34 36-38 40-42
Failed 8/191 test scripts. 28/2284 subtests failed.
Files=191, Tests=2284, 2762 wallclock secs (18.56 cusr + 179.63 csys = 198.19 CPU)
Failed 8/191 test programs. 28/2284 subtests failed.



besides all that, i'm really impressed by btrfs and i think it's in really good condition and quite stable.
i think i dare to use it on a server for some pre-production testing...

Keep up the good work!

regards
roland





> devzero@web.de wrote:
> > thank you.
> >
> > i tried your patch and did another test run.
> >
> > at first it looked better, as i could take many more snapshots than before, but then it froze again.
> >
> > i don't really have a clue whether your patch improved anything, as my test setup isn't exactly reproducible for now and i did not check for exact "testing lab conditions".
> >
> > after /btrfs froze again, i tried to unmount by forcibly unloading the btrfs module.
> >
> > after reloading the module and trying to mount again, it failed with the following kernel message:
> >
> I hope the new patch can solve the problem.
>
> Yan Zheng
>
> ---
> diff -urp 1/fs/btrfs/inode.c 2/fs/btrfs/inode.c
> --- 1/fs/btrfs/inode.c	2008-12-18 08:09:16.062111805 +0800
> +++ 2/fs/btrfs/inode.c	2008-12-22 08:47:06.000000000 +0800
> @@ -2891,7 +2891,7 @@ void btrfs_delete_inode(struct inode *in
>  	btrfs_wait_ordered_range(inode, 0, (u64)-1);
>  
>  	btrfs_i_size_write(inode, 0);
> -	trans = btrfs_start_transaction(root, 1);
> +	trans = btrfs_join_transaction(root, 1);
>  
>  	btrfs_set_trans_block_group(trans, inode);
>  	ret = btrfs_truncate_inode_items(trans, root, inode, inode->i_size, 0);
> diff -urp 1/fs/btrfs/transaction.c 2/fs/btrfs/transaction.c
> --- 1/fs/btrfs/transaction.c	2008-12-13 12:35:29.487886730 +0800
> +++ 2/fs/btrfs/transaction.c	2008-12-21 19:09:09.000000000 +0800
> @@ -804,7 +804,7 @@ static noinline int finish_pending_snaps
>  
>  	parent_inode = pending->dentry->d_parent->d_inode;
>  	parent_root = BTRFS_I(parent_inode)->root;
> -	trans = btrfs_start_transaction(parent_root, 1);
> +	trans = btrfs_join_transaction(parent_root, 1);
>  
>  	/*
>  	 * insert the directory item
>


