* NilFS cleanerd bugreport
@ 2009-01-28 20:52 Reinoud Zandijk
From: Reinoud Zandijk @ 2009-01-28 20:52 UTC (permalink / raw)
To: nilfs.org
Dear folks, dear Ryusuke,
I've found a bug in the cleanerd/nilfs interaction that might give rise to the
various problems we've seen recently with the cleanerd. It comes down to the
wrong counting of the number of dirty segments and the wrong counting of the
number of checkpoints.
I created this disc using NiLFS version 2.05 with the 2.06 userland (AFAIK)
with mkfs.nilfs, and created a sparse file on it with the sparse file
generator I wrote for UDF testing. It unmounted fine, giving the nilfs_dump
`vnd0e-dump-3'. When I remounted it, the cleanerd started after a while,
and after unmounting it gives `vnd0e-dump-3-cleanerd'. A diff shows:
(superblock)
--- /root/luiaard/root/vnd0e-dump-3 2009-01-25 17:10:22.000000000 +0100
+++ /root/luiaard/root/vnd0e-dump-3-cleanerd 2009-01-28 17:24:07.000000000 +0100
@@ -7,7 +7,7 @@
Flags 0x0000
CRC seed 0xd4dd3d5a
- Checksum (CRC) 0x05ec6c58 (OK)
+ Checksum (CRC) 0xddd0a2f7 (OK)
Blocksize 4096
Number of segments 499
@@ -17,15 +17,15 @@
Blocks per segment 2048
Reserved segments percent 5
- Last checkpoint number 8
- Last pseg blocknr writen 12288
+ Last checkpoint number 11
+ Last pseg blocknr writen 13726
Seq. number last segment 6
- Free blocks count 1005568
+ Free blocks count 1015808
FS Creation time Sun Jan 25 17:05:10 2009
- FS last mount time Sun Jan 25 17:05:14 2009
- FS last write time Sun Jan 25 17:06:02 2009
+ FS last mount time Wed Jan 28 17:21:25 2009
+ FS last write time Wed Jan 28 17:21:44 2009
- Mount count 1
+ Mount count 2
Max mount count 50
FS state 0x1<VALID_FS>
Error behaviour flags 0x0001
And the su and cp files give:
@@ -30743,34 +31480,34 @@
Reading file `SU.out` for 1 blocks (4 Kb)
SU file dump
- nclean 491
- ndirty 8
+ nclean 496
+ ndirty 21474836483
last alloced 7
Segment 0
- Last modified Sun Jan 25 17:05:28 2009
- Containing nblks 2047
- Flags 0x2<DIRTY>
+ Last modified Thu Jan 1 01:00:00 1970
+ Containing nblks 0
+ Flags 0x0
......
@@ -30789,136 +31526,72 @@
Reading file `CP.out` for 1 blocks (4 Kb)
CP file dump
- Number of checkpoints 8
+ Number of checkpoints 8589934596
Number of snapshots 0
Checkpoint number 1
- Flags 0x0
+ Flags 0x2<INVALID>
Checkpoints in block 0
Created at Sun Jan 25 17:05:10 2009
Blocks incremented 11
Inodes count 3
Blocks count (red.) 9
---------------------
Any idea as to if and why this can happen? Has it been fixed in the meantime?
Or could this be a clue to the weird behaviour seen by others, including the
corruption?
With regards,
Reinoud
* Re: NilFS cleanerd bugreport
@ 2009-01-30 14:18 ` Ryusuke Konishi
From: Ryusuke Konishi @ 2009-01-30 14:18 UTC (permalink / raw)
To: reinoud-S783fYmB3Ccdnm+yROfE0A; +Cc: users-JrjvKiOkagjYtjvyW6yDsg
Hi Reinoud,
On Wed, 28 Jan 2009 21:52:23 +0100, Reinoud Zandijk wrote:
> Dear folks, dear Ryusuke,
>
> I've found a bug in the cleanerd/nilfs interaction that might give rise to the
> various problems we've seen recently with the cleanerd. It comes down to the
> wrong counting of the number of dirty segments and the wrong counting of the
> number of checkpoints.
>
> I created this disc using the NiLFS version 2.05 with 2.06 userland (AFAIK)
> with mkfs.nilfs and created a sparse file on it with my sparse file generator
> I created for UDF testing. It dismounted fine giving a nilfs_dump
> `vnd0e-dump-3'. When I remounted it, the cleanerd started after a while,
> and after unmounting it gives `vnd0e-dump-3-cleanerd'. A diff shows:
<snip>
> And the su and cp files give:
>
> @@ -30743,34 +31480,34 @@
> Reading file `SU.out` for 1 blocks (4 Kb)
>
> SU file dump
> - nclean 491
> - ndirty 8
> + nclean 496
> + ndirty 21474836483
> last alloced 7
>
> Segment 0
> - Last modified Sun Jan 25 17:05:28 2009
> - Containing nblks 2047
> - Flags 0x2<DIRTY>
> + Last modified Thu Jan 1 01:00:00 1970
> + Containing nblks 0
> + Flags 0x0
>
> ......
>
> @@ -30789,136 +31526,72 @@
>
> Reading file `CP.out` for 1 blocks (4 Kb)
> CP file dump
> - Number of checkpoints 8
> + Number of checkpoints 8589934596
> Number of snapshots 0
>
> Checkpoint number 1
> - Flags 0x0
> + Flags 0x2<INVALID>
> Checkpoints in block 0
> Created at Sun Jan 25 17:05:10 2009
> Blocks incremented 11
> Inodes count 3
> Blocks count (red.) 9
>
> Any idea as to if and why this can happen?
It looks like an underflow or a collision of updates.
> Has it been fixed in the meantime?
Not yet, I think.
> Or could this be a clue to the weird behaviour seen by others, including the
> corruption?
I don't know. As far as I remember, the cleanerd does not depend on these
values, but the corruption may be induced indirectly.
Anyway, thanks for reporting this issue.
I'll review the cpfile and sufile again.
Regards,
Ryusuke Konishi
* Re: NilFS cleanerd bugreport
@ 2009-02-02 7:24 ` Ryusuke Konishi
From: Ryusuke Konishi @ 2009-02-02 7:24 UTC (permalink / raw)
To: reinoud-S783fYmB3Ccdnm+yROfE0A; +Cc: users-JrjvKiOkagjYtjvyW6yDsg
Hi,
On Fri, 30 Jan 2009 23:18:53 +0900 (JST), Ryusuke Konishi wrote:
> Hi Reinoud,
> On Wed, 28 Jan 2009 21:52:23 +0100, Reinoud Zandijk wrote:
> > Dear folks, dear Ryusuke,
> >
> > I've found a bug in the cleanerd/nilfs interaction that might give rise to the
> > various problems we've seen recently with the cleanerd. It comes down to the
> > wrong counting of the number of dirty segments and the wrong counting of the
> > number of checkpoints.
> >
> > I created this disc using the NiLFS version 2.05 with 2.06 userland (AFAIK)
> > with mkfs.nilfs and created a sparse file on it with my sparse file generator
> > I created for UDF testing. It dismounted fine giving a nilfs_dump
> > `vnd0e-dump-3'. When I remounted it, the cleanerd started after a while,
> > and after unmounting it gives `vnd0e-dump-3-cleanerd'. A diff shows:
> <snip>
> > And the su and cp files give:
> >
> > @@ -30743,34 +31480,34 @@
> > Reading file `SU.out` for 1 blocks (4 Kb)
> >
> > SU file dump
> > - nclean 491
> > - ndirty 8
> > + nclean 496
> > + ndirty 21474836483
> > last alloced 7
> >
> > Segment 0
> > - Last modified Sun Jan 25 17:05:28 2009
> > - Containing nblks 2047
> > - Flags 0x2<DIRTY>
> > + Last modified Thu Jan 1 01:00:00 1970
> > + Containing nblks 0
> > + Flags 0x0
> >
> > ......
> >
> > @@ -30789,136 +31526,72 @@
> >
> > Reading file `CP.out` for 1 blocks (4 Kb)
> > CP file dump
> > - Number of checkpoints 8
> > + Number of checkpoints 8589934596
> > Number of snapshots 0
> >
> > Checkpoint number 1
> > - Flags 0x0
> > + Flags 0x2<INVALID>
> > Checkpoints in block 0
> > Created at Sun Jan 25 17:05:10 2009
> > Blocks incremented 11
> > Inodes count 3
> > Blocks count (red.) 9
> >
> > Any idea as to if and why this can happen?
>
> It looks like an underflow or a collision of updates.
This turned out to be a bug in the counter operations on the cpfile and
sufile.
I have attached a test patch that fixes the problem.
After some tests and submission to the -mm tree, I'll push it to the git
repo.
Reinoud, thank you for finding this problem.
Regards,
Ryusuke Konishi
---
fs/cpfile.c | 2 +-
fs/sufile.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cpfile.c b/fs/cpfile.c
index 1e9ce4c..45bfe82 100644
--- a/fs/cpfile.c
+++ b/fs/cpfile.c
@@ -357,7 +357,7 @@ int nilfs_cpfile_delete_checkpoints(struct inode *cpfile,
kaddr = kmap_atomic(header_bh->b_page, KM_USER0);
header = nilfs_cpfile_block_get_header(cpfile, header_bh,
kaddr);
- le64_add_cpu(&header->ch_ncheckpoints, -tnicps);
+ le64_add_cpu(&header->ch_ncheckpoints, -(u64)tnicps);
nilfs_mdt_mark_buffer_dirty(header_bh);
nilfs_mdt_mark_dirty(cpfile);
kunmap_atomic(kaddr, KM_USER0);
diff --git a/fs/sufile.c b/fs/sufile.c
index 7b73a5f..9f0a988 100644
--- a/fs/sufile.c
+++ b/fs/sufile.c
@@ -331,7 +331,7 @@ int nilfs_sufile_freev(struct inode *sufile, __u64 *segnum, size_t nsegs)
kaddr = kmap_atomic(header_bh->b_page, KM_USER0);
header = nilfs_sufile_block_get_header(sufile, header_bh, kaddr);
le64_add_cpu(&header->sh_ncleansegs, nsegs);
- le64_add_cpu(&header->sh_ndirtysegs, -nsegs);
+ le64_add_cpu(&header->sh_ndirtysegs, -(u64)nsegs);
kunmap_atomic(kaddr, KM_USER0);
nilfs_mdt_mark_buffer_dirty(header_bh);
nilfs_mdt_mark_dirty(sufile);
--
1.5.6.5