* Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
@ 2009-03-17 10:45 Roman Mamedov
[not found] ` <20090317154529.1aeeecdc-2Ve/5xEMxL0@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Roman Mamedov @ 2009-03-17 10:45 UTC (permalink / raw)
To: users-JrjvKiOkagjYtjvyW6yDsg
Hello.
I am seeing strange behaviour of nilfs. Namely, a new checkpoint with
significant NBLKINC value is created on every cleanerd step, even though
the file system is not accessed or written to by any apps at the time.
Also, lssu shows that new segments are being continuously appended to
the tail, whereas segments at the top are disappearing.
I am not completely sure, but this might have begun after doing a
bind mount, i.e. `mount --bind /mnt/nilfs/somedir /some/other/dir`.
Timestamp of the first strange checkpoint roughly corresponds to the
time when I did that mount. And when I was trying to unmount it, the
mount point was "busy", and `fuser` reported that the process using it
is nilfs_cleanerd.
Hmm, now I see that `/mnt/nilfs/somedir` also has a zero-sized ".nilfs"
file, just like the root directory of that filesystem. Perhaps this is
the cause of the problem?
My kernel is 2.6.28. Initially, nilfs module and tools were versions
2.0.6, but an upgrade to 2.0.11 did not help.
If it matters, nilfs is created over a LUKS encrypted partition
(/dev/mapper/device).
--
With respect,
Roman
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090317154529.1aeeecdc-2Ve/5xEMxL0@public.gmane.org>
@ 2009-03-17 13:43 ` Ryusuke Konishi
[not found] ` <20090317.224324.58871500.ryusuke-sG5X7nlA6pw@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Ryusuke Konishi @ 2009-03-17 13:43 UTC (permalink / raw)
To: users-JrjvKiOkagjYtjvyW6yDsg, roman-ohbefDXYbNo
Hi,
On Tue, 17 Mar 2009 15:45:29 +0500, Roman Mamedov wrote:
> Hello.
>
> I am seeing strange behaviour of nilfs. Namely, a new checkpoint with
> significant NBLKINC value is created on every cleanerd step, even though
> the file system is not accessed or written to by any apps at the time.
> Also, lssu shows that new segments are being continuously appended to
> the tail, whereas segments at the top are disappearing.
The garbage collection of nilfs is a ``copying GC'', and it creates
checkpoints (this is normal for nilfs). In addition, if the segments
selected for cleaning have few reclaimable blocks, most of their
remaining blocks are copied into new segments.
If you are using nilfs-2.0.10 or later, you can see the ``i'' flag in
the FLG field of lscp. The ``i'' flag means the checkpoint was created
only for such internal operations, including GC.
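As a rough illustration of the point above, checkpoints created for internal operations can be picked out of lscp output by looking at the FLG column. This is only a sketch: it assumes the column order CNO, DATE, TIME, MODE, FLG, NBLKINC, ICNT, and the function name `internal_checkpoints` and the sample rows are hypothetical, not captured from a real system.

```shell
# Print CNO and NBLKINC for checkpoints whose FLG field contains "i".
# Assumed columns: CNO DATE TIME MODE FLG NBLKINC ICNT (header on line 1).
internal_checkpoints() {
    awk 'NR > 1 && $5 ~ /i/ { print $1, $6 }'
}

# Hypothetical sample rows; on a real system one would pipe lscp output
# into internal_checkpoints instead.
sample_lscp() {
    cat <<'EOF'
CNO DATE TIME MODE FLG NBLKINC ICNT
101 2009-03-17 19:00:01 cp - 120 35
102 2009-03-17 19:00:22 cp i 4096 35
103 2009-03-17 19:00:43 cp i 3980 35
EOF
}

sample_lscp | internal_checkpoints
```

Summing the second output column would give a rough idea of how many blocks the cleaner wrote over a period, under the same column-order assumption.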
> I am not completely sure, but this might have begun after doing a
> bind mount, i.e. `mount --bind /mnt/nilfs/somedir /some/other/dir`.
> Timestamp of the first strange checkpoint roughly corresponds to the
> time when I did that mount. And when I was trying to unmount it, the
> mount point was "busy", and `fuser` reported that the process using it
> is nilfs_cleanerd.
Did you specify an fstype option? The behavior seems different
between:
$ mount --bind /nilfs2 /test
and
$ mount --bind -t nilfs2 /nilfs2 /test
In the former case, the result was as follows:
/dev/sdb1 on /nilfs2 type nilfs2 (rw,gcpid=12667)
/nilfs2 on /test type none (rw,bind)
In the latter case, it became
/dev/sdb1 on /nilfs2 type nilfs2 (rw,gcpid=12667)
/nilfs2 on /test type nilfs2 (rw,bind,gcpid=12673)
and the second cleanerd exited right away:
nilfs_cleanerd[12673]: start
nilfs_cleanerd[12673]: cannot create cleanerd on /nilfs2
nilfs_cleanerd[12673]: shutdown
nilfs_cleanerd[12667]: wake up
nilfs_cleanerd[12667]: wait 21.000000000
Both cases are safe (the latter only by luck).
I couldn't reproduce your case, in which cleanerd grabs a different
mount point and prevents unmount.
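To check which case applies on a given machine, the bind mounts can be filtered out of `mount`-style output. The sketch below assumes lines of the form `SOURCE on TARGET type FSTYPE (OPTIONS)` as shown in this message, with no spaces in paths; whether `bind` appears in the option list depends on the mount tools in use. The function name `bind_mounts` is hypothetical.

```shell
# List mount points whose option list contains the word "bind".
# Assumed line format: SOURCE on TARGET type FSTYPE (OPTIONS)
bind_mounts() {
    awk '{
        opts = $6
        gsub(/[()]/, "", opts)      # strip the surrounding parentheses
        n = split(opts, o, ",")
        for (i = 1; i <= n; i++)
            if (o[i] == "bind") { print $3; break }
    }'
}

# The two lines reported for the first case in this message.
printf '%s\n' \
    '/dev/sdb1 on /nilfs2 type nilfs2 (rw,gcpid=12667)' \
    '/nilfs2 on /test type none (rw,bind)' | bind_mounts
```

On a live system the input would come from `mount` itself rather than from `printf`.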
> Hmm, now I see that `/mnt/nilfs/somedir` also has a zero-sized ".nilfs"
> file, just like the root directory of that filesystem. Perhaps this is
> the cause of the problem?
``.nilfs'' is just a regular file used to allow locking between
cleanerd and other userland tools. So it will appear at both mount
points.
Maybe cleanerd mistook the mount point, but the background checkpoint
creation itself is normal behavior.
Regards,
Ryusuke Konishi
> My kernel is 2.6.28. Initially, nilfs module and tools were versions
> 2.0.6, but an upgrade to 2.0.11 did not help.
>
> If it matters, nilfs is created over a LUKS encrypted partition
> (/dev/mapper/device).
>
> --
> With respect,
> Roman
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090317.224324.58871500.ryusuke-sG5X7nlA6pw@public.gmane.org>
@ 2009-03-17 14:19 ` Roman Mamedov
[not found] ` <20090317191924.1924576b-2Ve/5xEMxL0@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Roman Mamedov @ 2009-03-17 14:19 UTC (permalink / raw)
To: Ryusuke Konishi, users-JrjvKiOkagjYtjvyW6yDsg
On Tue, 17 Mar 2009 22:43:24 +0900 (JST)
Ryusuke Konishi <ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
> The garbage collection of nilfs is ``copying GC'', and it creates
> checkpoints (it's normal for nilfs). In addition, if segments
> selected for cleaning have few reclaimable blocks, most remaining
> blocks are copied into new segments.
I thought the NBLKINC count should not increase much when there are
no user writes to the filesystem. But it was increasing by several
thousands on every step.
This filesystem is about 300 gigabytes in size, and all I did was
copy about 150 gigabytes of data to it, with no deletions,
modifications or rewrites. But even 6-8 hours after that, the
cleanerd process still has so much work to do that it interferes with
normal usage of the filesystem (e.g. a FLAC audio file played from it
was skipping at times; even allowing for the slightly increased load
from going through kcryptd, this shouldn't happen).
> If you are using nilfs-2.0.10 or later, you can see ``i'' flag in FLG
> field of lscp. The ``i'' flag means the checkpoint is created only
> for such internal operation including GC.
Yes, after the upgrade to 2.0.11 I see "i" marks on these checkpoints.
> Did you specify an fstype option?
I had the fstype specified as "none" (in fstab).
> /nilfs2 on /test type none (rw,bind)
Yes, this is what I have.
> and, the second cleanerd has gone:
I have only one cleanerd in the process list.
What is really strange, is that it looks like I can unmount the main
NILFS, but the bind mount will continue to be accessible and writable.
And when I remount the main mount point back, cleanerd grabs the bind
mount. See below:
# mount -t nilfs2 /dev/mapper/hi320data /mnt/hi320data
mount.nilfs2: WARNING! - The NILFS on-disk format may change at any
time.
mount.nilfs2: WARNING! - Do not place critical data on a NILFS
filesystem.
# mount --bind /mnt/hi320data/test/ /mnt/bind-test/
# ls /mnt/bind-test/
# touch /mnt/bind-test/abc
# umount /mnt/hi320data
# umount /mnt/hi320data
umount: /mnt/hi320data: not mounted
# touch /mnt/bind-test/def
# ls /mnt/bind-test/
abc def
# mount -t nilfs2 /dev/mapper/hi320data /mnt/hi320data
mount.nilfs2: WARNING! - The NILFS on-disk format may change at any
time.
mount.nilfs2: WARNING! - Do not place critical data on a NILFS
filesystem.
# ls /mnt/hi320data/test/
abc def
# umount /mnt/bind-test
umount: /mnt/bind-test: device is busy
umount: /mnt/bind-test: device is busy
# fuser -c /mnt/bind-test/
/mnt/bind-test/: 23203
# ps -Af | grep 23203
root 23203 1 1 19:06 ? 00:00:00 /sbin/nilfs_cleanerd
-n /dev/mapper/hi320data /mnt/hi320data
-----------------------
Maybe it misdetects the bind mount as another mount of the same
filesystem (e.g. like when mounting a snapshot)?
--
With respect,
Roman
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090317191924.1924576b-2Ve/5xEMxL0@public.gmane.org>
@ 2009-03-22 17:53 ` Ryusuke Konishi
2009-03-31 14:44 ` Ryusuke Konishi
1 sibling, 0 replies; 8+ messages in thread
From: Ryusuke Konishi @ 2009-03-22 17:53 UTC (permalink / raw)
To: roman-ohbefDXYbNo; +Cc: users-JrjvKiOkagjYtjvyW6yDsg
On Tue, 17 Mar 2009 19:19:24 +0500, Roman Mamedov <roman-ohbefDXYbNo@public.gmane.org> wrote:
> On Tue, 17 Mar 2009 22:43:24 +0900 (JST)
> Ryusuke Konishi <ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
>
> > The garbage collection of nilfs is ``copying GC'', and it creates
> > checkpoints (it's normal for nilfs). In addition, if segments
> > selected for cleaning have few reclaimable blocks, most remaining
> > blocks are copied into new segments.
>
> I thought the NBLKINC count should not increase much when there are
> no user writes to the filesystem. But it was increasing by several
> thousands on every step.
Hmm, that sounds logical. But this interpretation would change the
semantics considerably, and it requires clarifying what the increment
is supposed to be:
- Blocks duplicated for copy-on-write.
  --> should be treated as increments (maybe)
- Segment headers and the super root block.
  --> treated as an increment at present (should be excluded from
      the above standpoint)
- Some nilfs metadata files such as the sufile, cpfile, and DAT file.
  --> treated as an increment at present.
      These files do not even have past versions, so they differ
      slightly from regular files in that sense.
Fortunately, this value affects neither FS-internal operation nor
cleanerd. Only lscp uses the field. So, it could be changed if no
one objects.
> This filesystem has size of about 300 gigabytes, and all I did, is
> copied about 150 gigabytes of data to it, with no deletions, no
> modifications or rewrites. But even about 6-8 hours after that, the
> cleanerd process still has work to do, so much, that it interferes with
> normal usage of the filesystem (i.e. playing a FLAC audio file from it
> was skipping at times, even counting a bit increased load from going
> through kcryptd, this shouldn't happen).
Sorry. The current cleanerd is far from intelligent, as you pointed
out, and I feel the same way. One of my colleagues has recently been
trying to improve the cleanerd behavior.
> > Did you specify an fstype option?
>
> I had the fstype specified as "none" (in fstab).
>
> > /nilfs2 on /test type none (rw,bind)
>
> Yes, this is what I have.
>
> > and, the second cleanerd has gone:
>
> I have only one cleanerd in the process list.
>
> What is really strange, is that it looks like I can unmount the main
> NILFS, but the bind mount will continue to be accessible and writable.
This is the same as other Linux file systems.
> And when I remount the main mount point back, cleanerd grabs the bind
> mount. See below:
>
> # mount -t nilfs2 /dev/mapper/hi320data /mnt/hi320data
> mount.nilfs2: WARNING! - The NILFS on-disk format may change at any
> time.
> mount.nilfs2: WARNING! - Do not place critical data on a NILFS
> filesystem.
<snip>
> # umount /mnt/bind-test
> umount: /mnt/bind-test: device is busy
> umount: /mnt/bind-test: device is busy
>
> # fuser -c /mnt/bind-test/
> /mnt/bind-test/: 23203
>
> # ps -Af | grep 23203
> root 23203 1 1 19:06 ? 00:00:00 /sbin/nilfs_cleanerd
> -n /dev/mapper/hi320data /mnt/hi320data
>
> -----------------------
> Maybe it misdetects the bind mount as another mount of the same
> filesystem (e.g. like when mounting a snapshot)?
Thanks! I could finally reproduce the problem.
Nilfs snapshots are read-only, but bind mounts can create multiple
r/w mounts, and cleanerd prevents umount in such cases. As expected,
I have to take bind mounts into account in the mount/umount helper
programs. I'd like to fix this by the next release.
Regards,
Ryusuke Konishi
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090317191924.1924576b-2Ve/5xEMxL0@public.gmane.org>
2009-03-22 17:53 ` Ryusuke Konishi
@ 2009-03-31 14:44 ` Ryusuke Konishi
[not found] ` <20090331.234437.112904649.ryusuke-sG5X7nlA6pw@public.gmane.org>
1 sibling, 1 reply; 8+ messages in thread
From: Ryusuke Konishi @ 2009-03-31 14:44 UTC (permalink / raw)
To: roman-ohbefDXYbNo; +Cc: users-JrjvKiOkagjYtjvyW6yDsg
Hi,
On Tue, 17 Mar 2009 19:19:24 +0500, Roman Mamedov <roman-ohbefDXYbNo@public.gmane.org> wrote:
> What is really strange, is that it looks like I can unmount the main
> NILFS, but the bind mount will continue to be accessible and writable.
> And when I remount the main mount point back, cleanerd grabs the bind
> mount. See below:
>
> # mount -t nilfs2 /dev/mapper/hi320data /mnt/hi320data
> mount.nilfs2: WARNING! - The NILFS on-disk format may change at any
> time.
> mount.nilfs2: WARNING! - Do not place critical data on a NILFS
> filesystem.
>
> # mount --bind /mnt/hi320data/test/ /mnt/bind-test/
>
> # ls /mnt/bind-test/
>
> # touch /mnt/bind-test/abc
>
> # umount /mnt/hi320data
>
> # umount /mnt/hi320data
> umount: /mnt/hi320data: not mounted
>
> # touch /mnt/bind-test/def
>
> # ls /mnt/bind-test/
> abc def
>
> # mount -t nilfs2 /dev/mapper/hi320data /mnt/hi320data
> mount.nilfs2: WARNING! - The NILFS on-disk format may change at any
> time.
> mount.nilfs2: WARNING! - Do not place critical data on a NILFS
> filesystem.
>
> # ls /mnt/hi320data/test/
> abc def
>
> # umount /mnt/bind-test
> umount: /mnt/bind-test: device is busy
> umount: /mnt/bind-test: device is busy
>
> # fuser -c /mnt/bind-test/
> /mnt/bind-test/: 23203
>
> # ps -Af | grep 23203
> root 23203 1 1 19:06 ? 00:00:00 /sbin/nilfs_cleanerd
> -n /dev/mapper/hi320data /mnt/hi320data
>
> -----------------------
> Maybe it misdetects the bind mount as another mount of the same
> filesystem (e.g. like when mounting a snapshot)?
I'm now considering not invoking cleanerd for bind mounts. It is
too complex to switch the base directory of cleanerd in step with
mount/umount events like those above. (It is possible, but I'd rather
avoid complicating the lifetime management of cleanerd any further.)
In addition, a bind mount can attach only part of a single
filesystem, and therefore the .nilfs file, which is required for
proper cleaner operation, is not guaranteed to exist at the top of
the mount point for bind mounts.
Would it cause you any inconvenience if I disabled GC for bind mounts?
Usually a bind mount has an original rw-mount, though Linux can
detach the original mount before the bind mount.
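The proposal above amounts to a check on the mount options before cleanerd is spawned. The following is a minimal sketch of that decision only, not the actual mount.nilfs2 logic; the function name `should_start_cleanerd` and the idea of passing the option list as a single comma-separated string are assumptions made for illustration.

```shell
# Return 0 (start cleanerd) unless the comma-separated option list
# contains the exact option "bind". Hypothetical helper for illustration.
should_start_cleanerd() {
    case ",$1," in
        *,bind,*) return 1 ;;
    esac
    return 0
}

for opts in "rw" "rw,bind"; do
    if should_start_cleanerd "$opts"; then
        echo "$opts: start cleanerd"
    else
        echo "$opts: skip cleanerd"
    fi
done
```

Wrapping the list in commas before matching keeps an option that merely contains the substring "bind" from being treated as a bind mount.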
Regards,
Ryusuke Konishi
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090331.234437.112904649.ryusuke-sG5X7nlA6pw@public.gmane.org>
@ 2009-03-31 15:10 ` Ryusuke Konishi
[not found] ` <20090401.001059.06956465.ryusuke-sG5X7nlA6pw@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Ryusuke Konishi @ 2009-03-31 15:10 UTC (permalink / raw)
To: roman-ohbefDXYbNo; +Cc: users-JrjvKiOkagjYtjvyW6yDsg
On Tue, 31 Mar 2009 23:44:37 +0900 (JST), Ryusuke Konishi <ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
> Now, I'm considering not to invoke cleanerd for bind-mounts. It is
> too complex to switch over the base directory of cleanerd along with
> mount/umount events as above. (It is possible but I'd rather avoid
> complicating life management of cleanerd any further.)
>
> In addition, the bind mount can attach only part of a single
> filesystem, and therefore the .nilfs file, which is required for
> proper cleaner operation, is not assured to exist on the top of mount
> point for bind mounts.
>
> Do you have any inconvenience if I disable GC for bind mounts?
>
> Usually bind mount has an original rw-mount though Linux can
> detach original mount prior to the bind-mount.
Let me add one thing. Since the bind mount and the original r/w mount
share the same device, a single cleanerd is enough for them. (In
fact, two cleanerds would conflict with one another.) So, we can hand
the GC functionality over to the original mount point without incident.
Ryusuke
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <20090401.001059.06956465.ryusuke-sG5X7nlA6pw@public.gmane.org>
@ 2009-03-31 15:14 ` Alex Bitney
[not found] ` <591bc86b0903310814t2814ab95ie71077d5bb59f99d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 8+ messages in thread
From: Alex Bitney @ 2009-03-31 15:14 UTC (permalink / raw)
To: NILFS Users mailing list
On Tue, Mar 31, 2009 at 3:10 PM, Ryusuke Konishi <ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
> On Tue, 31 Mar 2009 23:44:37 +0900 (JST), Ryusuke Konishi <
> ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
>
> Let me add one thing. Since the bind-mount and original r/w-mount
> share the same device, only one cleanerd is enough for them. (In
> fact, two cleanerds conflict with one another). So, we can give over
> the GC functionality to the original mount point without incident.
Just a question: why can't you use a single cleaner? It would detect all
mounted partitions and perform the needed tasks, maybe in parallel.
>
> Ryusuke
>
Regards,
Alex
* Re: Constant checkpointing and cleanerd activity - doesn't play nice with mount --bind ?
[not found] ` <591bc86b0903310814t2814ab95ie71077d5bb59f99d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-03-31 16:09 ` Ryusuke Konishi
0 siblings, 0 replies; 8+ messages in thread
From: Ryusuke Konishi @ 2009-03-31 16:09 UTC (permalink / raw)
To: users-JrjvKiOkagjYtjvyW6yDsg, san-UZOCzndLCy9BDgjK7y7TUQ
Hi,
On Tue, 31 Mar 2009 15:14:48 +0000, Alex Bitney <san-UZOCzndLCy9BDgjK7y7TUQ@public.gmane.org> wrote:
> On Tue, Mar 31, 2009 at 3:10 PM, Ryusuke Konishi <ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
>
> > On Tue, 31 Mar 2009 23:44:37 +0900 (JST), Ryusuke Konishi <
> > ryusuke-sG5X7nlA6pw@public.gmane.org> wrote:
> >
> > Let me add one thing. Since the bind-mount and original r/w-mount
> > share the same device, only one cleanerd is enough for them. (In
> > fact, two cleanerds conflict with one another). So, we can give over
> > the GC functionality to the original mount point without incident.
>
>
> Just a question; why can't you use single cleaner? It would detect all
> mounted partitions and perform needed tasks, maybe in parallel, as needed.
Well, a single (shared) cleanerd approach does sound better in the
long term. The merit of the current pid- and signal-based (individual
cleaner) approach is its simplicity. So, I think we should shift to
the single-cleaner approach if things become unmanageable with the
current one.
The only reason we are not using a shared cleaner is that we chose
the present approach when we first designed it.
Ryusuke Konishi