From: dexen deVries <dexen.devries-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: nilfs2 doesn't garbage collect checkpoints for me
Date: Thu, 26 May 2011 20:32:53 +0200
Message-ID: <201105262032.54257.dexen.devries@gmail.com>
In-Reply-To: <BANLkTim4BBKwFJUzbnsKw0_Ru2k8ZW3MYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Hi,
On Thursday 26 of May 2011 20:11:55 you wrote:
> I'm testing nilfs2 and other fs's for use on cheap flash cards, trying
> to avoid writing same location all the time.
I'm using nilfs2 on a small server with a cheap-o 16GB SSD extracted from
Eeepc for the same reason; works great.
> My test program makes lots of small sqlite transactions which sqlite
> syncs to disk.
> In less than 2000 transaction 1GB nilfs2 volume ran out of disk space.
> tried unmount, mount again, didn't help
> block device is nbd, works with other fs's
>
> lscp shows there are 7121 checkpoints and somehow old ones are not
> removed automatically.
First off, the default configuration of nilfs_cleanerd is to keep all
checkpoints for at least one hour (3600 seconds). See file
/etc/nilfs_cleanerd.conf, option `protection_period'. For testing you may want
to change the protection period to just a few seconds and see if that helps.
You can do that either via the config file (then send SIGHUP to nilfs_cleanerd
so it reloads the config) or via the `-p SECONDS' argument (see manpage).
To see what's going on, you may want to change (temporarily) the
`log_priority' in config file to `debug'; in /var/log/debug you should then see
statements describing actions of the nilfs_cleanerd.
Example:
May 26 20:23:53 blitz nilfs_cleanerd[3198]: wake up
May 26 20:23:53 blitz nilfs_cleanerd[3198]: ncleansegs = 1175
May 26 20:23:53 blitz nilfs_cleanerd[3198]: 4 segments selected to be cleaned
May 26 20:23:53 blitz nilfs_cleanerd[3198]: protected checkpoints = [156725,157003] (protection period >= 1306430633)
May 26 20:23:53 blitz nilfs_cleanerd[3198]: segment 1844 cleaned
May 26 20:23:53 blitz nilfs_cleanerd[3198]: segment 1845 cleaned
May 26 20:23:53 blitz nilfs_cleanerd[3198]: segment 1846 cleaned
May 26 20:23:53 blitz nilfs_cleanerd[3198]: segment 1847 cleaned
May 26 20:23:53 blitz nilfs_cleanerd[3198]: wait 0.488223000
where `ncleansegs' is the number of clean (free) segments you already
have, and `protected checkpoints' indicates the range of checkpoint numbers that
are still under protection (due to the `protection_period' setting).
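As a sanity check on the log above: the number after `protection period >=' looks like a Unix timestamp, and checkpoints created at or after it are the protected ones. A small Python sketch, assuming the syslog timestamp is local time at UTC+2 (as in the Date header of this mail) and the year is 2011, neither of which is stated in the log itself:

```python
from datetime import datetime, timezone, timedelta

# Value copied from the log excerpt: "protection period >= 1306430633".
threshold = 1306430633

# Assumption: the syslog line "May 26 20:23:53" is local time at UTC+2, year 2011.
local = timezone(timedelta(hours=2))
logged_at = datetime(2011, 5, 26, 20, 23, 53, tzinfo=local)

# Distance between the moment the cleaner logged and the protection threshold.
age = int(logged_at.timestamp()) - threshold
print(age)  # 3600
```

Under those assumptions the threshold is exactly one hour before the log entry, which matches the default `protection_period' of 3600 seconds.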
In any case, my understanding is that in a typical DB, each transaction (which
may be each command, if you don't begin/commit a transaction explicitly) causes
an fsync(), which creates a new checkpoint. On a small drive that *may* create
so many checkpoints in a short time that they don't get GC'd before the drive
fills up. Not sure yet how to work around that.
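One thing that might be worth trying is grouping many statements into one explicit transaction, so the database commits (and presumably syncs) once per batch rather than once per statement. The following is only a sketch using Python's standard sqlite3 module. The table `t' and the helper `insert_rows' are made up for illustration, and whether fewer commits really means fewer nilfs2 checkpoints is an untested assumption.

```python
import os
import sqlite3
import tempfile

def insert_rows(path, rows, batch):
    """Insert rows; batch=True wraps them all in one explicit transaction."""
    # isolation_level=None puts the sqlite3 module in autocommit mode,
    # so a transaction is only opened by our explicit BEGIN.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS t (v INTEGER)")
    if batch:
        conn.execute("BEGIN")
    for v in rows:
        # Without an enclosing BEGIN, each INSERT is its own transaction.
        conn.execute("INSERT INTO t (v) VALUES (?)", (v,))
    if batch:
        conn.execute("COMMIT")
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count

with tempfile.TemporaryDirectory() as d:
    # 100 separate transactions versus a single one; same data either way.
    n_single = insert_rows(os.path.join(d, "single.db"), range(100), batch=False)
    n_batch = insert_rows(os.path.join(d, "batch.db"), range(100), batch=True)
    print(n_single, n_batch)
```

Both variants store the same rows; the difference is only in how many commits the database performs. Running each against a nilfs2 mount and comparing the `lscp' output before and after would show whether it matters in practice.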
Two more possible sources of the problem:
1) GC used to break in a certain scenario: the FS could become internally
inconsistent (no data loss, but it wouldn't perform GC anymore) if two or more
nilfs_cleanerd processes were working on it at the same time. It's probably
fixed with the most recent patches. To check if that's the case, see the output
of the `dmesg' command; it would indicate problems in NILFS.
2) a new `nilfs_cleanerd' process may become stuck on a semaphore if you kill the
old one hard (for example, kill -9). That used to leave an aux file in /dev/shm/,
like /dev/shm/sem.nilfs-cleaner-2067. To check if that's the case, run
nilfs_cleanerd through strace, like:
# strace -f nilfs_cleanerd /dev/YOUR_FILESYSTEM
If it hangs at some point on a futex() call, that's it. A brute-force but
sure-fire fix is to kill all instances of nilfs_cleanerd and remove the files
matching /dev/shm/sem.nilfs-cleaner-*
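To see whether any such leftover files exist before removing anything, a short Python sketch can list them. The helper name `stale_semaphore_files' is made up for illustration; it only lists files and deletes nothing, and whether a listed file is actually stale still has to be judged by checking that no nilfs_cleanerd is running.

```python
import glob
import os

def stale_semaphore_files(shm_dir="/dev/shm"):
    """Return paths matching sem.nilfs-cleaner-* under shm_dir, sorted."""
    return sorted(glob.glob(os.path.join(shm_dir, "sem.nilfs-cleaner-*")))

# Print candidates only; removal is left as a deliberate manual step.
for path in stale_semaphore_files():
    print(path)
```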
Hope that helps somehow~
--
dexen deVries
``One can't proceed from the informal to the formal by formal means.''