public inbox for linux-mtd@lists.infradead.org
From: "Jörn Engel" <joern@logfs.org>
To: Jamie Lokier <jamie@shareable.org>
Cc: "Jörn Engel" <joern@logfs.org>,
	linux-mtd@lists.infradead.org,
	"Glenn Henshaw" <thraxisp@logicaloutcome.ca>
Subject: Re: Jffs2 and big file = very slow jffs2_garbage_collect_pass
Date: Sat, 19 Jan 2008 03:38:39 +0100
Message-ID: <20080119023838.GA17136@lazybastard.org>
In-Reply-To: <20080119002302.GA567@shareable.org>

On Sat, 19 January 2008 00:23:02 +0000, Jamie Lokier wrote:
> Jörn Engel wrote:
> > 
> > There are two ways to solve this problem:
> > 1. Reserve some amount of free space for GC performance.
> 
> The real difficulty is that it's not clear how much to reserve for
> _reliable_ performance.  We're left guessing based on experience, and
> that gives only limited confidence.  The 5 blocks suggested in JFFS2
> docs seemed promising, but didn't work out.  Perhaps it does work with
> 5 blocks, but you have to count all potential metadata overhead and
> misalignment overhead when working out how much free "file" data that
> translates to?

The five blocks work well enough if your goal is that GC will return
_eventually_.  Now you come along and even want it to return within a
reasonable amount of time.  That is a different problem. ;)

Math is fairly simple.  The worst case is when the write pattern is
completely random and every block contains the same amount of data.  Let
us pick a 99% full filesystem for starters.

In order to write one block worth of data, GC needs to move 99 blocks
worth of old data around before it has freed a full block.  So on
average 99% of all writes handle GC data and only 1% handle the data you
- the user - care about.  If your filesystem is 80% full, 80% of all
writes are GC data and 20% are user data.  Very simple.
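
The arithmetic above can be written down as a rough sketch.  It assumes
the worst case described here (random writes, every block equally full);
the function name gc_write_split is made up for illustration and is not
part of JFFS2 or logfs.

```python
def gc_write_split(fill_fraction):
    """Worst-case split of flash writes between GC data and user data.

    Assumption: every block holds the same share of live data, equal to
    the overall fill level of the filesystem (0.0 <= fill_fraction < 1.0).
    """
    if not 0.0 <= fill_fraction < 1.0:
        raise ValueError("fill_fraction must be in [0.0, 1.0)")
    gc_share = fill_fraction          # share of writes that move old data
    user_share = 1.0 - fill_fraction  # share of writes that carry new data
    # Blocks' worth of old data moved for each block of user data written.
    moved_per_user_block = gc_share / user_share
    # Total flash writes per unit of user data.
    write_amplification = 1.0 / user_share
    return gc_share, user_share, moved_per_user_block, write_amplification


if __name__ == "__main__":
    for fill in (0.80, 0.99):
        gc, user, moved, amp = gc_write_split(fill)
        print(f"{fill:.0%} full: {gc:.0%} GC writes, {user:.0%} user writes, "
              f"{moved:.0f} blocks moved per block written")
```

This only models throughput.  It says nothing about whether the GC work
arrives continuously or in batches.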

Latency is a different problem.  Depending on your design, those 80% or
99% GC writes can happen continuously or in huge batches.

> Really, some of us just want JFFS2 to return -ENOSPC
> at _some_ sensible deterministic point before the GC might behave
> peculiarly, rather than trying to squeeze as much as possible onto the
> partition.

Logfs has a field defined for GC reserve space.  I know the problem and
I care about it.  Although I have to admit that mkfs doesn't allow
setting this field yet.

> > 2. Write in some non-random fashion.
> > 
> > Solution 2 works even better if the filesystem actually sorts data
> > very roughly by life expectancy.  That requires writing to several
> > blocks in parallel, i.e. one for long-lived data, one for short-lived
> > data.  Made an impressive difference in logfs when I implemented that.
> 
> Ah, a bit like generational GC :-)

Actually, no.  The different levels of the tree, which JFFS2 doesn't
store on the medium, also happen to have vastly different lifetimes.
Generational GC is the logical next step, which I haven't done yet.
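
Why sorting by life expectancy helps can be shown with a toy model.  The
assumptions are deliberately idealized and are not how logfs measures
anything: short-lived data is always obsolete by the time GC picks a
block, long-lived data is always still live, and the function name
moved_per_freed_block is hypothetical.

```python
def moved_per_freed_block(long_lived_fraction, separated):
    """Blocks' worth of live data GC must copy to free one full block.

    Toy model: short-lived data is obsolete when GC runs, long-lived
    data is still live.  long_lived_fraction is the share of written
    data that is long-lived (0.0 <= long_lived_fraction < 1.0).
    """
    if not 0.0 <= long_lived_fraction < 1.0:
        raise ValueError("long_lived_fraction must be in [0.0, 1.0)")
    if separated:
        # Short-lived data sits in its own blocks, which are completely
        # obsolete when GC reaches them: nothing has to be copied.
        return 0.0
    # Mixed blocks: each one still holds long_lived_fraction live data
    # and yields only (1 - long_lived_fraction) of free space.
    return long_lived_fraction / (1.0 - long_lived_fraction)


if __name__ == "__main__":
    for frac in (0.2, 0.5, 0.8):
        mixed = moved_per_freed_block(frac, separated=False)
        split = moved_per_freed_block(frac, separated=True)
        print(f"long-lived share {frac:.0%}: mixed {mixed:.2f}, separated {split:.2f}")
```

Real workloads sit somewhere between the two cases, since lifetimes are
never known exactly in advance.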

Jörn

-- 
Science is like sex: sometimes something useful comes out,
but that is not the reason we are doing it.
-- Richard Feynman

