From: Arnd Bergmann <arnd@arndb.de>
To: Pavel Machek <pavel@ucw.cz>
Cc: linux-arm-kernel@lists.infradead.org,
	Andrei Warkentin <andreiw@motorola.com>,
	linux-fsdevel@vger.kernel.org, linux-mmc@vger.kernel.org
Subject: Re: MMC quirks relating to performance/lifetime.
Date: Tue, 8 Mar 2011 15:03:41 +0100
Message-ID: <201103081503.42297.arnd@arndb.de>
In-Reply-To: <20110308065911.GC1357@ucw.cz>

On Tuesday 08 March 2011, Pavel Machek wrote:
> > > 
> > > How big is performance difference?
> > 
> > Several orders of magnitude. It is very easy to get a card that can write
> > 12 MB/s into a case where it writes no more than 30 KB/s, doing only
> > things that happen frequently with ext3.
> 
> Ungood.
> 
> I guess we should create something like loopback device, which knows
> about flash specifics, and does the right coalescing so that card
> stays in the fast mode?

I have listed a few suggestions for areas to work on in my article
at https://lwn.net/Articles/428584/. My idea was to use a device mapper
target, as described in https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashDeviceMapper
but a loopback device might work as well.

The other area that I think will help a lot is to make the I/O
scheduler aware of the erase block size and the preferred access
patterns.
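
As a rough illustration of what "aware of the erase block size" could
mean, here is a minimal sketch that splits a request so that no piece
crosses an erase block boundary. The function name and the 4 MB default
are my own assumptions for the example, not an existing kernel
interface, and a real scheduler would work on sectors and requests
rather than byte ranges.

```python
def split_at_erase_blocks(offset, length, erase_block=4 * 1024 * 1024):
    """Split the byte range [offset, offset + length) into pieces that
    never cross an erase block boundary.

    erase_block is an assumed, configurable size (4 MB here).
    Returns a list of (offset, length) tuples in ascending order.
    """
    pieces = []
    end = offset + length
    while offset < end:
        # First erase block boundary strictly after the current offset.
        boundary = (offset // erase_block + 1) * erase_block
        piece_end = min(end, boundary)
        pieces.append((offset, piece_end - offset))
        offset = piece_end
    return pieces
```

A 2 MB write starting at the 3 MB mark would come out as two 1 MB
pieces, one ending a segment and one starting the next, so the
scheduler could treat them as belonging to different segments.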
 
> ...or, do we need to create new, simple filesystem with layout similar
> to fat32, for use on mmc cards?

It doesn't need to be similar to fat32, but creating a new file system
could fix this, too. Microsoft seems to have built exFAT around
cheap flash devices, though they don't document exactly what it does.
I think we can do better than that, and I still want to find out
how close nilfs2 and btrfs can actually get to the optimum.

Note that it's not just MMC cards, though: you get the exact same
effects on some low-end SSDs (which are basically repackaged CF
cards) and most USB sticks. The best USB sticks I have seen
can hide some effects with a bit of caching, and they have a higher
number of open segments than the cheap ones, but the basic
problems are unchanged.

The requirements for a good low-end flash optimized file system
would be roughly:

1. Do all writes in chunks of 32 or 64 KB. If there is less
   data to write, fill the chunk with zeroes and clean up later,
   but don't write more data to the same chunk.
2. Start writing on a segment (e.g. 4 MB, configurable) boundary,
   then write that segment to the end using the chunks mentioned
   above.
3. Erase full segments using trim/erase/discard before writing
   to them, if supported by the drive.
4. Have a configurable number of segments open for writing, i.e.
   you have written blocks at the start of the segment but not
   filled the segment to the end. Typical hardware limitations
   are between 1 and 10 open segments.
5. Keep all metadata within a single 4 MB segment. Drives that cannot
   do random access within normal segments can do it in the area
   that holds the FAT. If 4 MB is not enough, the FAT area can be
   used as a journal or cache, for a larger metadata area that gets
   written less frequently.
6. Because of the requirement to erase 4 MB segments at once, there
   needs to be garbage collection to free up space. The quality
   of the garbage collection algorithm directly relates to the
   performance on full file systems and/or the space overhead.
7. Some static wear levelling is required to increase the expected
   life of consumer devices that only do dynamic wear levelling,
   i.e. the segments that contain purely static data need to
   be written occasionally so they make it back into the
   wear levelling pool of the hardware.
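
To make requirements 1, 2 and 4 a bit more concrete, here is a minimal
sketch of a write allocator, assuming 64 KB chunks, 4 MB segments and a
limit of four open segments. All names (pad_to_chunk, SegmentWriter,
the "stream" argument) are made up for the example, the segment size is
assumed to be a multiple of the chunk size, and garbage collection,
discard and wear levelling (3, 6 and 7) are left out entirely.

```python
CHUNK = 64 * 1024            # assumed write chunk size (rule 1)
SEGMENT = 4 * 1024 * 1024    # assumed segment size (rule 2)


def pad_to_chunk(data, chunk=CHUNK):
    """Zero-fill data up to the next multiple of the chunk size."""
    padded_len = -(-len(data) // chunk) * chunk   # ceiling division
    return data + b"\0" * (padded_len - len(data))


class SegmentWriter:
    """Hands out chunk-sized writes that fill segments front to back."""

    def __init__(self, num_segments, max_open=4, segment=SEGMENT, chunk=CHUNK):
        self.segment = segment
        self.chunk = chunk
        self.max_open = max_open
        self.free = list(range(num_segments))  # segments assumed erased
        self.open = {}   # stream id -> (segment index, fill offset)

    def write(self, stream, data):
        """Return a list of (device_offset, chunk_bytes) writes for data."""
        if stream not in self.open and len(self.open) >= self.max_open:
            # Rule 4: never exceed the number of open segments.
            raise RuntimeError("too many open segments")
        payload = pad_to_chunk(data, self.chunk)
        ops = []
        for pos in range(0, len(payload), self.chunk):
            if stream not in self.open:
                # Rule 2: start on a segment boundary. An empty free list
                # raises IndexError; that is where garbage collection
                # would have to run.
                self.open[stream] = (self.free.pop(0), 0)
            seg, fill = self.open[stream]
            ops.append((seg * self.segment + fill,
                        payload[pos:pos + self.chunk]))
            fill += self.chunk
            if fill == self.segment:
                del self.open[stream]      # segment is full, close it
            else:
                self.open[stream] = (seg, fill)
        return ops
```

Each call pads its data to whole chunks and never touches a chunk
again, so a 100-byte write still costs one full 64 KB chunk. That waste
is exactly what the "clean up later" part of requirement 1 and the
garbage collection in requirement 6 would have to recover.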

	Arnd
