public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Frank Haverkamp <haver@vnet.ibm.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH 00/22 take 3] UBI: Unsorted Block Images
Date: Mon, 19 Mar 2007 14:54:42 -0500	[thread overview]
Message-ID: <20070319195442.GT4892@waste.org> (raw)
In-Reply-To: <1174328188.30079.46.camel@zod.rchland.ibm.com>

On Mon, Mar 19, 2007 at 01:16:28PM -0500, Josh Boyer wrote:
> On Mon, 2007-03-19 at 12:08 -0500, Matt Mackall wrote:
> > 
> > > > If the end goal is to end up with something that looks like a block
> > > > device (which seems to be implied by adding transparent wear leveling
> > > 
> > > Nope, not the end goal.  It's more about wear-leveling across the entire
> > > flash chip than it is presenting a "block like" device.
> > 
> > It seems to be about spanning devices and repartitioning as well.
> > Hence the analogy with LVM.
> 
> Yes, it can span multiple MTDs which spreads the wear-leveling even
> more.  Yes, it can create/resize/remove volumes.  It does that
> differently than LVM, but the ideas are related.  I don't see the issue
> here I guess.

The issue is 14000 lines of patch to make a parallel subsystem.

> (UBI also has static volumes which LVM doesn't but that is an aside.)

If a static volume is simply a non-dynamic volume, then device mapper
can do that too. And countless other things. Which is not an aside.
UBI growing to do all the things that device mapper does is exactly
the thing we should be seeking to avoid.
 
> > > > and bad block remapping), then I don't see any reason it can't be done
> > > > in device mapper. The 'smarts' of mtdblock could in fact be pulled up
> > > 
> > > There is nothing smart about mtdblock.  And mtdblock has nothing to do
> > > with UBI.
> > 
> > Note the scare quotes. Device mapper runs on top of a block device.
> > And mtdblock is currently the block interface that MTD exports. And it
> > has 'smarts' that hide handling of sub-eraseblock I/O. I'm clearly
> > talking about an approach that doesn't involve UBI at all.
> 
> Ok, but what I'm saying is that using device mapper on top of mtdblock
> is not a good solution.  mtdblock caches writes within an eraseblock to
> a DRAM buffer of eraseblock size.  If you get a power failure before
> that is flushed out, you lose an entire eraseblock's worth of data.

Sigh. That's precisely why I talked about moving said smarts. This is
nothing that a higher level remapping layer can't address.
 
> That's why I suggested fixing the MTD layers that present block devices
> first in the part of my reply that you cut off.  It seems to me that
> you're really after getting flash to look like a block device, which
> would enable device mapper to be used for something similar to UBI.
> That's fine, but until someone does that work UBI fills a need, has
> users, and has an existing implementation.

False starts that get mainlined delay or prevent things getting done
right. The question is and remains "is UBI the right way to do
things?" Not "is UBI the easiest way to do things?" or "is UBI
something people have already adopted?"

If the right way is instead to extend the block layer and device
mapper to encompass the quirks of NAND in a sensible fashion, then UBI
should not go in.

Let me draw a picture so we have something to argue about:

                     iSCSI/nbd(6)
                          |
filesystem {        swap  |  ext3        ext3     jffs2
                      \   |   |            |       /
               /       \  | dm-crypt->snapshot(5) /
device mapper -|        \ \   |                  /
               |         partitioning           /
               |              |          partitioning(4)
               |        wear leveling(3)  /
               |              |          /
               |      block concatenation
               |       |    |    |     |
               \      bad block remapping(2)   
                       |    |    |     |
MTD raw block {     raw block devices with no smarts(1)
                      /     |     \      \
hardware {         NAND    NAND   NAND   NAND

Notes:
1. This would provide a block device that allowed writing pages and
   a secondary method for erasing whole blocks as well as a method for
   querying/setting out of band information.
2. This would hide erase blocks either by using an embedded table or
   out of band info. This could stack on top of block concatenation if
   desired.
3. This would provide wear leveling, and probably simultaneously
   provide relatively efficient and safe access to write sector 
   and page-sized I/O. Below this level, things had better be
   comfortable with the limitations of NAND if they want to work well.
4. JFFS2 has its own wear-leving scheme, as do several other
   filesystems, so they probably want to bypass this piece of the stack.
5. We don't reimplement higher pieces of the stack (dm-crypt,
   snapshot, etc.).
6. We make some things possible that simply aren't otherwise.

And this picture isn't even interesting yet. Imagine a dm-cache layer
that caches data read from disks in high-speed flash. Or using
dm-mirror to mirror writes to local flash over NBD or to a USB drive.
Neither of these can be done 'right' in a stack split between device
mapper and UBI.

-- 
Mathematics is the supreme nostalgia of our time.

  reply	other threads:[~2007-03-19 20:08 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-14 15:19 [PATCH 00/22 take 3] UBI: Unsorted Block Images Artem Bityutskiy
2007-03-14 15:19 ` [PATCH 01/22 take 3] UBI: on-flash data structures header Artem Bityutskiy
2007-03-14 15:19 ` [PATCH 02/22 take 3] UBI: user-space API header Artem Bityutskiy
2007-03-14 15:19 ` [PATCH 03/22 take 3] UBI: kernel-space " Artem Bityutskiy
2007-03-14 15:19 ` [PATCH 04/22 take 3] UBI: internal header Artem Bityutskiy
2007-03-14 15:19 ` [PATCH 05/22 take 3] UBI: startup code Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 06/22 take 3] UBI: scanning unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 07/22 take 3] UBI: I/O unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 08/22 take 3] UBI: volume table unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 09/22 take 3] UBI: wear-leveling unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 10/22 take 3] UBI: EBA unit Artem Bityutskiy
2007-03-15 19:07   ` Andrew Morton
2007-03-15 21:24     ` Randy Dunlap
2007-03-15 23:29       ` Josh Boyer
2007-03-16  1:49         ` Randy Dunlap
2007-03-16 10:23           ` Artem Bityutskiy
2007-03-16 10:21       ` Artem Bityutskiy
2007-03-16 14:55         ` Randy Dunlap
2007-03-16 10:14     ` Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 11/22 take 3] UBI: user-interfaces unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 12/22 take 3] UBI: update functionality Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 13/22 take 3] UBI: accounting unit Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 14/22 take 3] UBI: volume management functionality Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 15/22 take 3] UBI: sysfs functionality Artem Bityutskiy
2007-03-14 15:20 ` [PATCH 16/22 take 3] UBI: character devices functionality Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 17/22 take 3] UBI: gluebi functionality Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 18/22 take 3] UBI: misc stuff Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 19/22 take 3] UBI: debugging stuff Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 20/22 take 3] UBI: JFFS2 UBI support Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 21/22 take 3] UBI: update MAINTAINERS Artem Bityutskiy
2007-03-14 15:21 ` [PATCH 22/22 take 3] UBI: Linux build integration Artem Bityutskiy
2007-03-18 16:27 ` [PATCH 00/22 take 3] UBI: Unsorted Block Images Matt Mackall
2007-03-18 16:49   ` Artem Bityutskiy
2007-03-18 19:18     ` Matt Mackall
2007-03-18 20:31       ` Josh Boyer
2007-03-19 17:08         ` Matt Mackall
2007-03-19 18:16           ` Josh Boyer
2007-03-19 19:54             ` Matt Mackall [this message]
2007-03-19 20:18               ` Artem Bityutskiy
2007-03-19 21:05               ` Thomas Gleixner
2007-03-19 22:32                 ` Matt Mackall
2007-03-20  0:42                   ` Thomas Gleixner
2007-03-20  1:05                     ` Matt Mackall
2007-03-20  6:28                       ` Thomas Gleixner
2007-03-21 11:05                     ` Jörn Engel
2007-03-21 11:25                       ` Thomas Gleixner
2007-03-21 11:35                         ` Jörn Engel
2007-03-21 11:57                           ` Thomas Gleixner
2007-03-21 12:31                             ` Jörn Engel
2007-03-21 12:39                               ` Artem Bityutskiy
2007-03-21 11:36                         ` Artem Bityutskiy
2007-03-25 20:08                         ` Jörn Engel
2007-03-25 21:49                           ` David Lang
2007-03-25 22:55                             ` Jörn Engel
2007-03-25 23:46                               ` David Woodhouse
2007-03-26  0:01                                 ` Jörn Engel
2007-03-26  0:21                                   ` David Woodhouse
2007-03-26  1:04                                     ` Jörn Engel
2007-03-26  9:45                                       ` David Woodhouse
2007-03-26  9:51                                         ` Jörn Engel
2007-03-26 10:07                                           ` David Woodhouse
2007-03-26 10:02                                         ` Thomas Gleixner
2007-03-26 10:49                           ` Artem Bityutskiy
2007-03-26 11:30                             ` Jörn Engel
2007-03-19 21:06               ` Artem Bityutskiy
2007-03-19 21:36                 ` Matt Mackall
2007-03-20  0:43                   ` Thomas Gleixner
2007-03-20 12:25                   ` Artem Bityutskiy
2007-03-20 13:52                     ` Theodore Tso
2007-03-20 15:14                       ` Artem Bityutskiy
2007-03-20 15:59                       ` Josh Boyer
2007-03-20 18:58                         ` David Lang
2007-03-20 20:05                           ` Artem Bityutskiy
2007-03-20 21:36                             ` David Woodhouse
2007-03-21  8:54                               ` Artem Bityutskiy
2007-03-20 21:32                           ` David Woodhouse
2007-03-21 13:03                             ` Jörn Engel
2007-03-20 22:03                         ` Theodore Tso
2007-03-21  8:44                           ` Artem Bityutskiy
2007-03-21 13:50                             ` Theodore Tso
2007-03-21 13:59                               ` Josh Boyer
2007-03-21 14:02                               ` Artem Bityutskiy
2007-03-21 15:38                               ` Frank Haverkamp
2007-03-21 20:26                                 ` David Lang
2007-03-20 12:13               ` Josh Boyer
2007-03-19 19:03           ` Thomas Gleixner
2007-03-19 20:12             ` Matt Mackall
2007-03-19 21:04               ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070319195442.GT4892@waste.org \
    --to=mpm@selenic.com \
    --cc=dedekind@infradead.org \
    --cc=dwmw2@infradead.org \
    --cc=haver@vnet.ibm.com \
    --cc=hch@infradead.org \
    --cc=jwboyer@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox