public inbox for linux-kernel@vger.kernel.org
From: Matt Mackall <mpm@selenic.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>,
	Artem Bityutskiy <dedekind@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Frank Haverkamp <haver@vnet.ibm.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH 00/22 take 3] UBI: Unsorted Block Images
Date: Mon, 19 Mar 2007 20:05:29 -0500
Message-ID: <20070320010529.GC4892@waste.org>
In-Reply-To: <1174351366.13341.739.camel@localhost.localdomain>

On Tue, Mar 20, 2007 at 01:42:46AM +0100, Thomas Gleixner wrote:
> On Mon, 2007-03-19 at 17:32 -0500, Matt Mackall wrote:
> > > > If a static volume is simply a non-dynamic volume, then device mapper
> > > > can do that too. And countless other things. Which is not an aside.
> > > > UBI growing to do all the things that device mapper does is exactly
> > > > the thing we should be seeking to avoid.
> > > 
> > > No it can't and device mapper sits on top of block devices. FLASH is no
> > > block device. Period.
> > 
> > Which of the following two properties does it lack?
> > 
> > - discrete blocks
> > - non-sequential access to blocks
> > 
> > When you do the obvious s/blocks/eraseblocks/, this appears to be
> > true.
> 
> It appears to be, but it is not. You enforce semantics on a device,
> which it does not have.
> 
> > Saying "but I can't do I/O smaller than the blocksize" doesn't change
> > this any more than it would for disks.
> 
> There is a huge difference. Disk block size is 512 bytes and FLASH block
> size is at least 16KiB and up to 256KiB.
> 
> Just do the math:
> 
> Write sampling data streams in 2KiB chunks to your uber devicemapper on
> a 1GiB device with 64KiB erase block size:
> 
> Fine grained FLASH aware writes allow 32 chunks in a block without
> erasing the block.
> 
> Your method erases the block 32 times to write the same amount of data.

Sigh. That's the current /dev/mtdblock method, not my method. You're too
fixated on what you think I'm saying to hear what I'm saying.
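
For reference, the arithmetic behind the quoted "32 times" figure can be
sketched roughly as below. This is only a toy model under stated assumptions:
the function names are made up for illustration, the naive path assumes every
chunk write erases and rewrites a whole eraseblock, and the flash-aware path
assumes chunks are appended to fresh pages so a block is erased once per fill.

```python
# Toy erase-count model; names and the write policies are illustrative
# assumptions, not taken from any driver.
CHUNK = 2 * 1024        # 2KiB sample chunk (from the example above)
ERASEBLOCK = 64 * 1024  # 64KiB eraseblock (from the example above)


def erases_read_modify_write(total_bytes, chunk_bytes=CHUNK):
    """Naive block emulation: each chunk write costs one eraseblock erase."""
    return total_bytes // chunk_bytes


def erases_page_append(total_bytes, eraseblock_bytes=ERASEBLOCK):
    """Flash-aware appends: one erase per eraseblock filled (rounded up)."""
    return -(-total_bytes // eraseblock_bytes)


chunks_per_block = ERASEBLOCK // CHUNK
naive = erases_read_modify_write(ERASEBLOCK)
aware = erases_page_append(ERASEBLOCK)
print(chunks_per_block, naive, aware)
```

The ratio between the two counts is just the number of chunks per eraseblock,
which is why the write policy matters and the layer it lives in does not.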

> > Saying "but I can do smaller I/O efficiently in some circumstances"
> > also doesn't change it.
> 
> We can do it under _any_ circumstances and that _does_ change it.
> Implementing a clever block device layer on top of UBI is simple and
> would provide FLASH page sized I/O, i.e. 2KiB in the above example.

Yes. I know. I've written a complete (non-Linux) FTL. I know what's
entailed.
 
> > In historical UNIX, some tapes were block devices too. Because they
> > supported seek().
> 
> I'm impressed. How exactly are "some tapes" comparable to FLASH chips ?
> 
> Your next proposal is to throw away MTD-utils and use "mt" instead ?

Don't be an ass. I'm pointing out that not all block devices are disks.
 
> > > Device mapper can not provide a simple easy to decode scheme for boot
> > > loaders. We need to be able to boot out of 512 - 2048 bytes of NAND FLASH
> > > and be able to find the kernel or second stage boot loader in this
> > > unordered device.
> > > 
> > > And no, fixed addresses do not work. Do you want to implement device
> > > mapper into your Initial Bootloader stage ?
> > 
> > This is exactly the same problem as booting on a desktop PC. But
> > somehow LILO manages. My first Linux box had a hell of a lot less disk
> > than the platform I bootstrapped (and wrote NAND drivers for) last
> > month had in NAND.
> 
> No, it is not. You get the absolute sector address of your second stage
> and this is a complete nobrainer. The translation is done in the DISK
> device.

LILO and friends manage to boot systems that use software RAID and
LVM. There are multiple methods. Some use block lists, some use tiny
boot partitions, etc. All of them are applicable to controllerless NAND.
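
To make the block-list method concrete, here is a minimal sketch. Everything
in it is hypothetical: `load_from_block_list`, the dictionary standing in for
flash, and the block numbers are invented for illustration. The assumption is
that an install-time tool records which physical blocks hold the next stage,
so the first stage only needs to read them in order.

```python
# Hypothetical block-list loader; the "flash" dict and block numbers are
# made-up stand-ins, not a real NAND layout.
def load_from_block_list(read_block, block_list):
    """Read the listed blocks in order and concatenate their contents."""
    return b"".join(read_block(n) for n in block_list)


# Physical blocks in arbitrary positions, as an installer might leave them.
flash = {7: b"KERN", 3: b"EL-I", 12: b"MAGE"}

# The list itself is what the tiny first stage has to be able to find.
image = load_from_block_list(flash.__getitem__, [7, 3, 12])
print(image)
```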

> You simply ignore the fact that inside each disk, USB Stick, CF-CARD,
> whatever - there is a more or less intelligent controller device, which
> does the mapping to the physical storage location. There is _NO_ such
> thing on a bare FLASH chip.

How many times do I have to tell you that I wrote a driver for
controllerless NAND just last month?

> How exactly does device mapper:
> 
> A) across device wear levelling ?

The same way UBI does, but encapsulated in a device mapper layer.

> B) dynamic partitioning for FLASH aware file systems ?

See above.

> C) across device wear levelling for FLASH aware file systems ?

See above.

> D) background bit-flip corrections (copying affected blocks and recycling
> the old one) ?

See above.

> E) allow position independent placement of the second stage bootloader ?

See my LILO response, way above.

> > > You need to implement a clever journalling block device
> > > emulator in order to keep the data alive and the FLASH not worn out
> > > within no time. You need the wear levelling, otherwise you can throw
> > > away your FLASH in no time.
> > 
> > And that's why it's in my picture.
> 
> Yes, it is in your picture, but:
> 
> 1) it excludes FLASH aware file systems and UBI does not.
> 2) your picture still does not explain how it achieves the above A),
> B), C), D) and E)
> 
> Your extra path for partitioning(4) and JFFS2 is just a weird hack,
> which makes your proposal completely absurd.

No, it's just there to show the flexibility of device mapper. But I have
the sneaking suspicion you have no idea how device mapper works.

In brief: device mapper takes one or more devices, applies a mapping
to them, and returns a new device. For example, take various spans of
/dev/hda1 and /dev/sda3 and present them as new-device1. Take
new-device1 and transform it with dm-crypt to get new-device2. The
kernel doesn't decide how to do this, any more than it decides where
to mount your filesystems. Userspace does.
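
A toy model of that stacking, for illustration only. This is not the device
mapper API: the class names are invented, sectors are fake byte strings, and
the XOR layer is merely a stand-in for a transforming target such as dm-crypt
(it is not encryption).

```python
# Toy model of stacked mappings; all names here are illustrative assumptions.
class RamDevice:
    """A fake device whose sector n holds b'<name>:<n>'."""
    def __init__(self, name, nsectors):
        self.sectors = [f"{name}:{i}".encode() for i in range(nsectors)]

    def __len__(self):
        return len(self.sectors)

    def read(self, n):
        return self.sectors[n]


class LinearMap:
    """Present (device, start, length) spans as one contiguous new device."""
    def __init__(self, spans):
        self.spans = spans

    def __len__(self):
        return sum(length for _, _, length in self.spans)

    def read(self, n):
        if n < 0:
            raise IndexError("sector out of range")
        for dev, start, length in self.spans:
            if n < length:
                return dev.read(start + n)
            n -= length
        raise IndexError("sector out of range")


class XorMap:
    """Transforming layer: XORs every byte with a key (NOT real encryption)."""
    def __init__(self, dev, key):
        self.dev, self.key = dev, key

    def __len__(self):
        return len(self.dev)

    def read(self, n):
        return bytes(b ^ self.key for b in self.dev.read(n))


hda1 = RamDevice("hda1", 8)
sda3 = RamDevice("sda3", 8)
# "Userspace" picks the table: 3 sectors of hda1 from 2, then 2 of sda3 from 0.
new_device1 = LinearMap([(hda1, 2, 3), (sda3, 0, 2)])
new_device2 = XorMap(new_device1, 0x5A)
```

Each layer only sees a device below it and exposes a device above it, so the
layers can be combined in whatever order the table calls for.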
 
> > > > 5. We don't reimplement higher pieces of the stack (dm-crypt,
> > > >    snapshot, etc.).
> > > 
> > > Why should we reimplement that ?
> > 
> > So that you can get encryption and snapshot, etc.?
> 
> 1. On top of a clever block device.
> 
> 2. UBI can do snapshots by design.

Oh, so you HAVE reimplemented it.

> 3. Encryption should be done on the VFS layer and not below the
> filesystem layer. Doing it inside the block layer or the device mapper
> is broken by design.

That's highly debatable and not a topic for this thread.

-- 
Mathematics is the supreme nostalgia of our time.
