public inbox for linux-mtd@lists.infradead.org
* Should flash hardware look like UBI?
@ 2009-10-05 14:05 David Woodhouse
  2009-10-06 11:21 ` Thomas Gleixner
  2009-10-06 14:40 ` Bill Gatliff
  0 siblings, 2 replies; 6+ messages in thread
From: David Woodhouse @ 2009-10-05 14:05 UTC
  To: linux-mtd

I just posted http://www.advogato.org/person/dwmw2/diary/211.html which
laments the common view that future flash storage will just look like a
disk (SSD/eMMC/etc.). I don't think we want that, for the reasons given
there.

But I've also been thinking about exactly what we _do_ want. We
certainly want something a little more capable than the raw interface to
NAND flash that ONFI provides.

For a start, we want ECC to be built-in. We don't want to have to do
correction in software, and we want a 'copy page' command that's
actually useful -- we want it to apply ECC corrections rather than
copying any flipped bits faithfully. Although we probably do want the
_option_ to read the raw data still, without attempting to correct it.
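To make the 'copy page' point concrete, here is a rough sketch in C of the
difference between a raw read and an ECC-correcting on-device copy. It is only
an illustration under stated assumptions: the three-copy majority vote is a toy
stand-in for real NAND ECC, and every name (`toy_page`, `toy_read_raw`,
`toy_copy_page`, ...) is hypothetical, not part of ONFI or any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16  /* toy size; real NAND pages are far larger */

/* Hypothetical on-chip page: three stored copies stand in for data + ECC. */
struct toy_page {
    uint8_t copy[3][TOY_PAGE_SIZE];
};

/* Count the set bits in one byte. */
static int popcount8(uint8_t x)
{
    int n = 0;
    while (x) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}

/* Program a page: store the same data in all three copies. */
static void toy_program(struct toy_page *p, const uint8_t *data)
{
    assert(p != NULL && data != NULL);
    for (int k = 0; k < 3; k++)
        memcpy(p->copy[k], data, TOY_PAGE_SIZE);
}

/* Raw read: return copy 0 exactly as stored, flipped bits included. */
static void toy_read_raw(const struct toy_page *p, uint8_t *out)
{
    memcpy(out, p->copy[0], TOY_PAGE_SIZE);
}

/* Corrected read: bitwise majority vote across the three copies.
 * Returns how many stored bits disagreed with the voted result. */
static int toy_read_corrected(const struct toy_page *p, uint8_t *out)
{
    int flips = 0;
    for (size_t i = 0; i < TOY_PAGE_SIZE; i++) {
        uint8_t a = p->copy[0][i], b = p->copy[1][i], c = p->copy[2][i];
        uint8_t v = (uint8_t)((a & b) | (a & c) | (b & c));
        out[i] = v;
        flips += popcount8(a ^ v) + popcount8(b ^ v) + popcount8(c ^ v);
    }
    return flips;
}

/* The "useful" copy: correct first, then program the clean data, so
 * flipped bits are not carried over. Returns the number of bits fixed. */
static int toy_copy_page(const struct toy_page *src, struct toy_page *dst)
{
    uint8_t buf[TOY_PAGE_SIZE];
    int flips = toy_read_corrected(src, buf);
    toy_program(dst, buf);
    return flips;
}
```

The copy returns the number of repaired bits so the host still learns that the
source page was degrading, even though the data never left the device.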

We _also_ want the capability to do GC and block replacement for
ourselves, taking the opportunity to optimise the file system layout as
we do so. So we want to be _told_ about how many errors were fixed when
we read a block, and we also probably want to be told if we have to
recycle an eraseblock for other reasons (write disturb, etc.)
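As a sketch of what "being told" could mean in practice, the C fragment below
models a per-read status and a host-side policy that decides when to recycle
an eraseblock. The struct, the field names and the three-quarters threshold
are all assumptions made for illustration; nothing here is a real device or
UBI interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status a device could return alongside the read data. */
struct read_status {
    unsigned corrected_bits;  /* bitflips the on-chip ECC repaired */
    bool     uncorrectable;   /* ECC gave up; data cannot be trusted */
    bool     recycle_hint;    /* device asks for recycling, e.g. disturb */
};

enum block_action { BLOCK_OK, BLOCK_RECYCLE, BLOCK_LOST };

/* Host-side policy. ecc_strength is the maximum number of bitflips the
 * ECC can correct per read unit (an assumed, device-reported value). */
static enum block_action classify_read(const struct read_status *st,
                                       unsigned ecc_strength)
{
    assert(ecc_strength > 0);

    if (st->uncorrectable)
        return BLOCK_LOST;

    /* Assumed policy: recycle once three quarters of the correction
     * budget is used up, rounding the threshold upwards. */
    unsigned threshold = (3 * ecc_strength + 3) / 4;

    if (st->recycle_hint || st->corrected_bits >= threshold)
        return BLOCK_RECYCLE;

    return BLOCK_OK;
}
```

Because the decision lives on the host, the file system can fold the recycle
into its own garbage collection instead of having data moved behind its back.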

However, it's acceptable for the device to automatically move data for
us, if the OS lacks that capability or doesn't want to bother.

We don't _necessarily_ mind if the device hides bad blocks for us, and
if it maintains a logical<->physical mapping of eraseblocks. That kind
of thing has been done by hard drives for years, and it's simple enough
that they _ought_ to be able to get it right. Although the reliability
testing on Toshiba LBA-NAND shows that it's obviously possible to screw
it up too :)
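The kind of simple mapping meant here can be sketched in a few lines of C. The
table below maps logical eraseblocks onto good physical ones, skips factory
bad blocks, and swaps in a spare when a block is retired. Sizes and names
(`struct remap`, `remap_retire`, ...) are hypothetical, and a real device
would also have to persist the table across power loss, which this sketch
ignores.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PHYS    8     /* physical eraseblocks on the toy device */
#define NUM_LOGICAL 5     /* eraseblocks exposed to the host */
#define NO_BLOCK    (-1)

struct remap {
    int l2p[NUM_LOGICAL];  /* logical -> physical eraseblock */
    int spare[NUM_PHYS];   /* pool of unused good physical blocks */
    int num_spare;
};

/* Map logical blocks onto the first good physical blocks, in order;
 * remaining good blocks become spares. Returns false if there are too
 * few good blocks, in which case the table must not be used. */
static bool remap_init(struct remap *r, const bool factory_bad[NUM_PHYS])
{
    int mapped = 0;

    r->num_spare = 0;
    for (int p = 0; p < NUM_PHYS; p++) {
        if (factory_bad[p])
            continue;
        if (mapped < NUM_LOGICAL)
            r->l2p[mapped++] = p;
        else
            r->spare[r->num_spare++] = p;
    }
    return mapped == NUM_LOGICAL;
}

/* Translate a logical eraseblock number to its physical block. */
static int remap_lookup(const struct remap *r, int logical)
{
    assert(logical >= 0 && logical < NUM_LOGICAL);
    return r->l2p[logical];
}

/* The block behind 'logical' went bad: replace it with a spare.
 * Returns the new physical block, or NO_BLOCK if no spare is left
 * (the old mapping is then left untouched). */
static int remap_retire(struct remap *r, int logical)
{
    assert(logical >= 0 && logical < NUM_LOGICAL);
    if (r->num_spare == 0)
        return NO_BLOCK;
    r->l2p[logical] = r->spare[--r->num_spare];
    return r->l2p[logical];
}
```

Note that this table does no wear levelling at all; it only hides bad blocks,
which is the part that is simple enough to stand a chance of being done right.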

It occurs to me that the interface that UBI presents to the higher
layers is actually a fairly close fit to what we want from hardware.

And where it _isn't_, there's something to be said for changing UBI to
match what we want (with notifications for 'block recycle needed', for
reading blocks without actually returning the data, just to check if the
ECC shows any bitflips, etc.).

Discuss.

-- 
dwmw2


* Re: Should flash hardware look like UBI?
  2009-10-05 14:05 Should flash hardware look like UBI? David Woodhouse
@ 2009-10-06 11:21 ` Thomas Gleixner
  2009-10-06 12:05   ` David Woodhouse
  2009-10-06 14:40 ` Bill Gatliff
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2009-10-06 11:21 UTC
  To: David Woodhouse; +Cc: linux-mtd

David,

On Mon, 5 Oct 2009, David Woodhouse wrote:

> I just posted http://www.advogato.org/person/dwmw2/diary/211.html which
> laments the common view that future flash storage will just look like a
> disk (SSD/eMMC/etc.). I don't think we want that, for the reasons given
> there.

You forgot to mention that the pre SSD FLASH storage (USB sticks, CF
Cards ...) has already established a proven track record of
unreliability. I still have the recommendation of keeping the power
supply stable for min. 400ms after the last access to the device which
I got from a well known "industrial grade" manufacturer.

I admit that certain SSD manufacturers seem to have slightly more
clue, but that does not change the fact that we need to implement a
filesystem on top of another filesystem.

> But I've also been thinking about exactly what we _do_ want. We
> certainly want something a little more capable than the raw interface to
> NAND flash that ONFI provides.
> 
> For a start, we want ECC to be built-in. We don't want to have to do
> correction in software, and we want a 'copy page' command that's
> actually useful -- we want it to apply ECC corrections rather than
> copying any flipped bits faithfully. Although we probably do want the
> _option_ to read the raw data still, without attempting to correct it.

Ack.
 
> We _also_ want the capability to do GC and block replacement for
> ourselves, taking the opportunity to optimise the file system layout as
> we do so. So we want to be _told_ about how many errors were fixed when
> we read a block, and we also probably want to be told if we have to
> recycle an eraseblock for other reasons (write disturb, etc.)
> 
> However, it's acceptable for the device to automatically move data for
> us, if the OS lacks that capability or doesn't want to bother.

Hmm, that's going to be a nightmare. Either let the OS sort it out or
let the Firmware deal with it.
 
> We don't _necessarily_ mind if the device hides bad blocks for us, and
> if it maintains a logical<->physical mapping of eraseblocks. That kind
> of thing has been done by hard drives for years, and it's simple enough
> that they _ought_ to be able to get it right. Although the reliability
> testing on Toshiba LBA-NAND shows that it's obviously possible to screw
> it up too :)

Hmm, we have seen in the past how poorly implemented the logical <->
physical mappings have been and how naive the wear levelling approach
was in such devices. I'm not too confident that the manufacturers
actually have improved that in a significant way.
 
> It occurs to me that the interface that UBI presents to the higher
> layers is actually a fairly close fit to what we want from hardware.
>
> And where it _isn't_, there's something to be said for changing UBI to
> match what we want (with notifications for 'block recycle needed', for
> reading blocks without actually returning the data, just to check if the
> ECC shows any bitflips, etc.).

Which makes me wonder whether we could have access to the CPU which
runs inside of SSDs and just implement UBI on it. :) 

Open Source Firmware for SSDs would allow us to optimize the
filesystem <-> storage device interaction even better.

Thanks,

	tglx


* Re: Should flash hardware look like UBI?
  2009-10-06 11:21 ` Thomas Gleixner
@ 2009-10-06 12:05   ` David Woodhouse
  2009-10-06 12:56     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2009-10-06 12:05 UTC
  To: Thomas Gleixner; +Cc: linux-mtd

On Tue, 2009-10-06 at 13:21 +0200, Thomas Gleixner wrote:
> David,
> 
> On Mon, 5 Oct 2009, David Woodhouse wrote:
> 
> > I just posted http://www.advogato.org/person/dwmw2/diary/211.html which
> > laments the common view that future flash storage will just look like a
> > disk (SSD/eMMC/etc.). I don't think we want that, for the reasons given
> > there.
> 
> You forgot to mention that the pre SSD FLASH storage (USB sticks, CF
> Cards ...) has already established a proven track record of
> unreliability. I still have the recommendation of keeping the power
> supply stable for min. 400ms after the last access to the device which
> I got from a well known "industrial grade" manufacturer.

Right. I don't think anyone who's been paying _any_ attention to the
situation can fail to be aware of how unreliable the existing technology
is.

> I admit that certain SSD manufacturers seem to have slightly more
> clue, but that does not change the fact that we need to implement a
> filesystem on top of another filesystem.
> 
> > But I've also been thinking about exactly what we _do_ want. We
> > certainly want something a little more capable than the raw interface to
> > NAND flash that ONFI provides.
> > 
> > For a start, we want ECC to be built-in. We don't want to have to do
> > correction in software, and we want a 'copy page' command that's
> > actually useful -- we want it to apply ECC corrections rather than
> > copying any flipped bits faithfully. Although we probably do want the
> > _option_ to read the raw data still, without attempting to correct it.
> 
> Ack.
>  
> > We _also_ want the capability to do GC and block replacement for
> > ourselves, taking the opportunity to optimise the file system layout as
> > we do so. So we want to be _told_ about how many errors were fixed when
> > we read a block, and we also probably want to be told if we have to
> > recycle an eraseblock for other reasons (write disturb, etc.)
> > 
> > However, it's acceptable for the device to automatically move data for
> > us, if the OS lacks that capability or doesn't want to bother.
> 
> Hmm, that's going to be a nightmare. Either let the OS sort it out or
> let the Firmware deal with it.

I'd definitely prefer to let the OS do it -- as long as there's a decent
ECC-safe copy command where the data never leave the device, of course.

I was just anticipating that some people won't _want_ to push that
requirement up to the OS, so I was prepared to concede and allow them
the _option_ of letting the device do it, as long as the OS could do it
for itself when it wants to.

But you're probably right. They'll only screw it up, and it's not
exactly hard for the OS to do it. So I'll retract my reluctant "it's
acceptable for the device to..." above.

> > We don't _necessarily_ mind if the device hides bad blocks for us, and
> > if it maintains a logical<->physical mapping of eraseblocks. That kind
> > of thing has been done by hard drives for years, and it's simple enough
> > that they _ought_ to be able to get it right. Although the reliability
> > testing on Toshiba LBA-NAND shows that it's obviously possible to screw
> > it up too :)
> 
> Hmm, we have seen in the past how poorly implemented the logical <->
> physical mappings have been and how naive the wear levelling approach
> was in such devices. I'm not too confident that the manufacturers
> actually have improved that in a significant way.

True, but if we keep it _simple_ then maybe they're less likely to screw
it up so comprehensively. I think that practically speaking, we're going
to have to let them do _something_. We can't just revoke their licence
to write code, much as we'd like to.

> > It occurs to me that the interface that UBI presents to the higher
> > layers is actually a fairly close fit to what we want from hardware.
> >
> > And where it _isn't_, there's something to be said for changing UBI to
> > match what we want (with notifications for 'block recycle needed', for
> > reading blocks without actually returning the data, just to check if the
> > ECC shows any bitflips, etc.).
> 
> Which makes me wonder whether we could have access to the CPU which
> runs inside of SSDs and just implement UBI on it. :) 
>
> Open Source Firmware for SSDs would allow us to optimize the
> filesystem <-> storage device interaction even better.

Yeah, but they have a fit if you even hint at that. It's stupid, but
they think there's something special in the code that we're all swearing
about and trying to get _out_ of the way.

But getting _them_ to implement something UBI-like might be possible.
It's much simpler than pretending to be a disk, so it's more likely that
they'll be able to get it right. And it gives us the capability to do
decent file systems which don't share the disadvantages of the SSD
model.

-- 
dwmw2


* Re: Should flash hardware look like UBI?
  2009-10-06 12:05   ` David Woodhouse
@ 2009-10-06 12:56     ` Thomas Gleixner
  2009-10-06 13:07       ` David Woodhouse
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2009-10-06 12:56 UTC
  To: David Woodhouse; +Cc: linux-mtd

David,

On Tue, 6 Oct 2009, David Woodhouse wrote:
> On Tue, 2009-10-06 at 13:21 +0200, Thomas Gleixner wrote:
> > On Mon, 5 Oct 2009, David Woodhouse wrote:
> > >  
> > > We don't _necessarily_ mind if the device hides bad blocks for us, and
> > > if it maintains a logical<->physical mapping of eraseblocks. That kind
> > > of thing has been done by hard drives for years, and it's simple enough
> > > that they _ought_ to be able to get it right. Although the reliability
> > > testing on Toshiba LBA-NAND shows that it's obviously possible to screw
> > > it up too :)
> > 
> > Hmm, we have seen in the past how poorly implemented the logical <->
> > physical mappings have been and how naive the wear levelling approach
> > was in such devices. I'm not too confident that the manufacturers
> > actually have improved that in a significant way.
> 
> True, but if we keep it _simple_ then maybe they're less likely to screw
> it up so comprehensively. I think that practically speaking, we're going
> to have to let them do _something_. We can't just revoke their licence
> to write code, much as we'd like to.

I guess the bad block hiding might work, but I'm concerned about the
wear levelling aspect. That's what they usually fail to do in a
sensible way. When the resulting carefully hidden and obscured
algorithm is just contrary to what the filesystem expects you are
again digging holes in your device within no time.
 
> > > It occurs to me that the interface that UBI presents to the higher
> > > layers is actually a fairly close fit to what we want from hardware.
> > >
> > > And where it _isn't_, there's something to be said for changing UBI to
> > > match what we want (with notifications for 'block recycle needed', for
> > > reading blocks without actually returning the data, just to check if the
> > > ECC shows any bitflips, etc.).
> > 
> > Which makes me wonder whether we could have access to the CPU which
> > runs inside of SSDs and just implement UBI on it. :) 
> >
> > Open Source Firmware for SSDs would allow us to optimize the
> > filesystem <-> storage device interaction even better.
> 
> Yeah, but they have a fit if you even hint at that. It's stupid, but
> they think there's something special in the code that we're all swearing
> about and trying to get _out_ of the way.

Gah, they should concentrate on implementing useful and functional
hardware. I'm not convinced that companies who fail to implement a
working timer for more than 10 years might write code which is not a
complete disaster.

> But getting _them_ to implement something UBI-like might be possible.
> It's much simpler than pretending to be a disk, so it's more likely that
> they'll be able to get it right. And it gives us the capability to do
> decent file systems which don't share the disadvantages of the SSD
> model.

From your mouth to God's ear!

     tglx


* Re: Should flash hardware look like UBI?
  2009-10-06 12:56     ` Thomas Gleixner
@ 2009-10-06 13:07       ` David Woodhouse
  0 siblings, 0 replies; 6+ messages in thread
From: David Woodhouse @ 2009-10-06 13:07 UTC
  To: Thomas Gleixner; +Cc: linux-mtd

On Tue, 2009-10-06 at 14:56 +0200, Thomas Gleixner wrote:
> 
> I guess the bad block hiding might work, but I'm concerned about the
> wear levelling aspect. That's what they usually fail to do in a
> sensible way. When the resulting carefully hidden and obscured
> algorithm is just contrary to what the filesystem expects you are
> again digging holes in your device within no time.

Yeah. It makes a lot of sense for the wear levelling to be done
explicitly by the OS, not in the device.
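For illustration, OS-side wear levelling in its simplest form might look like
the C sketch below: allocate the least-erased free block, and watch the spread
between the most and least worn blocks to decide when static data should be
moved. The structure and function names are hypothetical, and the policy is a
deliberately naive assumption, not how UBI actually implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_BLOCKS 6
#define NO_BLOCK   (-1)

/* Per-eraseblock bookkeeping kept by the OS (hypothetical layout). */
struct eraseblock {
    uint32_t erase_count;  /* how often this block has been erased */
    bool     in_use;       /* currently holds live data */
};

/* Pick the free block with the lowest erase count. Ties go to the
 * lowest index. Returns NO_BLOCK if every block is in use. */
static int pick_free_block(const struct eraseblock blk[NUM_BLOCKS])
{
    int best = NO_BLOCK;

    for (int i = 0; i < NUM_BLOCKS; i++) {
        if (blk[i].in_use)
            continue;
        if (best == NO_BLOCK ||
            blk[i].erase_count < blk[best].erase_count)
            best = i;
    }
    return best;
}

/* Erase a block: bump its counter and mark it free again. */
static void erase_block(struct eraseblock blk[NUM_BLOCKS], int i)
{
    assert(i >= 0 && i < NUM_BLOCKS);
    blk[i].erase_count++;
    blk[i].in_use = false;
}

/* True when the gap between the most and least worn block exceeds
 * 'limit', i.e. when rarely rewritten data should be relocated. */
static bool wear_spread_exceeds(const struct eraseblock blk[NUM_BLOCKS],
                                uint32_t limit)
{
    uint32_t lo = blk[0].erase_count, hi = blk[0].erase_count;

    for (int i = 1; i < NUM_BLOCKS; i++) {
        if (blk[i].erase_count < lo)
            lo = blk[i].erase_count;
        if (blk[i].erase_count > hi)
            hi = blk[i].erase_count;
    }
    return hi - lo > limit;
}
```

The point of keeping this in the OS is that the file system knows which data
is static and which is hot, so it can choose what to move and when.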

So maybe we don't want the interface to be _quite_ like UBI.

But if we can let the device handle _some_ of the aspects of the
logical<->physical translation, that would be interesting. Partly
because it could do the full scan of the device at startup _internally_,
and it wouldn't slow down the OS.

I'd be happy with limiting it to _just_ what hard drives do: ECC and a
1:1 remapping of blocks. Basically 'hardware acceleration for UBI'.

-- 
dwmw2


* Re: Should flash hardware look like UBI?
  2009-10-05 14:05 Should flash hardware look like UBI? David Woodhouse
  2009-10-06 11:21 ` Thomas Gleixner
@ 2009-10-06 14:40 ` Bill Gatliff
  1 sibling, 0 replies; 6+ messages in thread
From: Bill Gatliff @ 2009-10-06 14:40 UTC
  To: David Woodhouse; +Cc: linux-mtd

David Woodhouse wrote:
> But I've also been thinking about exactly what we _do_ want. We
> certainly want something a little more capable than the raw interface to
> NAND flash that ONFI provides.
>   

Actually, I prefer that we make the hardware as dumb as possible.  I'm 
ok with the chip timing its own flash erase and programming cycles, but 
everything else I want to keep up in our software.  That way we can 
adapt the functionality to whatever level of control that we want.

ECC can be difficult to get right, and the specific algorithm chosen can 
be application- and environment-dependent.  If it's baked into the chip 
then the algorithm decision is already made, and any bugs in the 
implementation are impossible to correct.  Also, as other posters have 
mentioned regarding SD/MMC firmware, any firmware running in a NAND 
flash will require thorough documentation so that users of the chip can 
accommodate its needs for power management, etc.

And on top of that, one really needs to know if the bits that they are 
reading back have required ECC to recover.  If a region of flash is 
failing, a sudden increase in the amount of error correction needed will 
tell you that some evasive action is necessary.  By its very nature, ECC 
can never be completely transparent to the user.
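One way to picture the "sudden increase" check is a running average of
corrected bits per region, with an alarm when a new reading jumps well above
it. The C sketch below is only an assumption about how software might do this:
the names, the 1/8 smoothing weight, the warm-up length and the "twice the
average plus two bits" rule are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIN_SAMPLES 4  /* readings needed before the alarm is armed */

/* Per-region history of how much correction reads have needed. */
struct ecc_trend {
    unsigned avg_x8;   /* running average of corrected bits, times 8 */
    unsigned samples;  /* number of readings seen so far */
};

/* Feed in the corrected-bit count of one read. Returns true when the
 * reading exceeds twice the running average plus two bits of slack,
 * which this sketch treats as a sudden increase. */
static bool ecc_trend_update(struct ecc_trend *t, unsigned corrected)
{
    assert(t != NULL);

    bool sudden = t->samples >= MIN_SAMPLES &&
                  corrected * 8 > 2 * t->avg_x8 + 2 * 8;

    /* Exponentially weighted moving average with weight 1/8:
     * avg = avg * 7/8 + corrected / 8, kept scaled by 8 so that
     * integer arithmetic does not lose the fraction. */
    if (t->samples == 0)
        t->avg_x8 = corrected * 8;
    else
        t->avg_x8 = t->avg_x8 - t->avg_x8 / 8 + corrected;

    t->samples++;
    return sudden;
}
```

This only works if the corrected-bit count is visible to software at all,
which is the sense in which ECC cannot be completely transparent.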

I don't see any substantial benefit to a chip having an inbuilt, 
hardware "memcpy" to allow one to move data around without reading it 
out.  That just sounds like more undocumented stuff to go wrong.  :)  
And more design decisions that have been made without the user's consent.


Just my US$0.02.


b.g.

-- 
Bill Gatliff
bgat@billgatliff.com

