Re: mirroring in JFFS2

public inbox for linux-mtd@lists.infradead.org

* Re: mirroring in JFFS2
       [not found] <3DD0D966.E6D877EC@procsys.com>
@ 2002-11-12 16:08 ` Jörn Engel
  2002-11-12 16:18   ` David Woodhouse
  0 siblings, 1 reply; 6+ messages in thread
From: Jörn Engel @ 2002-11-12 16:08 UTC (permalink / raw)
  To: Miraj Mohamed; +Cc: jffs-dev, linux-mtd

On Tue, 12 November 2002 16:05:18 +0530, Miraj Mohamed wrote:
> 
>         Our system needs error detection and recovery of data on Flash.
> We selected JFFS2, which has a built-in CRC, and we are planning to modify
> the JFFS2 code for mirroring. This means each write and erase will be
> duplicated on the mirror partition. If a CRC error is detected while
> reading, the mirror data will be read instead.
> 
>         Can anyone say whether this implementation is OK (or even
> possible)? Has anyone implemented a similar system before? Is there any
> other way to attain redundancy?

Are you trying to put the mirroring stuff into jffs2?

In the hard disk world, people use md for this, which uses two devices
and returns one. The filesystem does not need to worry about anything.

Writing an mtd driver in that fashion should be pretty easy. Robert
Kaiser has done something similar already; have a look at the concat
layer.
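
The scheme under discussion (duplicate every write and erase, fall back to
the mirror when a CRC check fails) can be sketched in a few lines. This is a
toy in-memory model in Python, not MTD or JFFS2 code: `Flash`,
`MirroredFlash`, `pack_record`, `unpack_record`, and the record layout are
all assumptions made for illustration.

```python
import struct
import zlib


class Flash:
    """Toy flash partition: a dict of offset -> raw record bytes."""

    def __init__(self):
        self.cells = {}

    def write(self, offset, raw):
        self.cells[offset] = raw

    def read(self, offset):
        return self.cells[offset]

    def erase(self, offset):
        self.cells.pop(offset, None)


def pack_record(payload):
    # Hypothetical layout: 4-byte little-endian CRC32, then the payload.
    return struct.pack("<I", zlib.crc32(payload)) + payload


def unpack_record(raw):
    # Return the payload, or None when the stored CRC does not match.
    (crc,) = struct.unpack("<I", raw[:4])
    payload = raw[4:]
    return payload if zlib.crc32(payload) == crc else None


class MirroredFlash:
    """Presents two partitions as one: writes and erases go to both."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror

    def write(self, offset, payload):
        raw = pack_record(payload)
        self.primary.write(offset, raw)
        self.mirror.write(offset, raw)

    def erase(self, offset):
        self.primary.erase(offset)
        self.mirror.erase(offset)

    def read(self, offset):
        # Try the primary first; use the mirror only on a CRC mismatch.
        for dev in (self.primary, self.mirror):
            payload = unpack_record(dev.read(offset))
            if payload is not None:
                return payload
        raise OSError("CRC error on both copies at offset %d" % offset)
```

Note that this model keeps the same offset on both partitions, which is
exactly the assumption that breaks once one chip develops a bad block.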

Jörn

-- 
Data expands to fill the space available for storage.
-- Parkinson's Law

* Re: mirroring in JFFS2
  2002-11-12 16:08 ` mirroring in JFFS2 Jörn Engel
@ 2002-11-12 16:18   ` David Woodhouse
  2002-11-12 17:15     ` Jörn Engel
  2002-11-12 17:22     ` Alan Cox
  0 siblings, 2 replies; 6+ messages in thread
From: David Woodhouse @ 2002-11-12 16:18 UTC (permalink / raw)
  To: Jörn Engel; +Cc: Miraj Mohamed, jffs-dev, linux-mtd

joern@wohnheim.fh-wedel.de said:
>  Are you trying to put the mirroring stuff into jffs2?
> In the hard disk world, people use md for this, which uses two devices
> and returns one. The filesystem does not need to worry about anything.

RAID is done at the wrong layer. The file system knows stuff about the
contents of the media which a block device driver cannot possibly know. So
you end up having a RAID rebuild take ages to reconstruct parts of the disc
which the file system _knows_ are currently unused, etc. 

You can have journalled RAID to help alleviate this problem -- or you could 
just let the file system do it because that already has a journal anyway.

So, for example, you scribble it to your journal, then to both your mirrors,
and mark the journal transaction complete only when it's hit both discs. 
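
A minimal sketch of that ordering, assuming a toy in-memory model: the
journal records the intent, both mirrors are written, and the transaction is
retired only after the second write. `JournaledMirror` and its
`crash_before` test hook are hypothetical names, not part of any real file
system.

```python
class JournaledMirror:
    """Toy model: a journal of pending writes in front of two mirrors."""

    def __init__(self):
        self.journal = {}        # txid -> (key, data) for unfinished writes
        self.mirrors = [{}, {}]  # two "discs", each a key -> data map
        self.next_txid = 0

    def write(self, key, data, crash_before=None):
        txid = self.next_txid
        self.next_txid += 1
        self.journal[txid] = (key, data)          # 1. scribble to the journal
        for i, disc in enumerate(self.mirrors):   # 2. write both mirrors
            if crash_before == i:
                raise RuntimeError("simulated power loss")
            disc[key] = data
        del self.journal[txid]                    # 3. done: it hit both discs

    def recover(self):
        """Replay unfinished transactions; return how many were replayed."""
        pending = sorted(self.journal)
        for txid in pending:
            key, data = self.journal.pop(txid)
            for disc in self.mirrors:
                disc[key] = data
        return len(pending)
```

After a crash, `recover()` only touches the keys named in unfinished
transactions, so resynchronising the mirrors costs time proportional to the
pending journal rather than to the size of the disc.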

Getting back to JFFS2, the same applies -- if you have a bad block in one 
of your flash chips, what do you do about it? Refrain from using the 
equivalent block in the other chip? Have some kind of block remapper 
underneath JFFS2, which keeps a whole lot of address information which is 
in fact entirely superfluous to the file system?

I think it does want to be done in the file system, or possibly even in a 
layer _above_ the individual file system, which duplicates writes to two or 
more underlying file systems of a mountpoint, and does whatever's deemed 
appropriate for reads. Doing it in the individual file system is probably 
easier, if less interesting :)

--
dwmw2

* Re: mirroring in JFFS2
  2002-11-12 16:18   ` David Woodhouse
@ 2002-11-12 17:15     ` Jörn Engel
  2002-11-12 17:22     ` Alan Cox
  1 sibling, 0 replies; 6+ messages in thread
From: Jörn Engel @ 2002-11-12 17:15 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Miraj Mohamed, jffs-dev, linux-mtd

On Tue, 12 November 2002 16:18:37 +0000, David Woodhouse wrote:
> 
> RAID is done at the wrong layer. The file system knows stuff about the
> contents of the media which a block device driver cannot possibly know. So
> you end up having a RAID rebuild take ages to reconstruct parts of the disc
> which the file system _knows_ are currently unused, etc. 

This is an implementation problem; the RAID driver could just as well
reconstruct on the fly and give pending requests priority. No need to
duplicate the code in all the filesystems.

> Getting back to JFFS2, the same applies -- if you have a bad block in one 
> of your flash chips, what do you do about it? Refrain from using the 
> equivalent block in the other chip? Have some kind of block remapper 
> underneath JFFS2, which keeps a whole lot of address information which is 
> in fact entirely superfluous to the file system?

The bad block point does make sense. Hard disks usually work
completely or fail completely. Point taken.

> I think it does want to be done in the file system, or possibly even in a 
> layer _above_ the individual file system, which duplicates writes to two or 
> more underlying file systems of a mountpoint, and does whatever's deemed 
> appropriate for reads. Doing it in the individual file system is probably 
> easier, if less interesting :)

RAID over filesystems would be fun, for sure. But in this case, you
have me convinced: jffs2 is the best place to put it.

Jörn

-- 
Money does not make you happy.
Happiness does not fill your stomach.

* Re: mirroring in JFFS2
  2002-11-12 16:18   ` David Woodhouse
  2002-11-12 17:15     ` Jörn Engel
@ 2002-11-12 17:22     ` Alan Cox
  2002-11-12 17:07       ` David Woodhouse
  1 sibling, 1 reply; 6+ messages in thread
From: Alan Cox @ 2002-11-12 17:22 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Jörn Engel, Miraj Mohamed, jffs-dev, linux-mtd

On Tue, 2002-11-12 at 16:18, David Woodhouse wrote:
> RAID is done at the wrong layer. The file system knows stuff about the
> contents of the media which a block device driver cannot possibly know. So

And vice versa. It's not quite as simple as it looks. For flash I think
you are right.

> you end up having a RAID rebuild take ages to reconstruct parts of the disc
> which the file system _knows_ are currently unused, etc. 
> 
> You can have journalled RAID to help alleviate this problem -- or you could 
> just let the file system do it because that already has a journal anyway.

The fs doesn't know enough about the block I/O layer to do that.

> I think it does want to be done in the file system, or possibly even in a 
> layer _above_ the individual file system, which duplicates writes to two or 
> more underlying file systems of a mountpoint, and does whatever's deemed 
> appropriate for reads. Doing it in the individual file system is probably 
> easier, if less interesting :)

A dupfs layer is probably quite doable too yes. However for JFFS2 surely
all you actually want to do is write each log entry including its ID
number to both journals. When you hit a bad block you can play back that
bit of the journal from the other flash and then mark it bad. The only
fun case is working out the size of your journal, since it's effectively
the smaller of two journals that can shrink online.
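
A rough sketch of that idea, assuming a toy model in which each flash is a
list of fixed-size blocks holding `(entry_id, data)` pairs and an in-memory
index remembers where each ID lives. `LogFlash`, `mirrored_append`,
`retire_bad_block`, and `journal_blocks` are hypothetical names. The toy
allocator needs the block excluded before the replay, so it marks the block
bad first and replays second.

```python
class LogFlash:
    """Toy flash: fixed-size blocks holding (entry_id, data) log entries."""

    def __init__(self, nblocks, entries_per_block):
        self.blocks = [[] for _ in range(nblocks)]
        self.entries_per_block = entries_per_block
        self.bad = set()
        self.where = {}   # in-memory index: entry_id -> block number

    def usable_blocks(self):
        return len(self.blocks) - len(self.bad)

    def append(self, entry_id, data):
        # First-fit allocation into any good block with room left.
        for no, block in enumerate(self.blocks):
            if no not in self.bad and len(block) < self.entries_per_block:
                block.append((entry_id, data))
                self.where[entry_id] = no
                return no
        raise OSError("no space left on this flash")

    def fetch(self, entry_id):
        for eid, data in self.blocks[self.where[entry_id]]:
            if eid == entry_id:
                return data
        raise KeyError(entry_id)


def mirrored_append(flash_a, flash_b, entry_id, data):
    # Same entry, same ID, on both flashes; block placement may differ.
    flash_a.append(entry_id, data)
    flash_b.append(entry_id, data)


def retire_bad_block(damaged, healthy, block_no):
    """Replay the entries of a bad block from the other flash by ID."""
    lost = [eid for eid, no in damaged.where.items() if no == block_no]
    damaged.bad.add(block_no)       # keep the allocator away from it
    damaged.blocks[block_no] = []
    for eid in lost:
        damaged.append(eid, healthy.fetch(eid))
    return lost


def journal_blocks(flash_a, flash_b):
    # The usable journal is bounded by the smaller of the two flashes.
    return min(flash_a.usable_blocks(), flash_b.usable_blocks())
```

Because entries are matched by ID rather than by position, the two flashes
never need the same block layout, and `journal_blocks` shrinks whenever
either side retires a block.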

* Re: mirroring in JFFS2
  2002-11-12 17:22     ` Alan Cox
@ 2002-11-12 17:07       ` David Woodhouse
  2002-11-12 17:27         ` Jörn Engel
  0 siblings, 1 reply; 6+ messages in thread
From: David Woodhouse @ 2002-11-12 17:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jörn Engel, Miraj Mohamed, jffs-dev, linux-mtd

alan@lxorguk.ukuu.org.uk said:
>  The fs doesn't know enough about the block I/O layer to do that.

Certainly it doesn't when it's all hidden by RAID. It's feasible that it 
_could_ though. It looked like the nwfs code did something like this -- you 
told the file system explicitly about all the individual block devices it 
was supposed to be using. I never did investigate it much though.

To be honest, in a lot of cases I'd settle for a way for a file system to 
tell the block device 'this sector is now unused'. Not so much for RAID but 
for the "block-based file system on flash translation layer" case.
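
To illustrate why such a hint would help, here is a small sketch assuming a
hypothetical `discard()` call on a toy block device. `HintedDevice` and
`rebuild` are made-up names for illustration; this is not an existing
interface.

```python
class HintedDevice:
    """Toy block device that remembers which sectors are in use."""

    def __init__(self, nsectors):
        self.sectors = [b""] * nsectors
        self.in_use = set()

    def write(self, sector, data):
        self.sectors[sector] = data
        self.in_use.add(sector)

    def discard(self, sector):
        # Hypothetical hint from the file system: this sector is now unused.
        self.in_use.discard(sector)


def rebuild(source, target):
    """Copy only the sectors still marked in use; return how many."""
    for sector in sorted(source.in_use):
        target.write(sector, source.sectors[sector])
    return len(source.in_use)
```

With the hint, a rebuild (or a flash translation layer's garbage collector)
only has to move sectors the file system still cares about, instead of every
sector that was ever written.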

> A dupfs layer is probably quite doable too yes. 

There are a few interesting cases about what you do when you get write 
errors (or -ENOSPC) after your write already succeeded to the other device, 
but yeah -- it shouldn't be too horrible.

> However for JFFS2 surely all you actually want to do is write each log 
> entry including its ID number to both journals. When you hit a bad block 
> you can play back that bit of the journal from the other flash and then 
> mark it bad. 

Yep, that basically works.

> The only fun case is working out the size of your journal, since it's 
> effectively the smaller of two journals that can shrink online.

Well that bit is quite fun already :) 

But it's not too bad -- you GC on _both_ media till you have enough space
on them both for the write you want to do, then you allow the allocation 
call to return, do the write to both media and return. Some detail in 
sorting out the case where a page write crosses an eraseblock boundary and 
ends up split into two on one or both media, but that's not really too hard 
conceptually -- I suspect it'd make an ugly mess of the code though.
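
The allocation rule can be sketched as follows, assuming a toy model that
only counts bytes: garbage-collect each medium until it has room, and only
then write to both. `Medium`, `reserve`, `mirrored_write`, and the
`gc_chunk` size are assumptions for illustration, and the eraseblock
boundary case is left out entirely.

```python
import errno


class Medium:
    """Toy flash medium tracking live, obsolete (dirty) and free bytes."""

    def __init__(self, capacity, gc_chunk=32):
        self.capacity = capacity
        self.gc_chunk = gc_chunk   # bytes reclaimed per GC pass (assumption)
        self.live = {}             # node id -> size of the current version
        self.dirty = 0             # bytes held by obsolete nodes

    def free(self):
        return self.capacity - sum(self.live.values()) - self.dirty

    def gc_pass(self):
        """Reclaim one chunk of obsolete space; False if none is left."""
        if self.dirty == 0:
            return False
        self.dirty -= min(self.dirty, self.gc_chunk)
        return True


def reserve(media, size):
    """Return only once every medium has `size` bytes free."""
    for medium in media:
        while medium.free() < size:
            if not medium.gc_pass():
                raise OSError(errno.ENOSPC, "not enough space on every mirror")


def mirrored_write(media, node_id, size):
    reserve(media, size)           # the allocation call: GC on all media
    for medium in media:
        # The previous version of the node, if any, becomes obsolete.
        medium.dirty += medium.live.pop(node_id, 0)
        medium.live[node_id] = size
```

Reserving on every medium before writing to any of them means an ENOSPC
shows up before the first copy is written, which sidesteps the awkward case
of a write that succeeded on one device and failed on the other.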

--
dwmw2

* Re: mirroring in JFFS2
  2002-11-12 17:07       ` David Woodhouse
@ 2002-11-12 17:27         ` Jörn Engel
  0 siblings, 0 replies; 6+ messages in thread
From: Jörn Engel @ 2002-11-12 17:27 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Alan Cox, Miraj Mohamed, jffs-dev, linux-mtd

On Tue, 12 November 2002 17:07:22 +0000, David Woodhouse wrote:
> 
> But it's not too bad -- you GC on _both_ media till you have enough space
> on them both for the write you want to do, then you allow the allocation 
> call to return, do the write to both media and return. Some detail in 
> sorting out the case where a page write crosses an eraseblock boundary and 
> ends up split into two on one or both media, but that's not really too hard 
> conceptually -- I suspect it'd make an ugly mess of the code though.

I wonder if it would make sense to expand the erase marker a bit in
this case. For two devices, the erase marker on each device holds the
number of the corresponding block on the other. This would allow you
to prioritize bad block recovery, once you find one.

This is quite a special case, though. More devices and the procedure
is pointless. Different device types and the procedure is not
possible.
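
For the two-device case, the cross-referencing marker could look roughly
like this. It is a sketch under stated assumptions: each device is modelled
as a dict of block number to `Block`, and `Block`, `pair`, and
`recover_block` are hypothetical names, not JFFS2 structures.

```python
from dataclasses import dataclass


@dataclass
class Block:
    peer: int          # erase marker field: matching block on the other device
    data: bytes = b""
    bad: bool = False


def pair(dev_a, a_no, dev_b, b_no, data):
    """Write the same data to one block on each device, cross-linked."""
    dev_a[a_no] = Block(peer=b_no, data=data)
    dev_b[b_no] = Block(peer=a_no, data=data)


def recover_block(damaged, healthy, bad_no, spare_no):
    """Rebuild a bad block into a spare, using the marker to find its peer."""
    peer_no = damaged[bad_no].peer
    damaged[bad_no].bad = True
    damaged[spare_no] = Block(peer=peer_no, data=healthy[peer_no].data)
    healthy[peer_no].peer = spare_no   # keep the cross-reference consistent
    return peer_no
```

The marker turns "which block holds the other copy?" into a single lookup,
so recovery of a known-bad block can start immediately instead of waiting
for a scan of the other device.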

Jörn

-- 
"Translations are and will always be problematic. They inflict violence 
upon two languages." (translation from German)
