Re: JFFS2 & NAND failure - Estelle HAMMACHE

public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed

From: Estelle HAMMACHE <estelle.hammache@st.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Subject: Re: JFFS2 & NAND failure
Date: Thu, 18 Nov 2004 18:54:48 +0100	[thread overview]
Message-ID: <419CE1E8.F20DD890@st.com> (raw)
In-Reply-To: 1100795260.8191.7333.camel@hades.cambridge.redhat.com

David Woodhouse wrote:
> >  - during wbuf flushing, if the previously written node filled
> >    wbuf exactly, "buf" may be used instead of "wbuf"
> >    in jffs2_wbuf_recover (access to null pointer)
> 
> Hmmm. But now with your change we can end up with a completely full wbuf
> which we don't actually flush; we leave it in memory. We should write
> that immediately, surely?

Not necessarily I think. If we write it immediately and the write fails,
we have a fatal error. But if we leave it be, the next call to 
jffs2_flash_writev will flush the buffer and we get 1 more try. 
Unless there is something I don't understand, there is no harm in 
leaving the wbuf full.

> >  - there is no refiling of nextblock if a write error occurs
> >    during writev_ecc (direct write, not using wbuf),
> >    so the next write may occur on the failed block
> 
> OK. Priorities on printk though please.

OK. (was actually a copy/paste so I'll check the wbuf recover too
and maybe make a function for the refiling).

> >  - if a write error occurs, but part of the data was written,
> >    an obsolete raw node ref is added but nextblock has changed.
> 
> Where? You mean in jffs2_wbuf_recover()? That's an obsolete raw node ref
> in the _new_ nextblock, surely?

No, it is the "Mark the space as dirtied" case in jffs2_write_dirent and
jffs2_write_dnode. I think this happens only if the write error occurs
on mtd->writev_ecc and part of the data was successfully written by
jffs2_flush_wbuf or writev_ecc previously so jffs2_flash_writev says 
some data was written. In this case, when jffs2_write_dirent/dnode
adds this obsolete raw node ref for the dirty space, nextblock was 
modified during the refiling and / or recovery.

> > Additionally I have a question about the retry cases in
> > jffs2_write_dnode and jffs2_write_dirent.
> > If the write error occurs during an API call (not GC),
> > jffs2_reserve_space is called again to find space to rewrite
> > the node. However since we refiled the bad block, the free space
> > was reduced. Is it not possible then that jffs2_reserve_space
> > will need to gc to find some space ?
> 
> It is possible, yes.
> 
> > In the case of a deletion dirent, the previous dirent may be
> > GCed so its version number is increased. When the deletion
> > dirent is finally written in the retry, its version number
> > is older than the GCed node, so it is (quietly) obsoleted.
> 
> Hmm, true. We should check f->highest_version after we reallocate space,
> and update ri->version or rd->version accordingly.

This was my first idea too but I found it ugly to tamper with
structures which are clearly the responsibility of the caller.
However this is a recovery case so... maybe it is necessary.

> > In the case of a dnode, I think there may be a deadlock
> > if a dnode belonging to the same file is GCed, since the
> > GC will attempt to lock f->sem.
> 
> No, because we unlock f->sem before calling jffs2_reserve_space.

Right, I got confused.

> > I actually saw the dirent case in my tests but since I wrote
> > some nodes manually I am not sure this is a valid problem in
> > ordinary JFFS2 processing ?
> 
> I think you're right -- the version number is indeed a real problem.
> 
> > Could the solution be to call jffs2_reserve_space_gc instead
> > of jffs2_reserve_space ?
> 
> No, because if the write fails and then we're short of space, we want to
> return -ENOSPC. We don't want to keep eating into the space qwhich is
> reserved for GC.

If we use jffs2_reserve_space_gc both in jffs2_wbuf_recover and the
dnode/dirent retry cases I believe we will write at most 2*4KB = 8KB 
this way ? Then further API calls will do the GC. Is this unacceptable ?

> Thanks for the report and the patches. Would you like to send me a SSH
> public key so that I can give you an account and you can commit
> directly?

We have a firewall here so I don't think CVS will work. I will ask.
Like many people I rely on snapshots. BTW if JFFS2 and JFFS3 are 
maintained in parallel do we get 2 sets of snapshots ? or is there
another way to get both versions ?
Anyway I'll send a corrected patch to the list when decision is made
about the node version problem - I guess the version check/update ?

Bye
Estelle

next prev parent reply	other threads:[~2004-11-18 17:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-17 17:15 JFFS2 & NAND failure Estelle HAMMACHE
2004-11-18 16:27 ` David Woodhouse
2004-11-18 17:54   ` Estelle HAMMACHE [this message]
2004-11-19 13:17     ` David Woodhouse
2004-11-19 16:22       ` Estelle HAMMACHE
2004-11-20 18:57         ` David Woodhouse
2004-11-20 19:19         ` David Woodhouse
2004-11-20 22:13           ` David Woodhouse
2004-12-09 14:57             ` David Woodhouse
2004-12-09 17:06               ` Estelle HAMMACHE
2004-12-15 12:33                 ` Estelle HAMMACHE
2005-02-02 16:21                   ` Estelle HAMMACHE
2005-04-04 12:58                   ` Artem B. Bityuckiy
2005-04-04 13:58                     ` Estelle HAMMACHE
2005-04-04 14:47                       ` Artem B. Bityuckiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=419CE1E8.F20DD890@st.com \
    --to=estelle.hammache@st.com \
    --cc=dwmw2@infradead.org \
    --cc=linux-mtd@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox