All of lore.kernel.org
 help / color / mirror / Atom feed
From: Estelle HAMMACHE <estelle.hammache@st.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Subject: Re: JFFS2 & NAND failure
Date: Thu, 18 Nov 2004 18:54:48 +0100	[thread overview]
Message-ID: <419CE1E8.F20DD890@st.com> (raw)
In-Reply-To: 1100795260.8191.7333.camel@hades.cambridge.redhat.com

David Woodhouse wrote:
> >  - during wbuf flushing, if the previously written node filled
> >    wbuf exactly, "buf" may be used instead of "wbuf"
> >    in jffs2_wbuf_recover (access to null pointer)
> 
> Hmmm. But now with your change we can end up with a completely full wbuf
> which we don't actually flush; we leave it in memory. We should write
> that immediately, surely?

Not necessarily I think. If we write it immediately and the write fails,
we have a fatal error. But if we leave it be, the next call to 
jffs2_flash_writev will flush the buffer and we get 1 more try. 
Unless there is something I don't understand, there is no harm in 
leaving the wbuf full.

> >  - there is no refiling of nextblock if a write error occurs
> >    during writev_ecc (direct write, not using wbuf),
> >    so the next write may occur on the failed block
> 
> OK. Priorities on printk though please.

OK. (was actually a copy/paste so I'll check the wbuf recover too
and maybe make a function for the refiling).

> >  - if a write error occurs, but part of the data was written,
> >    an obsolete raw node ref is added but nextblock has changed.
> 
> Where? You mean in jffs2_wbuf_recover()? That's an obsolete raw node ref
> in the _new_ nextblock, surely?

No, it is the "Mark the space as dirtied" case in jffs2_write_dirent and
jffs2_write_dnode. I think this happens only if the write error occurs
on mtd->writev_ecc and part of the data was successfully written by
jffs2_flush_wbuf or writev_ecc previously so jffs2_flash_writev says 
some data was written. In this case, when jffs2_write_dirent/dnode
adds this obsolete raw node ref for the dirty space, nextblock was 
modified during the refiling and / or recovery.


> > Additionally I have a question about the retry cases in
> > jffs2_write_dnode and jffs2_write_dirent.
> > If the write error occurs during an API call (not GC),
> > jffs2_reserve_space is called again to find space to rewrite
> > the node. However since we refiled the bad block, the free space
> > was reduced. Is it not possible then that jffs2_reserve_space
> > will need to gc to find some space ?
> 
> It is possible, yes.
> 
> > In the case of a deletion dirent, the previous dirent may be
> > GCed so its version number is increased. When the deletion
> > dirent is finally written in the retry, its version number
> > is older than the GCed node, so it is (quietly) obsoleted.
> 
> Hmm, true. We should check f->highest_version after we reallocate space,
> and update ri->version or rd->version accordingly.

This was my first idea too but I found it ugly to tamper with
structures which are clearly the responsibility of the caller.
However this is a recovery case so... maybe it is necessary.

> > In the case of a dnode, I think there may be a deadlock
> > if a dnode belonging to the same file is GCed, since the
> > GC will attempt to lock f->sem.
> 
> No, because we unlock f->sem before calling jffs2_reserve_space.

Right, I got confused.

> > I actually saw the dirent case in my tests but since I wrote
> > some nodes manually I am not sure this is a valid problem in
> > ordinary JFFS2 processing ?
> 
> I think you're right -- the version number is indeed a real problem.
> 
> > Could the solution be to call jffs2_reserve_space_gc instead
> > of jffs2_reserve_space ?
> 
> No, because if the write fails and then we're short of space, we want to
> return -ENOSPC. We don't want to keep eating into the space qwhich is
> reserved for GC.

If we use jffs2_reserve_space_gc both in jffs2_wbuf_recover and the
dnode/dirent retry cases I believe we will write at most 2*4KB = 8KB 
this way ? Then further API calls will do the GC. Is this unacceptable ?

> Thanks for the report and the patches. Would you like to send me a SSH
> public key so that I can give you an account and you can commit
> directly?

We have a firewall here so I don't think CVS will work. I will ask.
Like many people I rely on snapshots. BTW if JFFS2 and JFFS3 are 
maintained in parallel do we get 2 sets of snapshots ? or is there
another way to get both versions ?
Anyway I'll send a corrected patch to the list when decision is made
about the node version problem - I guess the version check/update ?

Bye
Estelle

  reply	other threads:[~2004-11-18 17:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-17 17:15 JFFS2 & NAND failure Estelle HAMMACHE
2004-11-18 16:27 ` David Woodhouse
2004-11-18 17:54   ` Estelle HAMMACHE [this message]
2004-11-19 13:17     ` David Woodhouse
2004-11-19 16:22       ` Estelle HAMMACHE
2004-11-20 18:57         ` David Woodhouse
2004-11-20 19:19         ` David Woodhouse
2004-11-20 22:13           ` David Woodhouse
2004-12-09 14:57             ` David Woodhouse
2004-12-09 17:06               ` Estelle HAMMACHE
2004-12-15 12:33                 ` Estelle HAMMACHE
2005-02-02 16:21                   ` Estelle HAMMACHE
2005-04-04 12:58                   ` Artem B. Bityuckiy
2005-04-04 13:58                     ` Estelle HAMMACHE
2005-04-04 14:47                       ` Artem B. Bityuckiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=419CE1E8.F20DD890@st.com \
    --to=estelle.hammache@st.com \
    --cc=dwmw2@infradead.org \
    --cc=linux-mtd@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.