From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from beta.dmz-eu.st.com ([164.129.1.35]) by canuck.infradead.org
	with esmtps (Exim 4.42 #1 (Red Hat Linux)) id 1CVBXM-0002si-S3
	for linux-mtd@lists.infradead.org; Fri, 19 Nov 2004 11:23:07 -0500
Sender: Estelle HAMMACHE
Message-ID: <419E1DE0.FDD5DEC0@st.com>
Date: Fri, 19 Nov 2004 17:22:56 +0100
From: Estelle HAMMACHE
MIME-Version: 1.0
To: David Woodhouse
References: <419B8715.4036BDBB@st.com> <1100795260.8191.7333.camel@hades.cambridge.redhat.com>
	<419CE1E8.F20DD890@st.com> <1100870238.8191.7368.camel@hades.cambridge.redhat.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: linux-mtd@lists.infradead.org
Subject: Re: JFFS2 & NAND failure
List-Id: Linux MTD discussion mailing list

David Woodhouse wrote:
> 
> On Thu, 2004-11-18 at 18:54 +0100, Estelle HAMMACHE wrote:
> > David Woodhouse wrote:
> > > > - during wbuf flushing, if the previously written node filled
> > > > wbuf exactly, "buf" may be used instead of "wbuf"
> > > > in jffs2_wbuf_recover (access to null pointer)
> > >
> > > Hmmm. But now with your change we can end up with a completely full wbuf
> > > which we don't actually flush; we leave it in memory. We should write
> > > that immediately, surely?
> >
> > Not necessarily I think. If we write it immediately and the write fails,
> > we have a fatal error. But if we leave it be, the next call to
> > jffs2_flash_writev will flush the buffer and we get 1 more try.
> > Unless there is something I don't understand, there is no harm in
> > leaving the wbuf full.
> 
> Generally we should flush the wbuf as soon as we can. And if there's
> going to be an error when we retry, surely we want that to happen
> _immediately_ so that the _current_ write() call returns an error,
> rather than leaving it in the wbuf and then losing it later?
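[Editorial note: the retry trade-off quoted above can be sketched with a tiny model. This is a hypothetical illustration, not the real JFFS2 code; `flash_write`, `flush_immediately` and `flush_lazily` are invented names standing in for the behaviour of the real wbuf flush and jffs2_wbuf_recover paths.]

```c
/* Hypothetical model of the two policies under discussion:
 * (a) write a just-filled wbuf immediately -- one failure is fatal;
 * (b) leave it full, so the node still gets a normal flush attempt
 *     plus a recover attempt on the next jffs2_flash_writev() call. */

/* A fake flash that fails the first `failures` writes, then succeeds. */
struct flash { int failures; int attempts; };

int flash_write(struct flash *f)
{
	f->attempts++;
	if (f->failures > 0) {
		f->failures--;
		return -1;	/* think -EIO */
	}
	return 0;
}

/* Policy (a): one attempt, error propagates straight to the caller. */
int flush_immediately(struct flash *f)
{
	return flash_write(f);
}

/* Policy (b): the next writev flushes the full wbuf (attempt 1) and,
 * if that fails, the recover path rewrites it to a new block
 * (attempt 2) before any data is lost. */
int flush_lazily(struct flash *f)
{
	if (flash_write(f) == 0)
		return 0;
	return flash_write(f);
}
```

With a single transient write failure, policy (a) loses the node while policy (b) recovers it on the second attempt.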
Whether jffs2_wbuf_recover succeeds or not, __jffs2_flush_wbuf always
returns an error, so jffs2_flash_writev returns immediately and the
caller knows it needs to reallocate space to try again. Then, since
wbuf is full, it will be flushed before the new (retry) node can be
written. The node which was in wbuf was outstanding anyway; it is not
part of the data from the current call to jffs2_flash_writev.

What I meant is that if we write wbuf in jffs2_wbuf_recover, the way we
write buf, a single write error means we never write the node, and we
lose data. Whereas if we leave the full wbuf alone, the node still gets
2 regular write attempts (flush and recover) before any data is lost.
I can see no difference between this "full wbuf" case and the usual
jffs2_wbuf_recover case where wbuf is left partly filled. I feel that
writing wbuf immediately adds code without necessity (or maybe I'm just
too lazy to test it).

On the other hand, maybe jffs2_wbuf_recover should be tweaked to
withstand a failure to write buf? In that case, writing a filled-up
wbuf immediately too would be logical. How many blocks can we afford to
refile to the bad block list in a single call to jffs2_flash_writev?

> > > > - if a write error occurs, but part of the data was written,
> > > > an obsolete raw node ref is added but nextblock has changed.
> > >
> > > Where? You mean in jffs2_wbuf_recover()? That's an obsolete raw node ref
> > > in the _new_ nextblock, surely?
> >
> > No, it is the "Mark the space as dirtied" case in jffs2_write_dirent and
> > jffs2_write_dnode. I think this happens only if the write error occurs
> > on mtd->writev_ecc and part of the data was successfully written by
> > jffs2_flush_wbuf or writev_ecc previously, so jffs2_flash_writev says
> > some data was written. In this case, when jffs2_write_dirent/dnode
> > adds this obsolete raw node ref for the dirty space, nextblock was
> > modified during the refiling and/or recovery.
> 
> OK...
> you mean the case where there was already a node in the wbuf and
> the wbuf flush failed, so we rewrote that node to a new block, along
> with the start of the new node? Then the write of the _rest_ of the new
> node failed, and we return with 'retlen' set to the amount of the new
> node that fitted into the wbuf and was actually written?

I should have mentioned that the problem in jffs2_add_physical_node_ref
cannot happen with the original code - it only happens with the
modification of jffs2_flash_writev to refile nextblock on
mtd->writev_ecc failure. With the original code, nextblock is not
refiled, so there is no problem in jffs2_add_physical_node_ref, but the
retry occurs on the same page which already failed in mtd->writev_ecc.

Here is an example of the obsolete node / refiled nextblock problem:
- page size 512 bytes
- start condition: wbuf contains 500 bytes

jffs2_write_dnode call, total node size 700 bytes
-->jffs2_flash_writev
------>fill up wbuf
------>donelen = 12
------>jffs2_flush_wbuf (succeeds)
------>c->mtd->writev_ecc to write 1 page - fails
------>nextblock is refiled (new code!)
------>*retlen = donelen (= 12)
-->add obsolete frag for dirtied space
------>free raw node & return early because c->nextblock == 0
-->probable segfault on (redundant?) jffs2_mark_node_obsolete
-->retry

Maybe I ought to test nextblock == 0 in addition to the obsolete flag,
to make the exclusion case more precise in jffs2_add_physical_node_ref?
Or is the "dirty" node really necessary? What happens if we don't list
it in the raw node refs?

> > > Hmm, true. We should check f->highest_version after we reallocate space,
> > > and update ri->version or rd->version accordingly.
> >
> > This was my first idea too but I found it ugly to tamper with
> > structures which are clearly the responsibility of the caller.
> > However this is a recovery case so... maybe it is necessary.
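[Editorial note: the trace above is just arithmetic on the page and node sizes, so it can be checked with a small model. This is a hypothetical back-of-envelope calculation; `run_trace` is an invented helper, not the real jffs2_flash_writev logic.]

```c
/* Model the failing trace: page size 512, wbuf already holds 500
 * bytes, and a 700-byte node is written. The wbuf is topped up and
 * flushed successfully, then the ECC page write for the remainder
 * fails, so *retlen only reports the bytes the wbuf absorbed. */

enum { PAGE_SIZE = 512 };

struct trace { int donelen; int retlen; int remaining; };

struct trace run_trace(int wbuf_used, int node_len, int ecc_write_fails)
{
	struct trace t = { 0, 0, 0 };
	int fill = PAGE_SIZE - wbuf_used;  /* bytes needed to fill wbuf */

	t.donelen = fill;                  /* 512 - 500 = 12 */
	/* jffs2_flush_wbuf() succeeds: the filled page reaches flash. */
	t.remaining = node_len - fill;     /* 700 - 12 = 688, > 1 page */
	if (ecc_write_fails)
		t.retlen = t.donelen;      /* only the wbuf bytes count */
	else
		t.retlen = node_len;
	return t;
}
```

Running it with the numbers from the trace gives donelen = retlen = 12 and 688 bytes of the node never written, which is the partial-write condition that makes jffs2_write_dnode add the obsolete "dirty" ref while nextblock has already been refiled.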
> Well we can always turn it around and make them always the
> responsibility of the callee instead -- set them _always_ in
> jffs2_write_{dirent,dnode}?

Ok, let's do that.

bye
Estelle
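[Editorial note: the agreed fix is for the callee to stamp the version field itself, so a retried node always outranks the failed attempt. A minimal sketch of the idea, assuming only the f->highest_version field named in the thread; `inode_info` and `stamp_version` are invented illustration names, not the real jffs2_write_dnode code.]

```c
/* Callee-side version stamping: each write (including a retry after
 * a failed write) gets a strictly higher version number, so the
 * retried node supersedes whatever partial copy hit the flash. */

struct inode_info { unsigned int highest_version; };

unsigned int stamp_version(struct inode_info *f)
{
	/* The callee assigns ri->version / rd->version from here;
	 * the caller never touches it. */
	return ++f->highest_version;
}
```

This removes the caller's burden of re-checking f->highest_version after reallocating space in the recovery path.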