public inbox for linux-mtd@lists.infradead.org
* NAND and JFFS2 crash
@ 2003-04-24 10:22 simon
  2003-04-24 10:50 ` David Woodhouse
  2003-04-28  8:54 ` Thomas Gleixner
  0 siblings, 2 replies; 4+ messages in thread
From: simon @ 2003-04-24 10:22 UTC
  To: linux-mtd

Thomas,  

I checked into what you had said. The filesystem in question is the
root filesystem and it gets mounted and dismounted at startup and
shutdown. I cannot see how this could be my problem. As you
seem to be a busy man I thought I would not bother you again and
would try an update at a later date.

Last week I downloaded a new CVS tree. I create my SMC data by
booting the system off a hard disk running Linux. I first use dd to
copy the hard disk boot partition to the SMC. I noticed all these
messages basically saying that writing NAND without ECC was a bad
idea. In my NAND-specific driver I set up the mtd_info structure for
soft ECC. However, there appears to be a new field, useecc, which
only appears to be used by JFFS2. I did not know what I was
expected to do here, so I modified my driver to set this and the
associated bit positions. Because I use partitions I had to modify
mtdpart to copy this information to the mtd_info structure which is
set up on a partition basis. Now I could boot from the hard disk and
copy my boot disk to the SMC with no problem. I then erased and
created a new JFFS2 filesystem on another partition and copied
all the files for the root filesystem.

I then booted from the SMC and, although I got a few

Empty flash at 0x00469ffcb ends at 0x0046a000 

messages all seemed ok. The root file system was mounted and I  
got the login prompt. However when I started to log in I got a crash. 

kernel BUG at gc.c:140!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c018bb28>]    Not tainted
EFLAGS: 00010296
eax: 0000003f   ebx: 000000d4   ecx: c0262220   edx: 0000c200
esi: 000000d4   edi: 0000106e   ebp: cffc04cc   esp: cfbc5f1c
ds: 0018   es: 0018   ss: 0018
Process jffs2_gcd_mtd2 (pid: 22, stackpage=cfbc5000)
Stack: 00000000 c0111ce6 cfbc5f50 cfbc4000 cfe6a120 cfe6a120 cfbc4000 00000000
       cfbc4000 00000000 cfbc4000 cffc04cc cfbc4564 c018ea16 cffc04cc cfbc4574
       cffc04cc 00000001 00000000 00000080 00000000 00000000 00000000 00000000
Call Trace:    [<c0111ce6>] [<c018ea16>] [<c0108be6>] [<c018e890>] [<c01073f6>]
  [<c018e890>]

Code: 0f 0b 8c 00 b9 8f 25 c0 8b 45 08 8b 55 08 40 52 89 45 08 55


I have noticed that someone else posted a similar crash to the list and
you suggested sending a dump of the SMC.

I would like to know if you could assist me in the same way. If so,
do you need a dump of the whole SMC or just the JFFS2 partition?

While playing about with this I also noticed a message similar to

jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0
read 0xffffffff calculated 0xdec8161b

but the routine was jffs2_scan_inode_node, so I guess I am still
losing data somewhere?

To be able to use this technology I need to make it reliable. Can
you suggest how I might find the cause of this problem?

Enable a specific debug level?
Check hardware by writing patterns via the raw device?
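
As a rough idea of what the pattern check could look like, here is a
minimal sketch in C. It only covers generating and verifying the
pattern in memory. The function names (pattern_byte, pattern_fill,
pattern_first_mismatch) are made up for illustration and are not part
of the MTD code. Erasing the block, writing the buffer to the raw
device and reading it back are assumed to happen elsewhere and are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pattern byte for a given offset. The high
 * offset bits are mixed in so that data landing at the wrong address
 * is likely to be noticed, not only flipped bits. */
static uint8_t pattern_byte(size_t offset, uint8_t seed)
{
    return (uint8_t)(offset ^ (offset >> 8) ^ seed);
}

/* Fill buf with the test pattern before writing it to the device. */
void pattern_fill(uint8_t *buf, size_t len, uint8_t seed)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = pattern_byte(i, seed);
}

/* Check data read back from the device against the pattern.
 * Returns the offset of the first mismatch, or -1 if all bytes match. */
long pattern_first_mismatch(const uint8_t *buf, size_t len, uint8_t seed)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != pattern_byte(i, seed))
            return (long)i;
    }
    return -1;
}
```

The intended use would be to fill a buffer of one erase block, write
it, read it back into a second buffer and call pattern_first_mismatch
on that. Repeating this with a few different seeds should make a stuck
data bit easier to spot.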




Many Thanks 


Simon 


On 6 Jan 2003, at 19:59, Thomas Gleixner wrote: 

> On Monday 06 January 2003 18:04, simon@baydel.com wrote:
> > I download the CVS stuff mid December and again today. The
> > hardware ran ok before and could use jffs2 without errors but as I
> > added files it was slow and I could not make file systems on
> > partitions which contained bad blocks.
> >
> > The new CVS code seems to be much quicker and I can erase,
> > mount and copy files to my new filesystem without error. I have set
> > up the specific driver to do soft ecc. I noticed that when I reboot
> > the system and the filesystem gets mounted I get errors. The more
> > writes that occur the more errors I seem to get. I ran a test for a
> > week or so over the break which generated log files. A reboot after
> > this produced thousands of errors but the filesystem seemed ok.
> >
> > The errors are something like
> >
> > Empty flash at 0x00469ffcb ends at 0x0046a000
> This happens due to NAND specific timed buffer flushing. JFFS2 fills
> up the write buffer to a full page boundary with 0xff and writes out
> the buffer to the chip, if you have no consecutive write within 2
> seconds. This is done to ensure, that data is written to FLASH. This
> fill looks like empty FLASH on mount. So JFFS2 is wondering why there
> is data after the "empty" FLASH. No reason to worry.
>
> > or
> >
> > jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0 read
> > 0xffffffff calculated 0xdec8161b
> This happens, if the write buffer is not written to FLASH before you
> power down your system without umount. Then the write buffer is lost
> and you get this error on mount. This indicates, that you may have
> lost data.
>
> > I was wondering if any of you could shed any light on this.
>
> --
> Thomas
> ________________________________________________________________________
> linutronix - competence in embedded & realtime linux
> http://www.linutronix.de mail: tglx@linutronix.de
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
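
The padding behaviour Thomas describes in the quoted message (filling
the write buffer with 0xff up to a page boundary before a timed flush)
can be illustrated with a small sketch in C. This is not the JFFS2
code. The names pad_to_page and erased_run and the page size used in
the example are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pad a partly filled write buffer with 0xff up to the next page
 * boundary. Returns the number of bytes that would be written out.
 * The caller must make sure buf has room for the padded length. */
size_t pad_to_page(uint8_t *buf, size_t used, size_t page_size)
{
    assert(buf != NULL && page_size > 0);
    size_t rem = used % page_size;
    if (rem == 0)
        return used;            /* already on a page boundary */
    size_t padded = used + (page_size - rem);
    memset(buf + used, 0xff, padded - used);
    return padded;
}

/* Length of the run of 0xff bytes starting at offset start. A scan
 * that meets such a run cannot tell padding from erased flash. */
size_t erased_run(const uint8_t *buf, size_t len, size_t start)
{
    assert(buf != NULL);
    size_t i = start;
    while (i < len && buf[i] == 0xff)
        i++;
    return i - start;
}
```

With a 512 byte page and 300 bytes of node data in the buffer, the
flush would write 512 bytes, the last 212 of them 0xff. On the next
mount that tail looks like empty flash followed by valid data in the
next page, which matches the "Empty flash at ... ends at ..." message.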


__________________________

Simon Haynes - Baydel 
Phone : 44 (0) 1372 378811
Email : simon@baydel.com
__________________________


* Re: NAND and JFFS2 crash
  2003-04-24 10:22 NAND and JFFS2 crash simon
@ 2003-04-24 10:50 ` David Woodhouse
  2003-04-24 21:08   ` David Woodhouse
  2003-04-28  8:54 ` Thomas Gleixner
  1 sibling, 1 reply; 4+ messages in thread
From: David Woodhouse @ 2003-04-24 10:50 UTC
  To: simon; +Cc: linux-mtd, postmaster

You are now barred from posting directly to the linux-mtd list. You can
still post -- but all mail you send will require manual approval until
you're taken off the blacklist.

I posted a message to the list asking members to obey the guidelines,
which are just a documentation of normal acceptable practice. 

I rejected your broken messages without References headers, with a URL
explaining why.

Yet you deliberately mangled the subject of your message to get round
the heuristics which trapped your previous broken messages, so you could
post a message to the list which broke the threading.

This behaviour is not acceptable.

If you have genuine difficulty in finding a mail client which isn't
broken, approach me privately and I'll help -- there are plenty of them
out there, even for Windows. I think that even newer versions of Pegasus
are OK. But don't attempt to work around the filters.

Working around the need for manual approval by posting from a different
address will be considered an abuse of my system and reported as such to
your ISP.

-- 
dwmw2


* Re: NAND and JFFS2 crash
  2003-04-24 10:50 ` David Woodhouse
@ 2003-04-24 21:08   ` David Woodhouse
  0 siblings, 0 replies; 4+ messages in thread
From: David Woodhouse @ 2003-04-24 21:08 UTC
  To: simon; +Cc: linux-mtd, postmaster

On Thu, 2003-04-24 at 11:50, David Woodhouse wrote:
> Yet you deliberately mangled the subject of your message to get round
> the heuristics which trapped your previous broken messages, so you could
> post a message to the list which broke the threading.
> 
> This behaviour is not acceptable.

Upon further investigation, it seems that this wasn't a deliberate
attempt to circumvent the filters, but a genuine mistake or
misunderstanding of their nature.

So I was perhaps a little unfair to single you out in the interest of
making an example for others not to follow -- the filter is removed and
you are again only subject to the normal sanity checking on incoming
list mail. I'm sorry for being quite so harsh.

Please do refer to the 'top-posting' and 'quote-selectively' sections of
the URL to which I referred, though. 

-- 
dwmw2


* Re: NAND and JFFS2 crash
  2003-04-24 10:22 NAND and JFFS2 crash simon
  2003-04-24 10:50 ` David Woodhouse
@ 2003-04-28  8:54 ` Thomas Gleixner
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2003-04-28  8:54 UTC
  To: simon, linux-mtd

On Thursday 24 April 2003 12:22, simon@baydel.com wrote:
> Last week I downloaded a new CVS tree. I create my SMC data by
> booting the system off a hard disk running Linux. I first use dd to
> copy the hard disk boot partition to the SMC. I noticed all these
> messages basically saying that writing NAND without ECC was a bad
> idea. In my NAND-specific driver I set up the mtd_info structure for
> soft ECC. However, there appears to be a new field, useecc, which
> only appears to be used by JFFS2. I did not know what I was
> expected to do here, so I modified my driver to set this and the
> associated bit positions. Because I use partitions I had to modify
> mtdpart to copy this information to the mtd_info structure which is
> set up on a partition basis. Now I could boot from the hard disk and
> copy my boot disk to the SMC with no problem. I then erased and
> created a new JFFS2 filesystem on another partition and copied
> all the files for the root filesystem.

As I mentioned before, and as documented in the NAND docs, either use the
bootloader or use nandwrite.c to write an fs image with ECC. There is no need
to hack around.

> I then booted from the smc and although I got a few
> Empty flash at 0x00469ffcb ends at 0x0046a000
> messages all seemed ok. The root file system was mounted and I
> got the login prompt. However when I started to log in I got a crash.
> kernel BUG at gc.c:140!

I have a binary image now to check this. I will post more information once I
have solved the problem.

-- 
Thomas
________________________________________________________________________
linutronix - competence in embedded & realtime linux
http://www.linutronix.de
mail: tglx@linutronix.de

