* jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-12 12:28 UTC
To: linux-mtd

Dear Sirs!

I work with an embedded system based on a PowerPC 405EP with 16 MB of
RAM and 32 MB of NAND flash. The root fs lives on the NAND flash and is
formatted as jffs2.

It worked just fine for a few months, then the system suddenly became
unusable. Since the root fs is located on the NAND flash, the system
could no longer boot because of the error. I can still boot the system
by loading the kernel over TFTP and mounting the root fs over NFS.

When the system runs with the kernel loaded over TFTP and the root fs
mounted over NFS, everything seems fine until the command
"mount -t jffs2 /dev/mtdblock0 /mnt" is issued. After this command the
system reports "check_node_data: wrong data CRC" and then
"jffs2_gcd_mtd0 invoked oom-killer". Then it reboots (see the log
below). The problem occurs with the 2.6.18-rc4 kernel and with
2.6.19.2-1 (the latest one I can build for my board). The kernel config
(that of 2.6.19.2-1) is attached at the very bottom of this letter.

Please advise on steps to fix the problem or a direction in which to
begin debugging.
Thank you, best regards, Log: bash-2.05b# mount -t jffs2 /dev/mtdblock0 /mnt bash-2.05b# cat /proc/meminfo MemTotal: 14088 kB MemFree: 1356 kB Buffers: 0 kB Cached: 5332 kB SwapCached: 0 kB Active: 4384 kB Inactive: 2336 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 0 kB Writeback: 0 kB AnonPages: 1404 kB Mapped: 2224 kB Slab: 5024 kB SReclaimable: 588 kB SUnreclaim: 4436 kB PageTables: 152 kB NFS_Unstable: 0 kB Bounce: 0 kB CommitLimit: 7044 kB Committed_AS: 15080 kB VmallocTotal: 743424 kB VmallocUsed: 4684 kB VmallocChunk: 738600 kB bash-2.05b# cd /mnt/ bash-2.05b# ls -l total 0 drwxr-xr-x 2 500 500 0 Nov 22 2005 bin drwxrwxr-x 2 500 500 0 Nov 15 2005 boot drwxr-xr-x 2 500 500 0 Dec 31 1969 dev drwxr-xr-x 20 500 500 0 Dec 31 1969 etc drwxrwxr-x 2 root root 0 Nov 22 2005 info drwxr-xr-x 5 500 500 0 Dec 31 1969 lib drwxrwxr-x 4 500 500 0 Nov 22 2005 man drwxr-xr-x 2 root root 0 Dec 31 1969 mnt drwxrwxr-x 2 root root 0 Jul 28 2005 proc drwxrwxr-x 3 500 500 0 Dec 31 1969 root drwxr-xr-x 2 500 500 0 Dec 31 1969 sbin drwxrwxr-x 2 500 500 0 Nov 15 2005 share drwxr-xr-x 307 root root 0 Jan 1 1970 tmp drwxr-xr-x 2 root root 0 Dec 31 1969 tz-64 drwxr-xr-x 10 500 500 0 Nov 21 2005 usr drwxr-xr-x 8 500 500 0 Dec 31 1969 var bash-2.05b# cat /proc/meminfo MemTotal: 14088 kB MemFree: 796 kB Buffers: 0 kB Cached: 4556 kB SwapCached: 0 kB Active: 4252 kB Inactive: 1700 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 0 kB Writeback: 0 kB AnonPages: 1412 kB Mapped: 2244 kB Slab: 6352 kB SReclaimable: 592 kB SUnreclaim: 5760 kB PageTables: 152 kB NFS_Unstable: 0 kB Bounce: 0 kB CommitLimit: 7044 kB Committed_AS: 15088 kB VmallocTotal: 743424 kB VmallocUsed: 4684 kB VmallocChunk: 738600 kB bash-2.05b# JFFS2 notice: (450) check_node_data: wrong data CRC in data node at 0x03b4bc00: read 0xc7ce044f, calculated 0xaba67eba. Dec 31 19:01:30 10 kernel: JFFS2 notice: (450) check_node_data: wrong data CRC in data node at 0x03b4bc00: read 0xc7ce044f, calculated 0xaba67eba. 
bash-2.05b# jffs2_gcd_mtd0 invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 Call Trace: [C06A1D30] [C00089FC] (unreliable) [C06A1D60] [C0039014] [C06A1D90] [C003A3A0] [C06A1DE0] [C004D044] [C06A1E10] [C004CD58] [C06A1E20] [C00CA314] [C06A1E30] [C00CBD9C] [C06A1EA0] [C00CC8D4] [C06A1F10] [C00D02EC] [C06A1F50] [C00D18F8] [C06A1FF0] [C0004BB8] Mem-info: DMA per-cpu: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Active:336 inactive:0 dirty:0 writeback:0 unstable:0 free:127 slab:2824 mapped:0 pagetables:34 DMA free:508kB min:508kB low:632kB high:760kB active:1344kB inactive:0kB present:16256kB pages_scanned:2351 all_unreclaimable? yes lowmem_reserve[]: 0 0 DMA: 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508kB Free swap: 0kB 4096 pages of RAM 0 pages of HIGHMEM 148 free pages 575 reserved pages 0 pages shared 0 pages swap cached Out of Memory: Kill process 391 (portmap) score 466 and children. Out of memory: Killed process 391 (portmap). klogd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Call Trace: [C0D1BCD0] [C00089FC] (unreliable) [C0D1BD00] [C0039014] [C0D1BD30] [C003A3A0] [C0D1BD80] [C003C248] [C0D1BE00] [C0037808] [C0D1BE30] [C0041A00] [C0D1BE80] [C0009CF4] [C0D1BF40] [C0003094] Mem-info: DMA per-cpu: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Active:336 inactive:0 dirty:0 writeback:0 unstable:0 free:126 slab:2825 mapped:0 pagetables:34 DMA free:504kB min:508kB low:632kB high:760kB active:1344kB inactive:0kB present:16256kB pages_scanned:2351 all_unreclaimable? 
yes lowmem_reserve[]: 0 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 504kB Free swap: 0kB 4096 pages of RAM 0 pages of HIGHMEM 147 free pages 575 reserved pages 0 pages shared 0 pages swap cached klogd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Call Trace: [C0D1BCD0] [C00089FC] (unreliable) [C0D1BD00] [C0039014] [C0D1BD30] [C003A3A0] [C0D1BD80] [C003C248] [C0D1BE00] [C0037808] [C0D1BE30] [C0041A00] [C0D1BE80] [C0009CF4] [C0D1BF40] [C0003094] Mem-info: DMA per-cpu: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Active:313 inactive:26 dirty:0 writeback:0 unstable:0 free:127 slab:2826 mapped:1 pagetables:30 DMA free:508kB min:508kB low:632kB high:760kB active:1252kB inactive:104kB present:16256kB pages_scanned:2040 all_unreclaimable? yes lowmem_reserve[]: 0 0 DMA: 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508kB Free swap: 0kB 4096 pages of RAM 0 pages of HIGHMEM 148 free pages 575 reserved pages 2 pages shared 0 pages swap cached Out of Memory: Kill process 422 (login) score 77 and children. Out of memory: Killed process 423 (bash). 
Dec 31 19:01:48 10 kernel: jffs2_gcd_mtd0 invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 Dec 31 19:01:48 10 kernel: Call Trace: Dec 31 19:01:48 10 kernel: [C06A1D30] [C00089FC] (unreliable) Dec 31 19:01:48 10 kernel: [C06A1D60] [C0039014] Dec 31 19:01:48 10 kernel: [C06A1D90] [C003A3A0] Dec 31 19:01:48 10 kernel: [C06A1DE0] [C004D044] Dec 31 19:01:48 10 kernel: [C06A1E10] [C004CD58] Dec 31 19:01:48 10 kernel: [C06A1E20] [C00CA314] Dec 31 19:01:48 10 kernel: [C06A1E30] [C00CBD9C] Dec 31 19:01:48 10 kernel: [C06A1EA0] [C00CC8D4] Dec 31 19:01:48 10 kernel: [C06A1F10] [C00D02EC] Dec 31 19:01:48 10 kernel: [C06A1F50] [C00D18F8] Dec 31 19:01:48 10 kernel: [C06A1FF0] [C0004BB8] Dec 31 19:01:48 10 kernel: Mem-info: Dec 31 19:01:48 10 kernel: DMA per-cpu: Dec 31 19:01:48 10 kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Dec 31 19:01:48 10 kernel: Active:336 inactive:0 dirty:0 writeback:0 unstable:0 free:127 slab:2824 mapped:0 pagetables:34 Dec 31 19:01:48 10 kernel: DMA free:508kB min:508kB low:632kB high:760kB active:1344kB inactive:0kB present:16256kB pages_scanned:2351 all_unreclaimable? yes Dec 31 19:01:48 10 kernel: lowmem_reserve[]: 0 0 Dec 31 19:01:48 10 kernel: DMA: 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508kB Dec 31 19:01:48 10 kernel: Free swap: 0kB Dec 31 19:01:48 10 kernel: 4096 pages of RAM Dec 31 19:01:48 10 kernel: 0 pages of HIGHMEM Dec 31 19:01:48 10 kernel: 148 free pages Dec 31 19:01:48 10 kernel: 575 reserved pages Dec 31 19:01:48 10 kernel: 0 pages shared Dec 31 19:01:48 10 kernel: 0 pages swap cached Dec 31 19:01:49 10 kernel: Out of Memory: Kill process 391 (portmap) score 466 and children. Dec 31 19:01:49 10 kernel: Out of memory: Killed process 391 (portmap). 
Dec 31 19:01:49 10 kernel: klogd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Dec 31 19:01:49 10 kernel: Call Trace: Dec 31 19:01:49 10 kernel: [C0D1BCD0] [C00089FC] (unreliable) Dec 31 19:01:49 10 kernel: [C0D1BD00] [C0039014] Dec 31 19:01:49 10 kernel: [C0D1BD30] [C003A3A0] Dec 31 19:01:49 10 kernel: [C0D1BD80] [C003C248] Dec 31 19:01:49 10 kernel: [C0D1BE00] [C0037808] Dec 31 19:01:49 10 kernel: [C0D1BE30] [C0041A00] Dec 31 19:01:49 10 kernel: [C0D1BE80] [C0009CF4] Dec 31 19:01:49 10 kernel: [C0D1BF40] [C0003094] Dec 31 19:01:49 10 kernel: Mem-info: Dec 31 19:01:49 10 kernel: DMA per-cpu: Dec 31 19:01:49 10 kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Dec 31 19:01:49 10 kernel: Active:336 inactive:0 dirty:0 writeback:0 unstable:0 free:126 slab:2825 mapped:0 pagetables:34 Dec 31 19:01:49 10 kernel: DMA free:504kB min:508kB low:632kB high:760kB active:1344kB inactive:0kB present:16256kB pages_scanned:2351 all_unreclaimable? yes Dec 31 19:01:49 10 kernel: lowmem_reserve[]: 0 0 Dec 31 19:01:49 10 kernel: DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 504kB Dec 31 19:01:49 10 kernel: Free swap: 0kB Dec 31 19:01:49 10 kernel: 4096 pages of RAM Dec 31 19:01:49 10 kernel: 0 pages of HIGHMEM Dec 31 19:01:49 10 kernel: 147 free pages Dec 31 19:01:49 10 kernel: 575 reserved pages Dec 31 19:01:49 10 kernel: 0 pages shared Dec 31 19:01:49 10 kernel: 0 pages swap cached Dec 31 19:01:49 10 kernel: klogd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0 Dec 31 19:01:49 10 kernel: Call Trace: Dec 31 19:01:49 10 kernel: [C0D1BCD0] [C00089FC] (unreliable) Dec 31 19:01:49 10 kernel: [C0D1BD00] [C0039014] Dec 31 19:01:49 10 kernel: [C0D1BD30] [C003A3A0] Dec 31 19:01:49 10 kernel: [C0D1BD80] [C003C248] Dec 31 19:01:49 10 kernel: [C0D1BE00] [C0037808] Dec 31 19:01:49 10 kernel: [C0D1BE30] [C0041A00] Dec 31 19:01:49 10 kernel: [C0D1BE80] [C0009CF4] Dec 31 19:01:49 10 kernel: [C0D1BF40] 
[C0003094] Dec 31 19:01:49 10 kernel: Mem-info: Dec 31 19:01:49 10 kernel: DMA per-cpu: Dec 31 19:01:49 10 kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Dec 31 19:01:49 10 kernel: Active:313 inactive:26 dirty:0 writeback:0 unstable:0 free:127 slab:2826 mapped:1 pagetables:30 Dec 31 19:01:49 10 kernel: DMA free:508kB min:508kB low:632kB high:760kB active:1252kB inactive:104kB present:16256kB pages_scanned:2040 all_unreclaimable? yes Dec 31 19:01:49 10 kernel: lowmem_reserve[]: 0 0 Dec 31 19:01:49 10 kernel: DMA: 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508kB Dec 31 19:01:49 10 kernel: Free swap: 0kB Dec 31 19:01:49 10 kernel: 4096 pages of RAM Dec 31 19:01:49 10 kernel: 0 pages of HIGHMEM Dec 31 19:01:49 10 kernel: 148 free pages Dec 31 19:01:49 10 kernel: 575 reserved pages Dec 31 19:01:49 10 kernel: 2 pages shared Dec 31 19:01:49 10 kernel: 0 pages swap cached Dec 31 19:01:49 10 kernel: Out of Memory: Kill process 422 (login) score 77 and children. Dec 31 19:01:49 10 kernel: Out of memory: Killed process 423 (bash). INIT: cannot fork, retry.. 
Config: [igor@tps linux-2.6.19.2-1]$ cat .config | grep = CONFIG_MMU=y CONFIG_GENERIC_HARDIRQS=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_PPC=y CONFIG_PPC32=y CONFIG_GENERIC_NVRAM=y CONFIG_GENERIC_FIND_NEXT_BIT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SYSVIPC=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_SYSCTL_SYSCALL=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_SHMEM=y CONFIG_SLAB=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_RT_MUTEXES=y CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_KMOD=y CONFIG_BLOCK=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y CONFIG_DEFAULT_AS=y CONFIG_DEFAULT_IOSCHED="anticipatory" CONFIG_40x=y CONFIG_4xx=y CONFIG_PPChameleonEVB=y CONFIG_TZ_64=y CONFIG_IBM_OCP=y CONFIG_405EP=y CONFIG_PPC_GEN550=y CONFIG_UART0_TTYS0=y CONFIG_NOT_COHERENT_CACHE=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_HZ_100=y CONFIG_HZ=100 CONFIG_PREEMPT_NONE=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_BINFMT_ELF=y CONFIG_SECCOMP=y CONFIG_ISA_DMA_API=y CONFIG_HIGHMEM_START=0xfe000000 CONFIG_LOWMEM_SIZE=0x30000000 CONFIG_KERNEL_START=0xc0000000 CONFIG_TASK_SIZE=0x80000000 CONFIG_CONSISTENT_START=0xff100000 CONFIG_CONSISTENT_SIZE=0x00200000 CONFIG_BOOT_LOAD=0x00400000 CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_XFRM=y CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_FIB_HASH=y CONFIG_IP_PNP=y CONFIG_IP_PNP_DHCP=y CONFIG_SYN_COOKIES=y CONFIG_INET_XFRM_MODE_TRANSPORT=y CONFIG_INET_XFRM_MODE_TUNNEL=y CONFIG_INET_XFRM_MODE_BEET=y 
CONFIG_INET_DIAG=y CONFIG_INET_TCP_DIAG=y CONFIG_TCP_CONG_CUBIC=y CONFIG_DEFAULT_TCP_CONG="cubic" CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_MTD=y CONFIG_MTD_DEBUG=y CONFIG_MTD_DEBUG_VERBOSE=2 CONFIG_MTD_PARTITIONS=y CONFIG_MTD_CHAR=y CONFIG_MTD_BLOCK=y CONFIG_MTD_MAP_BANK_WIDTH_1=y CONFIG_MTD_MAP_BANK_WIDTH_2=y CONFIG_MTD_MAP_BANK_WIDTH_4=y CONFIG_MTD_CFI_I1=y CONFIG_MTD_CFI_I2=y CONFIG_MTD_COMPLEX_MAPPINGS=y CONFIG_MTD_NAND=y CONFIG_MTD_NAND_IDS=y CONFIG_MTD_NAND_PPCHAMELEONEVB=y CONFIG_BLK_DEV_LOOP=y CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=4 CONFIG_BLK_DEV_RAM_SIZE=4096 CONFIG_BLK_DEV_RAM_BLOCKSIZE=1024 CONFIG_BLK_DEV_INITRD=y CONFIG_NETDEVICES=y CONFIG_NET_ETHERNET=y CONFIG_MII=y CONFIG_IBM_EMAC=y CONFIG_IBM_EMAC_RXB=64 CONFIG_IBM_EMAC_TXB=32 CONFIG_IBM_EMAC_POLL_WEIGHT=32 CONFIG_IBM_EMAC_RX_COPY_THRESHOLD=256 CONFIG_IBM_EMAC_RX_SKB_HEADROOM=0 CONFIG_INPUT=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_8250_RUNTIME_UARTS=4 CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 CONFIG_HW_RANDOM=y CONFIG_GEN_RTC=y CONFIG_HWMON=y CONFIG_FIRMWARE_EDID=y CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_XIP=y CONFIG_FS_XIP=y CONFIG_FS_MBCACHE=y CONFIG_FS_POSIX_ACL=y CONFIG_INOTIFY=y CONFIG_INOTIFY_USER=y CONFIG_DNOTIFY=y CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_PROC_SYSCTL=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_JFFS2_FS=y CONFIG_JFFS2_FS_DEBUG=0 CONFIG_JFFS2_FS_WRITEBUFFER=y CONFIG_JFFS2_COMPRESSION_OPTIONS=y CONFIG_JFFS2_ZLIB=y CONFIG_JFFS2_RTIME=y CONFIG_JFFS2_CMODE_PRIORITY=y CONFIG_CRAMFS=y CONFIG_NFS_FS=y CONFIG_NFS_V3=y CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y CONFIG_ROOT_NFS=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_NFS_ACL_SUPPORT=y CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y CONFIG_PARTITION_ADVANCED=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" 
CONFIG_NLS_ASCII=y CONFIG_CRC32=y CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=y CONFIG_PLIST=y CONFIG_ENABLE_MUST_CHECK=y CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_DETECT_SOFTLOCKUP=y CONFIG_FORCED_INLINING=y CONFIG_PPC_OCP=y CONFIG_CRYPTO=y CONFIG_CRYPTO_ALGAPI=y CONFIG_CRYPTO_BLKCIPHER=y CONFIG_CRYPTO_MANAGER=y CONFIG_CRYPTO_MD5=y CONFIG_CRYPTO_ECB=m CONFIG_CRYPTO_CBC=y CONFIG_CRYPTO_DES=y Best regards, Igor Marnat mailto:marny@rambler.ru ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Artem Bityutskiy @ 2007-03-12 13:10 UTC
To: Igor Marnat
Cc: linux-mtd

On Mon, 2007-03-12 at 15:28 +0300, Igor Marnat wrote:
> Dear Sirs!
>
> I work with the embedded system based on PowerPC 405 EP with 16Mb RAM
> and 32 Mb NAND FLASH. Rootfs lives on NAND FLASH and is formatted
> using jffs2.
>
> A few months it worked just fine then suddenly the system became
> unusable. Since root fs locates on NAND flash, system became
> unloadable because of error. Anyway I can boot the
> system, loading the kernel by TFTP and having mounted root fs by NFS.
>
> When the system works with the kernel loaded by tftp and fs mounted by
> NFS, it all seems to be fine until the command
> "mount -t jffs2 /dev/mtdblock0 /mnt" issued. After this command system tells me
> "check_node_data: wrong data CRC" and then "jffs2_gcd_mtd0 invoked
> oom-killer". Then it goes to reboot (see the log below). The problem occurs with
> 2.6.18-rc4 kernel and with 2.6.19.2-1 (the latest one I can build for my
> board).

Well, in general this can happen, since JFFS2 needs a lot of memory.
But 16 MiB of RAM for 32 MiB of flash looks like enough. Does something
else consume so much memory in your system that too little is left for
JFFS2? Check how much RAM you have available for JFFS2.

--
Best regards,
Artem Bityutskiy (Битюцкий Артём)
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-19 5:45 UTC
To: Artem Bityutskiy
Cc: linux-mtd

Hello Artem and List!

After some investigation I found some details of the problem; I hope
they help.

Once /dev/mtdblock0 is mounted, the jffs2 kernel thread begins garbage
collection. In the function jffs2_get_inode_nodes it tries to assemble
the chain of nodes of an inode (if I understood it correctly). I
printed the length of each chain of nodes: it is generally 2 or 3, and
in rare cases bigger than 3.

My file system on this NAND is corrupted in (at least) 2 places. At the
first one, the loop "while (valid_ref)" in jffs2_get_inode_nodes is
passed about 28000 times, and (I guess) the system considers this chain
to consist of 28000 items. This chain successfully passes the CRC check
of the node header in jffs2_get_inode_nodes, but it doesn't pass the
CRC check later. The kernel writes the message "JFFS2 notice: (450)
check_node_data: wrong data CRC in data node at" and processing
continues.

But it cannot get past the second corrupted place. It never meets the
NULL node, so valid_ref is never NULL. Each node passes the CRC check
of the node header, so the loop continues until the system has
exhausted all available memory (there are calls to kmalloc in the
loop). When the chain length grows to 110000 entries, the oom killer
begins to stop kernel threads, and a bit later the whole system
reboots.

This is a big problem for me. It is not a problem when some file gets
corrupted, but that shouldn't prevent the whole system from running,
especially when the affected fs is not the root fs.

So the questions are:

1. Could the function jffs2_get_inode_nodes be improved? It would be
   possible to pre-calculate the chain length before trying to read the
   chain into memory and allocating memory for it. The chain could be
   considered broken if its length exceeds some pre-defined maximum.
2. What can I do with the broken filesystem? Can I repair it?
3. Any other ideas or considerations?

I tried this on several kernel versions, including 2.6.20.3.

Thank you for your help,

Best regards,
Igor Marnat
mailto:marny@rambler.ru
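
The limit proposed in question 1 above can be sketched in user-space C.
This is only an illustration under stated assumptions: struct node_ref
and chain_length_bounded are hypothetical names, not the JFFS2
structures or the code in jffs2_get_inode_nodes, and the maximum is
whatever the caller chooses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node reference; not the kernel's struct. */
struct node_ref {
	struct node_ref *next;	/* next node of the same inode, NULL at the end */
};

/*
 * Count the nodes in a chain, giving up after max_nodes steps.
 * Returns the chain length, or -1 if the limit is exceeded. The limit
 * also catches a chain that loops back on itself and never reaches NULL.
 */
static long chain_length_bounded(const struct node_ref *head, long max_nodes)
{
	long n = 0;

	assert(max_nodes > 0);
	for (const struct node_ref *ref = head; ref != NULL; ref = ref->next) {
		if (++n > max_nodes)
			return -1;	/* treat the chain as corrupted */
	}
	return n;
}
```

A caller would check for -1 before allocating anything per node, so a
broken chain costs a bounded walk instead of unbounded kmalloc calls.
How to pick the maximum (for example from the flash size) is left open
here.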
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Thomas Gleixner @ 2007-03-19 7:19 UTC
To: Igor Marnat
Cc: linux-mtd

On Mon, 2007-03-19 at 08:45 +0300, Igor Marnat wrote:
> But it cannot pass the second corrupted place. It doesn't meet the
> NULL node, so the valid_ref is always not NULL. It passes crc check of node header,
> so the loop continues till the system exhausted all the
> memory available (there are calls to kmalloc in the loop). When the
> chain length grows up to 110000 entries, oom killer begins to stop
> kernel threads and a bit later all the system reboots.

Can you please dump the mtd partition with the corrupt data with
nanddump:

# nanddump -b -f nand.dmp /dev/mtdX

compress it with bz2 and put it somewhere to download.

Thanks,

	tglx
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-19 11:48 UTC
To: Thomas Gleixner
Cc: linux-mtd

Hello Thomas,

TG> Can you please dump the mtd partition with the corrupt data with
TG> nanddump:

TG> # nanddump -b -f nand.dmp /dev/mtdX

I put it at ftp://ftp.tz.ru/pub/nand.dmp.bz2.

Its size is 26 MB. nanddump didn't accept the options you provided, so
I made the dump with the command "nanddump /dev/mtd0 nand.dmp". Please
tell me if that is wrong.

I hope the image helps,

Best regards,
Igor Marnat
mailto:marny@rambler.ru
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Thomas Gleixner @ 2007-03-19 16:50 UTC
To: Igor Marnat
Cc: David Woodhouse, linux-mtd

On Mon, 2007-03-19 at 14:48 +0300, Igor Marnat wrote:
> Hello Thomas,
>
> TG> Can you please dump the mtd partition with the corrupt data with
> TG> nanddump:
>
> TG> # nanddump -b -f nand.dmp /dev/mtdX
>
> I put it to ftp://ftp.tz.ru/pub/nand.dmp.bz2.
>
> It's size is 26 MB. Nanddump didn't want to accept options you
> provided, so I did the dump with command "nanddump /dev/mtd0
> nand.dmp". Please tell me if it is wrong.

No. I probably looked at an old version.

Your flash is full of corrupted nodes. Most of them belong to the
files "messages" and "wtmp":

Invalid Inode node at 0x000253ec, totlen 0x00000130, #ino 1807, version 106033, isize 428154883, csize 4597, dsize 4597, offset 37875712
Invalid Inode node at 0x00025a14, totlen 0x00000046, #ino 8620, version 470658, isize 1019416371, csize 3596, dsize 3596, offset 5997846
Invalid Inode node at 0x00025e20, totlen 0x00000091, #ino 8620, version 470663, isize 5997941, csize 3438493594, dsize 4294927703, offset 5997864

Interestingly enough, the node CRCs of some of the nodes are intact, so
the corruption must have happened before writing to flash.

Does the patch below help?

	tglx

--------------------->

Subject: JFFS2: Check inode CRC first

Corrupted inodes need to be sorted out before anything else is done on
them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
index 7fb45bd..8a70590 100644
--- a/fs/jffs2/scan.c
+++ b/fs/jffs2/scan.c
@@ -952,7 +952,7 @@ static int jffs2_scan_inode_node(struct jffs2_sb_info *c, struct jffs2_erasebloc
 				 struct jffs2_raw_inode *ri, uint32_t ofs, struct jffs2_summary *s)
 {
 	struct jffs2_inode_cache *ic;
-	uint32_t ino = je32_to_cpu(ri->ino);
+	uint32_t crc, ino = je32_to_cpu(ri->ino);
 	int err;
 
 	D1(printk(KERN_DEBUG "jffs2_scan_inode_node(): Node at 0x%08x\n", ofs));
@@ -966,21 +966,22 @@ static int jffs2_scan_inode_node(struct jffs2_sb_info *c, struct jffs2_erasebloc
 	   Which means that the _full_ amount of time to get to proper write mode with GC
 	   operational may actually be _longer_ than before. Sucks to be me. */
 
+	/* Check the node CRC in any case. */
+	crc = crc32(0, ri, sizeof(*ri)-8);
+	if (crc != je32_to_cpu(ri->node_crc)) {
+		printk(KERN_NOTICE "jffs2_scan_inode_node(): CRC failed on "
+		       "node at 0x%08x: Read 0x%08x, calculated 0x%08x\n",
+		       ofs, je32_to_cpu(ri->node_crc), crc);
+		/*
+		 * We believe totlen because the CRC on the node
+		 * _header_ was OK, just the node itself failed.
+		 */
+		return jffs2_scan_dirty_space(c, jeb,
+					      PAD(je32_to_cpu(ri->totlen)));
+	}
+
 	ic = jffs2_get_ino_cache(c, ino);
 	if (!ic) {
-		/* Inocache get failed. Either we read a bogus ino# or it's just genuinely the
-		   first node we found for this inode. Do a CRC check to protect against the former
-		   case */
-		uint32_t crc = crc32(0, ri, sizeof(*ri)-8);
-
-		if (crc != je32_to_cpu(ri->node_crc)) {
-			printk(KERN_NOTICE "jffs2_scan_inode_node(): CRC failed on node at 0x%08x: Read 0x%08x, calculated 0x%08x\n",
-			       ofs, je32_to_cpu(ri->node_crc), crc);
-			/* We believe totlen because the CRC on the node _header_ was OK, just the node itself failed. */
-			if ((err = jffs2_scan_dirty_space(c, jeb, PAD(je32_to_cpu(ri->totlen)))))
-				return err;
-			return 0;
-		}
 		ic = jffs2_scan_make_ino_cache(c, ino);
 		if (!ic)
 			return -ENOMEM;
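
The three "Invalid Inode node" entries listed in the message above
share one visible inconsistency: csize is larger than totlen, although
the compressed data is supposed to fit inside the node. A minimal
sketch of such a plausibility test in user-space C follows. It is not
what the patch does; inode_node_sizes_plausible is a hypothetical
helper, and the 68-byte header size is an assumption about the on-flash
raw inode layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: size of the raw inode header that precedes the data. */
#define RAW_INODE_HDR_SIZE 68u

/*
 * Return true if a data node's compressed size can fit into the node,
 * i.e. header plus csize does not exceed totlen.
 */
static bool inode_node_sizes_plausible(uint32_t totlen, uint32_t csize)
{
	if (totlen < RAW_INODE_HDR_SIZE)
		return false;	/* node too short to even hold a header */
	return csize <= totlen - RAW_INODE_HDR_SIZE;
}
```

Applied to the values above (totlen 0x130 with csize 4597, totlen 0x46
with csize 3596, totlen 0x91 with csize 3438493594), the check rejects
all three nodes regardless of whether their CRCs match.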
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-20 21:02 UTC
To: linux-mtd-bounces, Thomas Gleixner
Cc: linux-mtd, David Woodhouse

Hello Thomas,

TG> Invalid Inode node at 0x000253ec, totlen 0x00000130, #ino 1807, version 106033, isize 428154883, csize 4597, dsize 4597, offset 37875712
TG> Invalid Inode node at 0x00025a14, totlen 0x00000046, #ino 8620, version 470658, isize 1019416371, csize 3596, dsize 3596, offset 5997846
TG> Invalid Inode node at 0x00025e20, totlen 0x00000091, #ino 8620, version 470663, isize 5997941, csize 3438493594, dsize 4294927703, offset 5997864

TG> Interestingly enough are the node CRCs of some of the nodes intact, so
TG> the corruption must have happened before writing to flash.

TG> Does the patch below help ?

Unfortunately no, it still crashes in the same way. Should I attach any
debug output or something?

--
Best regards,
Igor Marnat
mailto:marny@rambler.ru
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-21 8:14 UTC
To: Thomas Gleixner
Cc: linux-mtd, David Woodhouse

Hello Thomas,

TG> Interestingly enough are the node CRCs of some of the nodes intact, so
TG> the corruption must have happened before writing to flash.

The "while (valid_ref)" loop in jffs2_get_inode_nodes (the one that
becomes infinite) also contains CRC checking of nodes. They are checked
in the order in which they are processed.

It doesn't complain about the nodes' CRCs, so it seems that the CRCs
are OK. So I don't think that CRC checking of the chain of nodes helps
here.

--
Best regards,
Igor Marnat
mailto:marny@rambler.ru
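
The observation above, that the CRCs pass although the contents are
nonsense, is consistent with data that was already wrong when the CRC
was computed. The sketch below illustrates this in user-space C with a
toy node. struct toy_node and its helpers are hypothetical and much
simpler than a JFFS2 node; crc32_bitwise assumes the common reflected
CRC-32 polynomial with a caller-supplied seed and no final inversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, no final inversion. */
static uint32_t crc32_bitwise(uint32_t crc, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return crc;
}

/* Hypothetical toy node: a payload followed by the CRC of that payload. */
struct toy_node {
	uint8_t payload[16];
	uint32_t crc;
};

/* Compute the CRC over whatever the payload holds right now. */
static void toy_node_seal(struct toy_node *n)
{
	n->crc = crc32_bitwise(0, n->payload, sizeof(n->payload));
}

/* Return nonzero if the stored CRC matches the payload. */
static int toy_node_crc_ok(const struct toy_node *n)
{
	return crc32_bitwise(0, n->payload, sizeof(n->payload)) == n->crc;
}
```

If the payload is garbage before toy_node_seal runs, toy_node_crc_ok
still succeeds, because the CRC only proves that the bytes did not
change after sealing. A change made after sealing is detected. A CRC
check alone therefore cannot reject nodes whose fields were corrupted
in memory before they were written.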
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Thomas Gleixner @ 2007-03-21 9:17 UTC
To: Igor Marnat
Cc: linux-mtd, David Woodhouse

On Wed, 2007-03-21 at 11:14 +0300, Igor Marnat wrote:
> Hello Thomas,
>
> TG> Interrestingly enough are the node CRCs of some of the nodes intact, so
> TG> the corruption must have happened before writing to flash.
>
> Loop "while (valid_ref)" (those that become infinite) in jffs2_get_inode_nodes also contains crc
> checking of nodes. They are checked in order of proceeding.
>
> It doesn't complain about nodes' crc, so it seems
> that crc are ok. So I don't think that crc checking of chain of nodes
> helps here.

Can you please set JFFS2_DEBUG_LEVEL to 2 and log the output to a
serial console, bzip2 the result and upload it somewhere?

Beware: it's slow and huge!

Thanks,

	tglx
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-21 11:40 UTC
To: Thomas Gleixner
Cc: linux-mtd, David Woodhouse

Hello Thomas,

TG> Can you please set JFFS2_DEBUG_LEVEL to 2 and log the output to a serial
TG> console, bzip2 the result and upload it somewhere ?

TG> Beware: it's slow and huge!

I put it at ftp://ftp.tz.ru/pub/jffs2.log.bz2. Tell me if you need any
additional details.

Best regards,
Igor Marnat
mailto:marny@rambler.ru
* Re: jffs2_gcd_mtd0 invoked oom-killer
From: Igor Marnat @ 2007-03-28 5:15 UTC
To: Thomas Gleixner
Cc: linux-mtd, David Woodhouse

Hello guys!

Any news about the problem?

Best regards,
Igor Marnat
mailto:marny@rambler.ru