From: Oded Gabbay <ogabbay@advaoptical.com>
To: jaegeuk.kim@samsung.com
Cc: linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] Kernel BUG when writing to f2fs drive, PowerPC, SD card, USB3
Date: Wed, 19 Jun 2013 16:07:03 +0300
Message-ID: <51C1ACF7.3030901@advaoptical.com>
In-Reply-To: <1371645828.2072.22.camel@kjgkr>
Hi,
I tried your patches and they seem to work :)
I fixed the umount flag yesterday, but the CRC was the key point. Thanks.
I will now stress-test the SD card on my board and let you know the results.
Best regards,
Oded
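
The CRC endian problem discussed in this thread can be illustrated with a minimal sketch. It assumes only what the quoted message states, namely that the CRC is meant to be stored on disk as a little-endian 32-bit value (__le32). The helper names and the CRC value are hypothetical and are not taken from f2fs or f2fs-tools; the kernel itself would use its own byte-order helpers such as cpu_to_le32() and le32_to_cpu().

```c
#include <assert.h>
#include <stdint.h>

/* Store v as four little-endian bytes, independent of host byte order. */
void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read four little-endian bytes back into a host-order value. */
uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* What a big-endian CPU sees if it loads the same bytes without converting. */
uint32_t load_as_big_endian(const uint8_t *src)
{
	return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
	       ((uint32_t)src[2] << 8) | (uint32_t)src[3];
}

/* Hypothetical check: the stored CRC matches only when read as little-endian. */
void crc_endian_demo(void)
{
	uint8_t on_disk[4];
	const uint32_t crc = 0x12345678u; /* made-up value, not from the thread */

	put_le32(on_disk, crc);
	assert(get_le32(on_disk) == crc);
	assert(load_as_big_endian(on_disk) == 0x78563412u);
}
```

On a little-endian machine such as x86 a native load and get_le32() agree, which would be consistent with the same card working there; on a big-endian PowerPC an unconverted load yields the byte-swapped value, so a checksum comparison fails.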
On 06/19/2013 03:43 PM, Jaegeuk Kim wrote:
> Hi,
> Could you test the following three patches sent right after this email?
>
> For f2fs-tools:
> 1. store crc as __le32
> 2. store checkpoint flags as __le32
>
> For f2fs:
> 1. handle crc as __le32
>
> I suspect that:
> 1. the mount failure can occur due to the crc endian error.
> 2. the update_sit_entry bug_on is caused by the endian problem on the
> checkpoint flags.
>
> If a wrong checkpoint flag is read at mount time, we cannot build the
> latest sit entries correctly.
>
> Thanks,
>
> 2013-06-18 (Tue), 15:10 +0300, Oded Gabbay:
>> Hi,
>>
>> I also printed the segment number, and it appears that the system
>> crashes every time segment 187, which is assigned to HOT data, is
>> written to.
>> BTW, even if I just do "echo oded > /mnt/file" and then just wait for
>> about 30-45 seconds, the crash occurs.
>>
>> I got the following prints when I did the echo thing:
>>
>> REMOVE_ME update_sit_entry(202):
>> offset = 0, segoff = 3584, blkaddr = 4096, segno = 0
>> REMOVE_ME update_sit_entry(202):
>> offset = 1, segoff = 99329, blkaddr = 99841, segno = 187
>> REMOVE_ME update_sit_entry(202):
>> offset = 0, segoff = 99328, blkaddr = 99840, segno = 187
>> ------------[ cut here ]------------
>> kernel BUG
>> at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
>>
>> Here is the print from the debugfs. You can see segment 187 is
>> allocated to HOT data:
>>
>> [cj:~] # cat /sys/kernel/debug/f2fs/status
>>
>> =====[ partition info(loop0). #0 ]=====
>> [SB: 1] [CP: 2] [SIT: 2] [NAT: 2] [SSA: 1] [MAIN: 192(OverProv:55
>> Resv:48)]
>>
>> Utilization: 0% (4 valid blocks)
>> - Node: 2 (Inode: 2, Other: 0)
>> - Data: 2
>>
>> Main area: 192 segs, 192 secs 192 zones
>> - COLD data: 0, 0, 0
>> - WARM data: 1, 1, 1
>> - HOT data: 187, 187, 187
>> - Dir dnode: 190, 190, 190
>> - File dnode: 189, 189, 189
>> - Indir nodes: 188, 188, 188
>>
>> - Valid: 6
>> - Dirty: 0
>> - Prefree: 0
>> - Free: 186 (186)
>>
>> GC calls: 0 (BG: 0)
>> - data segments : 0
>> - node segments : 0
>> Try to move 0 blocks
>> - data blocks : 0
>> - node blocks : 0
>>
>> Extent Hit Ratio: 0 / 0
>>
>> Balancing F2FS Async:
>> - nodes 1 in 2
>> - dents 1 in dirs: 1
>> - meta 0 in 21
>> - NATs 2 > 29120
>> - SITs: 0
>> - free_nids: 2270
>>
>> Distribution of User Blocks: [ valid | invalid | free ]
>> [|---|-----------------------------------------------]
>>
>> SSR: 0 blocks in 0 segments
>> LFS: 0 blocks in 0 segments
>>
>> BDF: 100, avg. vblocks: 0
>>
>> Memory: 154 KB = static: 60 + cached: 94
>>
>>
>> On 06/18/2013 12:44 PM, Oded Gabbay wrote:
>>
>>> Hi,
>>>
>>> I would like to share additional information I have from pr_err I
>>> put into the update_sit_entry function.
>>>
>>> The following is the printout from the terminal. This is only the
>>> end of the printout. It started with "segoff = 3584" and went
>>> sequentially up to 9215, then jumped to 99329 and after that to
>>> 99328, which is the entry that caused the crash.
>>> Each time I repeated the experiment I got exactly the same results.
>>>
>>> I put the pr_err after the line: "offset = GET_SEGOFF_FROM_SEG0(sbi,
>>> blkaddr) & (sbi->blocks_per_seg - 1);"
>>> where segoff = GET_SEGOFF_FROM_SEG0(sbi, blkaddr)
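
The arithmetic behind these prints can be sketched as follows. This is a minimal illustration, not the kernel's macros: the struct and function names are hypothetical. The segment-0 block address (512) comes from the mkfs.f2fs output in this thread and blocks_per_seg (512) from the pr_err output. The main-area start segment (7) is an assumption inferred from the prints, where segoff = 3584 (3584 / 512 = 7) is reported as segno = 0.

```c
#include <assert.h>
#include <stdint.h>

struct seg_layout {
	uint32_t seg0_blkaddr;     /* 512, per the mkfs.f2fs output */
	uint32_t blocks_per_seg;   /* 512, per the pr_err output */
	uint32_t main_start_segno; /* 7: inferred from the prints, an assumption */
};

/* Block offset counted from the start of segment 0. */
uint32_t segoff_from_seg0(const struct seg_layout *l, uint32_t blkaddr)
{
	assert(blkaddr >= l->seg0_blkaddr);
	return blkaddr - l->seg0_blkaddr;
}

/* Block offset inside its segment; the mask needs a power-of-two size. */
uint32_t offset_in_segment(const struct seg_layout *l, uint32_t blkaddr)
{
	assert((l->blocks_per_seg & (l->blocks_per_seg - 1)) == 0);
	return segoff_from_seg0(l, blkaddr) & (l->blocks_per_seg - 1);
}

/* Segment number relative to the start of the main area. */
uint32_t main_area_segno(const struct seg_layout *l, uint32_t blkaddr)
{
	return segoff_from_seg0(l, blkaddr) / l->blocks_per_seg -
	       l->main_start_segno;
}
```

With these values, blkaddr = 99840 gives segoff = 99328, offset = 0 and segno = 187, and blkaddr = 4096 gives segoff = 3584, offset = 0 and segno = 0, matching the reported prints.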
>>>
>>> Hope this helps.
>>>
>>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 3584,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 1, segoff = 3585,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 2, segoff = 3586,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 3, segoff = 3587,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 4, segoff = 3588,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 5, segoff = 3589,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 6, segoff = 3590,
>>> sbi->blocks_per_seg = 512
>>> :::
>>> REMOVE_ME update_sit_entry(202): offset = 502, segoff = 9206,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 503, segoff = 9207,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 504, segoff = 9208,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 505, segoff = 9209,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 506, segoff = 9210,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 507, segoff = 9211,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 508, segoff = 9212,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 509, segoff = 9213,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 510, segoff = 9214,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 511, segoff = 9215,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 1, segoff = 99329,
>>> sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 99328,
>>> sbi->blocks_per_seg = 512
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG
>>> at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>> PREEMPT SMP NR_CPUS=2 P2020 FSP150
>>> Modules linked in: mdio(O) hardware_version(PO) clipresent(PO)
>>> monotonic(O) restartcause(PO) panic_buffer(O)
>>> NIP: c026c038 LR: c026bf0c CTR: 00000000
>>> REGS: c5665a60 TRAP: 0700 Tainted: P O
>>> (3.9.6-dev_ogabbay-109564*)
>>> MSR: 00029000 <CE,EE,ME> CR: 24f82c48 XER: 20000000
>>> TASK = ef943e80[1774] 'flush-7:0' THREAD: c5664000 CPU: 1
>>> GPR00: 00000000 c5665b10 ef943e80 000000a6 00021000 00000000
>>> f1e26054 725f7365
>>> GPR08: 00000000 c5a13e40 00000040 00000040 00000067 00000000
>>> c5665c64 00000000
>>> GPR16: c1419240 00080000 00000000 00000000 000000bb ef8ce100
>>> 00000000 ef8ce134
>>> GPR24: ffffffff ffffff99 000000bb ef8ce100 00000080 f1e57760
>>> ffffffff c561e800
>>> NIP [c026c038] update_sit_entry+0x234/0x23c
>>> LR [c026bf0c] update_sit_entry+0x108/0x23c
>>> Call Trace:
>>> [c5665b10] [c026bed4] update_sit_entry+0xd0/0x23c (unreliable)
>>> [c5665b40] [c026d1e8] do_write_page+0x198/0x660
>>> [c5665b80] [c026d840] write_data_page+0xa4/0xb8
>>> [c5665bc0] [c0265118] do_write_data_page+0x1e8/0x20c
>>> [c5665c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
>>> [c5665c40] [c0263ad8] __f2fs_writepage+0x24/0x80
>>> [c5665c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
>>> [c5665d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
>>> [c5665d30] [c00b1d3c] do_writepages+0x30/0x64
>>> [c5665d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
>>> [c5665d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
>>> [c5665dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
>>> [c5665e00] [c01054cc] wb_writeback+0x204/0x20c
>>> [c5665e50] [c0105844] wb_do_writeback+0x144/0x20c
>>> [c5665eb0] [c0105980] bdi_writeback_thread+0x74/0x144
>>> [c5665ee0] [c0059dc4] kthread+0xa8/0xac
>>> [c5665f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
>>> --- Exception: 0 at (null)
>>> LR = (null)
>>> Instruction dump:
>>> 4bffff80 813d0004 5780e8fe 7f9ce0f8 39400001 579c077e 7d6900ae
>>> 7d5ce030
>>> 7d6ae078 7d68e039 7d4901ae 4082ff44 <0fe00000> 0fe00000 9421ffe0
>>> 7c0802a6
>>> ---[ end trace 707fc0870875373e ]---
>>>
>>> Oded
>>> On 06/17/2013 05:21 PM, Oded Gabbay wrote:
>>>
>>>> Hi,
>>>>
>>>> I also suspect the endian conversion issue.
>>>> Attached is a gzip-ed file, which is an image of a disk freshly
>>>> formatted as f2fs on a PowerPC machine. I preferred to do it
>>>> this way so I could have a small file to send you.
>>>> I did the following to create it:
>>>>
>>>> [cj:~] # dd if=/dev/zero of=/tmp/test_file bs=4096 count=102400
>>>> 102400+0 records in
>>>> 102400+0 records out
>>>> 419430400 bytes (419 MB) copied, 1.05112 s, 399 MB/s
>>>> [cj:~] # losetup /dev/loop0 /tmp/test_file
>>>> [cj:~] # mkfs.f2fs -l label /dev/loop0
>>>>
>>>> F2FS-tools: mkfs.f2fs Ver: 1.1.0 (2013-06-13)
>>>>
>>>> Info: Label = label
>>>> Info: sector size = 512
>>>> Info: total sectors = 819200 (in 512bytes)
>>>> Info: zone aligned segment0 blkaddr: 512
>>>> Info: format successful
>>>> [cj:~] # losetup -d /dev/loop0
>>>> [cj:/tmp] # gzip test_file
>>>>
>>>> I then ran my test application and got the same kernel BUG
>>>> message.
>>>>
>>>> Oded
>>>>
>>>> On 06/17/2013 03:11 PM, Jaegeuk Kim wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thank you for the report. :)
>>>>>
>>>>> Can you send me the disk image right after formatting f2fs?
>>>>> As with your previous patch, I strongly suspect an endian conversion bug.
>>>>>
>>>>> Otherwise, I recommend testing with the latest tree from:
>>>>> http://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git
>>>>>
>>>>> Thanks,
>>>>>
>>>>> 2013-06-16 (Sun), 14:51 +0300, Oded Gabbay:
>>>>>> Hi,
>>>>>>
>>>>>> I'm working on a custom board with a PowerPC processor (Freescale
>>>>>> P2020).
>>>>>> On the board there is an SD card, which is connected to a USB3 chip
>>>>>> (from TI), which is connected to the PCI-e controller of the CPU.
>>>>>> I'm running with Linux kernel 3.9.6, with our custom rootFS.
>>>>>>
>>>>>> I formatted an SD card using the mkfs.f2fs utility (after fixing some
>>>>>> Big-endian issues - sent a patch a few days ago).
>>>>>> I then mounted the SD card, using "mount -o
>>>>>> noatime,nodiratime,rw,nosuid,nodev,relatime,active_logs=6,uhelper=udisks2,background_gc_off /dev/sda /mnt/sd1"
>>>>>> Then, I started a small user-space test application which opens a file
>>>>>> on the mount folder and starts to do "fwrite" into the file.
>>>>>> After 2-3 seconds, the kernel gives me a BUG and the system restarts.
>>>>>> When the system is up and I try to re-mount the SD card, I get the
>>>>>> following error message:
>>>>>>
>>>>>> F2FS-fs (sda): Failed to get valid F2FS checkpoint
>>>>>> mount: you must specify the filesystem type
>>>>>>
>>>>>> The only way to recover is to re-format the card using mkfs.f2fs.
>>>>>>
>>>>>> I took the f2fs patch that Jaegeuk Kim sent to Linus for 3.10 (here -
>>>>>> https://lkml.org/lkml/2013/5/8/122) and applied it cleanly to 3.9.6
>>>>>> I repeated the procedure but got the same result.
>>>>>>
>>>>>> The BUG is from this line, from segment.c:
>>>>>> if (!f2fs_clear_bit(offset, se->cur_valid_map))
>>>>>> BUG();
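
The check quoted above fires when a bit that is expected to be set in the valid-block bitmap turns out to be already clear. A minimal sketch of that test-and-clear pattern follows. The function names are hypothetical, the MSB-first bit order within each byte is an assumption made for illustration, and the sketch returns an error code where the reported kernel code calls BUG().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear bit nr and return its previous value (non-zero if it was set). */
int bitmap_test_and_clear(unsigned int nr, uint8_t *map)
{
	uint8_t mask = (uint8_t)(1u << (7 - (nr & 7))); /* MSB-first: assumption */
	int was_set;

	assert(map != NULL);
	was_set = map[nr >> 3] & mask;
	map[nr >> 3] &= (uint8_t)~mask;
	return was_set;
}

/* Set bit nr and return its previous value (non-zero if it was set). */
int bitmap_test_and_set(unsigned int nr, uint8_t *map)
{
	uint8_t mask = (uint8_t)(1u << (7 - (nr & 7)));
	int was_set;

	assert(map != NULL);
	was_set = map[nr >> 3] & mask;
	map[nr >> 3] |= mask;
	return was_set;
}

/* Returns 0 on success, -1 where the code in the report would BUG(). */
int invalidate_block(unsigned int offset, uint8_t *valid_map)
{
	if (!bitmap_test_and_clear(offset, valid_map))
		return -1; /* block was never recorded as valid */
	return 0;
}
```

In other words, the failure means a block is being invalidated that the in-memory segment information does not record as valid, which suggests the on-disk metadata was interpreted inconsistently rather than that the write path itself is faulty.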
>>>>>>
>>>>>> Additional information I can give:
>>>>>>
>>>>>> 1. I tried using F2FS in ArchLinux, kernel 3.9.5, on an x86 machine,
>>>>>> with the same SD card and the same USB3-to-PCIe chip and it worked
>>>>>> flawlessly there.
>>>>>> 2. I can work with other FS on the SD card on our custom board, such
>>>>>> as Ext3, Ext4 and vfat, so this is not a H/W issue.
>>>>>>
>>>>>> Could you please help me pinpoint/debug the problem?
>>>>>>
>>>>>> Here is the complete kernel BUG print:
>>>>>>
>>>>>> kernel BUG at .../linux-3.9.6-adva/fs/f2fs/segment.c:214!
>>>>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>>>>> PREEMPT SMP NR_CPUS=2 P2020 FSP150
>>>>>> Modules linked in: mdio(O) hardware_version(PO) clipresent(PO)
>>>>>> monotonic(O) restartcause(PO) panic_buffer(O)
>>>>>> NIP: c026a7e0 LR: c026a660 CTR: 00000000
>>>>>> REGS: ee761a60 TRAP: 0700 Tainted: P O
>>>>>> (3.9.6-dev_ogabbay-109482*)
>>>>>> MSR: 00029000 <CE,EE,ME> CR: 24a52588 XER: 20000000
>>>>>> TASK = efb444c0[1755] 'flush-8:0' THREAD: ee760000 CPU: 1
>>>>>> GPR00: 00000000 ee761b10 efb444c0 0000004c 00000000 00000000 01dc4900
>>>>>> eb0fa700
>>>>>> GPR08: 00000000 eb24cb00 00000040 00000040 00000038 00000000 ee761c64
>>>>>> 00000000
>>>>>> GPR16: c0aeea80 00080000 00000000 00000000 0000ed31 eb0fa700 00000000
>>>>>> eb0fa734
>>>>>> GPR24: eb0fa700 00000080 f2030620 ffffffff ffffffc8 0000ed31 ffffffff
>>>>>> c55a1000
>>>>>> NIP [c026a7e0] update_sit_entry+0x240/0x248
>>>>>> LR [c026a660] update_sit_entry+0xc0/0x248
>>>>>> Call Trace:
>>>>>> [ee761b10] [c55a1000] 0xc55a1000 (unreliable)
>>>>>> [ee761b40] [c026d1f4] do_write_page+0x198/0x660
>>>>>> [ee761b80] [c026d84c] write_data_page+0xa4/0xb8
>>>>>> [ee761bc0] [c0265118] do_write_data_page+0x1e8/0x20c
>>>>>> [ee761c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
>>>>>> [ee761c40] [c0263ad8] __f2fs_writepage+0x24/0x80
>>>>>> [ee761c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
>>>>>> [ee761d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
>>>>>> [ee761d30] [c00b1d3c] do_writepages+0x30/0x64
>>>>>> [ee761d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
>>>>>> [ee761d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
>>>>>> [ee761dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
>>>>>> [ee761e00] [c01054cc] wb_writeback+0x204/0x20c
>>>>>> [ee761e50] [c0105844] wb_do_writeback+0x144/0x20c
>>>>>> [ee761eb0] [c0105980] bdi_writeback_thread+0x74/0x144
>>>>>> [ee761ee0] [c0059dc4] kthread+0xa8/0xac
>>>>>> [ee761f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
>>>>>> Instruction dump:
>>>>>> 4bffff2c 813a0004 5720e8fe 7f39c8f8 39400001 5739077e 7d6900ae
>>>>>> 7d59c830
>>>>>> 7d6ac878 7d68c839 7d4901ae 4082fef0 <0fe00000> 0fe00000 9421ffe0
>>>>>> 7c0802a6
>>>>>>
Thread overview (13+ messages):
2013-06-16 11:51 [f2fs-dev] Kernel BUG when writing to f2fs drive, PowerPC, SD card, USB3 Oded Gabbay
2013-06-17 7:20 ` Huajun Li
2013-06-17 12:38 ` Oded Gabbay
2013-06-17 12:11 ` Jaegeuk Kim
2013-06-17 14:21 ` [f2fs-dev] [Virus Scan Error!] " Oded Gabbay
2013-06-18 9:44 ` [f2fs-dev] " Oded Gabbay
2013-06-18 12:10 ` Oded Gabbay
2013-06-19 12:43 ` Jaegeuk Kim
2013-06-19 12:45 ` [f2fs-dev] [PATCH 1/2] lib, mkfs: fix endian conversion for crc calculation Jaegeuk Kim
2013-06-19 12:45 ` [f2fs-dev] [PATCH 2/2] mkfs: fix to store __le32 for checkpoint flags Jaegeuk Kim
2013-06-19 12:46 ` [f2fs-dev] [PATCH] f2fs: fix crc endian conversion Jaegeuk Kim
2013-06-19 13:07 ` Oded Gabbay [this message]
2013-06-20 4:48 ` [f2fs-dev] Kernel BUG when writing to f2fs drive, PowerPC, SD card, USB3 Oded Gabbay