From: Oded Gabbay <ogabbay@advaoptical.com>
To: jaegeuk.kim@samsung.com
Cc: linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] Kernel BUG when writing to f2fs drive, PowerPC, SD card, USB3
Date: Thu, 20 Jun 2013 07:48:00 +0300
Message-ID: <51C28980.5080307@advaoptical.com>
In-Reply-To: <1371645828.2072.22.camel@kjgkr>
Hi,
After stress-testing F2FS on an SD card and on iNAND, including multiple
mount/umount cycles, I believe I can say the fix works on PowerPC.
Thanks for all the help.
Oded.
On 06/19/2013 03:43 PM, Jaegeuk Kim wrote:
> Hi,
> Could you test the following three patches sent right after this email?
>
> For f2fs-tools:
> 1. store crc as __le32
> 2. store checkpoint flags as __le32
>
> For f2fs:
> 1. handle crc as __le32
>
> I suspect that:
> 1. the mount failure can occur due to the crc endian error.
> 2. the update_sit_entry bug_on is caused by the endian problem in the
> checkpoint flags.
>
> If a wrong checkpoint flag is read at mount time, we cannot build the
> latest sit entries correctly.
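
A minimal userspace sketch of why the byte order matters for a field like
the crc or the checkpoint flags. The helper names put_le32, get_le32 and
put_be32, and the value 0x4, are illustrative assumptions and are not taken
from f2fs or f2fs-tools; kernel code would normally use cpu_to_le32() and
le32_to_cpu() for this. It only shows that a 32-bit value stored in
big-endian host order reads back as a different number when the reader
expects little-endian bytes:

```c
#include <stdint.h>

/* Store v at p as four little-endian bytes, whatever the host byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read four little-endian bytes at p back into a host integer. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store v the way a big-endian host lays out a raw 32-bit integer. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)((v >> 24) & 0xff);
	p[1] = (uint8_t)((v >> 16) & 0xff);
	p[2] = (uint8_t)((v >> 8) & 0xff);
	p[3] = (uint8_t)(v & 0xff);
}
```

With these helpers, a value written by put_le32() round-trips through
get_le32() on any host, while a value written by put_be32() comes back
byte-swapped, so a single flag bit ends up in the wrong position.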
>
> Thanks,
>
> 2013-06-18 (Tue), 15:10 +0300, Oded Gabbay:
>> Hi,
>>
>> I also printed the segment number, and it appears that the system
>> crashes every time segment 187, which is assigned to HOT data, is
>> written to.
>> BTW, even if I just do "echo oded > /mnt/file" and then wait for
>> about 30-45 seconds, the crash occurs.
>>
>> I got the following prints when I ran the echo command:
>>
>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 3584, blkaddr = 4096, segno = 0
>> REMOVE_ME update_sit_entry(202): offset = 1, segoff = 99329, blkaddr = 99841, segno = 187
>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 99328, blkaddr = 99840, segno = 187
>> ------------[ cut here ]------------
>> kernel BUG at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
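
The numbers in these prints can be cross-checked with a little arithmetic.
This is only a sketch: the function names are hypothetical, the constants
512 and 512 come from the blocks_per_seg prints and the mkfs.f2fs output
elsewhere in this thread, and the main-area start segment 7 is an assumption
inferred from blkaddr 4096 being reported as segno 0.

```c
#include <stdint.h>

/* Values read off the prints and the mkfs.f2fs output in this thread. */
#define BLOCKS_PER_SEG   512u  /* sbi->blocks_per_seg in the prints */
#define SEG0_BLKADDR     512u  /* "segment0 blkaddr: 512" from mkfs.f2fs */
/* Assumption: inferred from blkaddr 4096 being printed as segno 0. */
#define MAIN_START_SEGNO 7u

/* Block offset counted from segment 0 ("segoff" in the prints). */
static uint32_t segoff_from_seg0(uint32_t blkaddr)
{
	return blkaddr - SEG0_BLKADDR;
}

/* Offset of the block inside its segment; the mask is valid because
 * BLOCKS_PER_SEG is a power of two. */
static uint32_t offset_in_seg(uint32_t blkaddr)
{
	return segoff_from_seg0(blkaddr) & (BLOCKS_PER_SEG - 1);
}

/* Segment number counted from the start of the main area. */
static uint32_t main_segno(uint32_t blkaddr)
{
	return segoff_from_seg0(blkaddr) / BLOCKS_PER_SEG - MAIN_START_SEGNO;
}
```

For blkaddr 99840 this gives segoff 99328, offset 0 and segno 187, matching
the last print before the BUG, so the address arithmetic itself looks
consistent and the problem is more likely in the state of the bitmap for
that segment.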
>>
>> Here is the print from the debugfs. You can see segment 187 is
>> allocated to HOT data:
>>
>> [cj:~] # cat /sys/kernel/debug/f2fs/status
>>
>> =====[ partition info(loop0). #0 ]=====
>> [SB: 1] [CP: 2] [SIT: 2] [NAT: 2] [SSA: 1] [MAIN: 192(OverProv:55 Resv:48)]
>>
>> Utilization: 0% (4 valid blocks)
>> - Node: 2 (Inode: 2, Other: 0)
>> - Data: 2
>>
>> Main area: 192 segs, 192 secs 192 zones
>> - COLD data: 0, 0, 0
>> - WARM data: 1, 1, 1
>> - HOT data: 187, 187, 187
>> - Dir dnode: 190, 190, 190
>> - File dnode: 189, 189, 189
>> - Indir nodes: 188, 188, 188
>>
>> - Valid: 6
>> - Dirty: 0
>> - Prefree: 0
>> - Free: 186 (186)
>>
>> GC calls: 0 (BG: 0)
>> - data segments : 0
>> - node segments : 0
>> Try to move 0 blocks
>> - data blocks : 0
>> - node blocks : 0
>>
>> Extent Hit Ratio: 0 / 0
>>
>> Balancing F2FS Async:
>> - nodes 1 in 2
>> - dents 1 in dirs: 1
>> - meta 0 in 21
>> - NATs 2 > 29120
>> - SITs: 0
>> - free_nids: 2270
>>
>> Distribution of User Blocks: [ valid | invalid | free ]
>> [|---|-----------------------------------------------]
>>
>> SSR: 0 blocks in 0 segments
>> LFS: 0 blocks in 0 segments
>>
>> BDF: 100, avg. vblocks: 0
>>
>> Memory: 154 KB = static: 60 + cached: 94
>>
>>
>> On 06/18/2013 12:44 PM, Oded Gabbay wrote:
>>
>>> Hi,
>>>
>>> I would like to share additional information from a pr_err I put
>>> into the update_sit_entry function.
>>>
>>> The following is the printout from the terminal. This is only the
>>> end of the printout. It started at "segoff = 3584" and went up
>>> sequentially to 9215, then jumped to 99329 and after that to
>>> 99328, which is the entry that caused the crash.
>>> Each time I repeated the experiment I got exactly the same results.
>>>
>>> I put the pr_err after the line:
>>> offset = GET_SEGOFF_FROM_SEG0(sbi, blkaddr) & (sbi->blocks_per_seg - 1);
>>> where segoff = GET_SEGOFF_FROM_SEG0(sbi, blkaddr).
>>>
>>> Hope this helps.
>>>
>>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 3584, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 1, segoff = 3585, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 2, segoff = 3586, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 3, segoff = 3587, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 4, segoff = 3588, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 5, segoff = 3589, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 6, segoff = 3590, sbi->blocks_per_seg = 512
>>> :::
>>> REMOVE_ME update_sit_entry(202): offset = 502, segoff = 9206, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 503, segoff = 9207, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 504, segoff = 9208, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 505, segoff = 9209, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 506, segoff = 9210, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 507, segoff = 9211, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 508, segoff = 9212, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 509, segoff = 9213, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 510, segoff = 9214, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 511, segoff = 9215, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 1, segoff = 99329, sbi->blocks_per_seg = 512
>>> REMOVE_ME update_sit_entry(202): offset = 0, segoff = 99328, sbi->blocks_per_seg = 512
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>> PREEMPT SMP NR_CPUS=2 P2020 FSP150
>>> Modules linked in: mdio(O) hardware_version(PO) clipresent(PO) monotonic(O) restartcause(PO) panic_buffer(O)
>>> NIP: c026c038 LR: c026bf0c CTR: 00000000
>>> REGS: c5665a60 TRAP: 0700 Tainted: P O (3.9.6-dev_ogabbay-109564*)
>>> MSR: 00029000 <CE,EE,ME> CR: 24f82c48 XER: 20000000
>>> TASK = ef943e80[1774] 'flush-7:0' THREAD: c5664000 CPU: 1
>>> GPR00: 00000000 c5665b10 ef943e80 000000a6 00021000 00000000 f1e26054 725f7365
>>> GPR08: 00000000 c5a13e40 00000040 00000040 00000067 00000000 c5665c64 00000000
>>> GPR16: c1419240 00080000 00000000 00000000 000000bb ef8ce100 00000000 ef8ce134
>>> GPR24: ffffffff ffffff99 000000bb ef8ce100 00000080 f1e57760 ffffffff c561e800
>>> NIP [c026c038] update_sit_entry+0x234/0x23c
>>> LR [c026bf0c] update_sit_entry+0x108/0x23c
>>> Call Trace:
>>> [c5665b10] [c026bed4] update_sit_entry+0xd0/0x23c (unreliable)
>>> [c5665b40] [c026d1e8] do_write_page+0x198/0x660
>>> [c5665b80] [c026d840] write_data_page+0xa4/0xb8
>>> [c5665bc0] [c0265118] do_write_data_page+0x1e8/0x20c
>>> [c5665c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
>>> [c5665c40] [c0263ad8] __f2fs_writepage+0x24/0x80
>>> [c5665c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
>>> [c5665d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
>>> [c5665d30] [c00b1d3c] do_writepages+0x30/0x64
>>> [c5665d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
>>> [c5665d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
>>> [c5665dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
>>> [c5665e00] [c01054cc] wb_writeback+0x204/0x20c
>>> [c5665e50] [c0105844] wb_do_writeback+0x144/0x20c
>>> [c5665eb0] [c0105980] bdi_writeback_thread+0x74/0x144
>>> [c5665ee0] [c0059dc4] kthread+0xa8/0xac
>>> [c5665f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
>>> --- Exception: 0 at (null)
>>> LR = (null)
>>> Instruction dump:
>>> 4bffff80 813d0004 5780e8fe 7f9ce0f8 39400001 579c077e 7d6900ae 7d5ce030
>>> 7d6ae078 7d68e039 7d4901ae 4082ff44 <0fe00000> 0fe00000 9421ffe0 7c0802a6
>>> ---[ end trace 707fc0870875373e ]---
>>>
>>> Oded
>>> On 06/17/2013 05:21 PM, Oded Gabbay wrote:
>>>
>>>> Hi,
>>>>
>>>> I also suspect the endian conversion issue.
>>>> Attached is a gzipped file containing an image of a disk freshly
>>>> formatted with f2fs on the powerpc machine. I preferred to do it
>>>> this way so I would have a small file to send you.
>>>> I did the following to create it:
>>>>
>>>> [cj:~] # dd if=/dev/zero of=/tmp/test_file bs=4096 count=102400
>>>> 102400+0 records in
>>>> 102400+0 records out
>>>> 419430400 bytes (419 MB) copied, 1.05112 s, 399 MB/s
>>>> [cj:~] # losetup /dev/loop0 /tmp/test_file
>>>> [cj:~] # mkfs.f2fs -l label /dev/loop0
>>>>
>>>> F2FS-tools: mkfs.f2fs Ver: 1.1.0 (2013-06-13)
>>>>
>>>> Info: Label = label
>>>> Info: sector size = 512
>>>> Info: total sectors = 819200 (in 512bytes)
>>>> Info: zone aligned segment0 blkaddr: 512
>>>> Info: format successful
>>>> [cj:~] # losetup -d /dev/loop0
>>>> [cj:/tmp] # gzip test_file
>>>>
>>>> I then ran my test application and got the same kernel BUG
>>>> message.
>>>>
>>>> Oded
>>>>
>>>> On 06/17/2013 03:11 PM, Jaegeuk Kim wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thank you for the report. :)
>>>>>
>>>>> Can you send me the disk image right after formatting f2fs?
>>>>> As with your previous patch, I strongly suspect an endian conversion bug.
>>>>>
>>>>> Otherwise, I recommend testing with the latest tree from:
>>>>> http://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git
>>>>>
>>>>> Thanks,
>>>>>
>>>>> 2013-06-16 (Sun), 14:51 +0300, Oded Gabbay:
>>>>>> Hi,
>>>>>>
>>>>>> I'm working on a custom board with a PowerPC processor (Freescale
>>>>>> P2020).
>>>>>> On the board there is an SD card, which is connected to a USB3 chip
>>>>>> (from TI), which is connected to the PCI-e controller of the CPU.
>>>>>> I'm running with Linux kernel 3.9.6, with our custom rootFS.
>>>>>>
>>>>>> I formatted an SD card using the mkfs.f2fs utility (after fixing some
>>>>>> Big-endian issues - sent a patch a few days ago).
>>>>>> I then mounted the SD card, using "mount -o
>>>>>> noatime,nodiratime,rw,nosuid,nodev,relatime,active_logs=6,uhelper=udisks2,background_gc_off /dev/sda /mnt/sd1"
>>>>>> Then I started a small user-space test application which opens a file
>>>>>> under the mount point and starts calling "fwrite" on it.
>>>>>> After 2-3 seconds, the kernel hits a BUG and the system restarts.
>>>>>> When the system is back up and I try to re-mount the SD card, I get the
>>>>>> following error message:
>>>>>>
>>>>>> F2FS-fs (sda): Failed to get valid F2FS checkpoint
>>>>>> mount: you must specify the filesystem type
>>>>>>
>>>>>> The only way to recover is to re-format the card using mkfs.f2fs.
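
For anyone trying to reproduce this, a writer of the kind described could
look roughly like the sketch below. The actual test application is not
shown in this thread, so the function name write_chunks, the 4096-byte
buffer and the fill pattern are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write `count` chunks of `chunk` bytes to `path` using fwrite().
 * Returns the number of bytes written, or -1 on any error.
 */
static long write_chunks(const char *path, size_t chunk, long count)
{
	char buf[4096];
	long total = 0;
	FILE *fp;

	if (chunk == 0 || chunk > sizeof(buf))
		return -1;
	memset(buf, 0xA5, sizeof(buf));

	fp = fopen(path, "wb");
	if (!fp)
		return -1;

	for (long i = 0; i < count; i++) {
		if (fwrite(buf, 1, chunk, fp) != chunk) {
			fclose(fp);
			return -1;
		}
		total += (long)chunk;
	}

	/* fclose() flushes the stdio buffer and can itself report a failure. */
	if (fclose(fp) != 0)
		return -1;
	return total;
}
```

A call such as write_chunks("/mnt/sd1/testfile", 4096, 100000) would write
to the mount point used above (the file name is made up). Note that the
data only reaches the filesystem's write path when the page cache is
flushed, which fits the crash showing up in the flush thread rather than
in the writing process.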
>>>>>>
>>>>>> I took the f2fs patch that Jaegeuk Kim sent to Linus for 3.10
>>>>>> (https://lkml.org/lkml/2013/5/8/122) and applied it cleanly to 3.9.6.
>>>>>> I repeated the procedure but got the same result.
>>>>>>
>>>>>> The BUG comes from this line in segment.c:
>>>>>> if (!f2fs_clear_bit(offset, se->cur_valid_map))
>>>>>>         BUG();
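
The check above reads as "clear the valid bit for this block, and BUG if it
was not set in the first place". A minimal userspace sketch of that
test-and-clear pattern follows. The helper name test_and_clear_bit_msb is
hypothetical, and numbering bits most-significant-first within each byte is
an assumption about how the f2fs bitmap is laid out, not something shown in
this thread.

```c
#include <stdint.h>

/*
 * Clear bit `nr` in the bitmap and return 1 if it was set before, 0 if not.
 * Assumption: bit 0 is the most significant bit of byte 0.
 */
static int test_and_clear_bit_msb(unsigned int nr, uint8_t *map)
{
	uint8_t mask = (uint8_t)(1u << (7 - (nr & 7)));
	uint8_t *byte = map + (nr >> 3);
	int was_set = (*byte & mask) != 0;

	*byte &= (uint8_t)~mask;
	return was_set;
}
```

With 512 blocks per segment the map would be 64 bytes. A return value of 0
corresponds to the BUG() case: the caller wants to invalidate a block that
the in-memory bitmap never recorded as valid.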
>>>>>>
>>>>>> Additional information I can give:
>>>>>>
>>>>>> 1. I tried using F2FS in ArchLinux, kernel 3.9.5, on an x86 machine,
>>>>>> with the same SD card and the same USB3-to-PCIe chip and it worked
>>>>>> flawlessly there.
>>>>>> 2. Other filesystems on the SD card, such as Ext3, Ext4 and vfat,
>>>>>> work on our custom board, so this is not a H/W issue.
>>>>>>
>>>>>> Could you please help me pinpoint/debug the problem?
>>>>>>
>>>>>> Here is the complete kernel BUG print:
>>>>>>
>>>>>> kernel BUG at .../linux-3.9.6-adva/fs/f2fs/segment.c:214!
>>>>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>>>>> PREEMPT SMP NR_CPUS=2 P2020 FSP150
>>>>>> Modules linked in: mdio(O) hardware_version(PO) clipresent(PO) monotonic(O) restartcause(PO) panic_buffer(O)
>>>>>> NIP: c026a7e0 LR: c026a660 CTR: 00000000
>>>>>> REGS: ee761a60 TRAP: 0700 Tainted: P O (3.9.6-dev_ogabbay-109482*)
>>>>>> MSR: 00029000 <CE,EE,ME> CR: 24a52588 XER: 20000000
>>>>>> TASK = efb444c0[1755] 'flush-8:0' THREAD: ee760000 CPU: 1
>>>>>> GPR00: 00000000 ee761b10 efb444c0 0000004c 00000000 00000000 01dc4900 eb0fa700
>>>>>> GPR08: 00000000 eb24cb00 00000040 00000040 00000038 00000000 ee761c64 00000000
>>>>>> GPR16: c0aeea80 00080000 00000000 00000000 0000ed31 eb0fa700 00000000 eb0fa734
>>>>>> GPR24: eb0fa700 00000080 f2030620 ffffffff ffffffc8 0000ed31 ffffffff c55a1000
>>>>>> NIP [c026a7e0] update_sit_entry+0x240/0x248
>>>>>> LR [c026a660] update_sit_entry+0xc0/0x248
>>>>>> Call Trace:
>>>>>> [ee761b10] [c55a1000] 0xc55a1000 (unreliable)
>>>>>> [ee761b40] [c026d1f4] do_write_page+0x198/0x660
>>>>>> [ee761b80] [c026d84c] write_data_page+0xa4/0xb8
>>>>>> [ee761bc0] [c0265118] do_write_data_page+0x1e8/0x20c
>>>>>> [ee761c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
>>>>>> [ee761c40] [c0263ad8] __f2fs_writepage+0x24/0x80
>>>>>> [ee761c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
>>>>>> [ee761d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
>>>>>> [ee761d30] [c00b1d3c] do_writepages+0x30/0x64
>>>>>> [ee761d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
>>>>>> [ee761d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
>>>>>> [ee761dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
>>>>>> [ee761e00] [c01054cc] wb_writeback+0x204/0x20c
>>>>>> [ee761e50] [c0105844] wb_do_writeback+0x144/0x20c
>>>>>> [ee761eb0] [c0105980] bdi_writeback_thread+0x74/0x144
>>>>>> [ee761ee0] [c0059dc4] kthread+0xa8/0xac
>>>>>> [ee761f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
>>>>>> Instruction dump:
>>>>>> 4bffff2c 813a0004 5720e8fe 7f39c8f8 39400001 5739077e 7d6900ae 7d59c830
>>>>>> 7d6ac878 7d68c839 7d4901ae 4082fef0 <0fe00000> 0fe00000 9421ffe0 7c0802a6
>>>>>>