Hi,

I also printed the segment number, and it appears that the system crashes every time segment 187, which is assigned to HOT data, is written to.
BTW, even if I just do "echo oded > /mnt/file" and then wait for about 30-45 seconds, the crash occurs.

I got the following prints when I ran the echo command:

REMOVE_ME update_sit_entry(202):
offset = 0, segoff = 3584, blkaddr = 4096, segno = 0
REMOVE_ME update_sit_entry(202):
offset = 1, segoff = 99329, blkaddr = 99841, segno = 187
REMOVE_ME update_sit_entry(202):
offset = 0, segoff = 99328, blkaddr = 99840, segno = 187
------------[ cut here ]------------
kernel BUG at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
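
The numbers in these prints are consistent with each other, which can be checked by hand. The following is a minimal C sketch of that arithmetic, not the kernel's code. The helper names are hypothetical, segment 0 is assumed to start at block 512 (the value mkfs.f2fs reports in the quoted mail below), and the main area is assumed to start at block 4096 (the address the prints map to segno 0).

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the mkfs output and the prints in this thread. */
#define SEG0_BLKADDR   512u   /* "zone aligned segment0 blkaddr: 512" */
#define MAIN_BLKADDR   4096u  /* assumption: block the prints map to segno 0 */
#define BLOCKS_PER_SEG 512u   /* sbi->blocks_per_seg in the prints */

/* Hypothetical helper: block offset counted from the start of segment 0. */
static uint32_t segoff_from_seg0(uint32_t blkaddr)
{
	return blkaddr - SEG0_BLKADDR;
}

/* Hypothetical helper: segment number relative to the main area. */
static uint32_t main_segno(uint32_t blkaddr)
{
	/* The mask trick below only works for a power-of-two segment size. */
	assert((BLOCKS_PER_SEG & (BLOCKS_PER_SEG - 1)) == 0);
	assert(blkaddr >= MAIN_BLKADDR);
	return segoff_from_seg0(blkaddr) / BLOCKS_PER_SEG
	     - segoff_from_seg0(MAIN_BLKADDR) / BLOCKS_PER_SEG;
}

/* Hypothetical helper: index of the block inside its segment. */
static uint32_t offset_in_seg(uint32_t blkaddr)
{
	return segoff_from_seg0(blkaddr) & (BLOCKS_PER_SEG - 1);
}
```

Under these assumptions, blkaddr 99841 gives segoff 99329, segno 187 and offset 1, and blkaddr 99840 gives segoff 99328, segno 187 and offset 0, matching the last two prints above.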

Here is the output from debugfs. You can see that segment 187 is allocated to HOT data:

[cj:~] # cat /sys/kernel/debug/f2fs/status

=====[ partition info(loop0). #0 ]=====
[SB: 1] [CP: 2] [SIT: 2] [NAT: 2] [SSA: 1] [MAIN: 192(OverProv:55 Resv:48)]

Utilization: 0% (4 valid blocks)
  - Node: 2 (Inode: 2, Other: 0)
  - Data: 2

Main area: 192 segs, 192 secs 192 zones
  - COLD  data: 0, 0, 0
  - WARM  data: 1, 1, 1
  - HOT   data: 187, 187, 187
  - Dir   dnode: 190, 190, 190
  - File   dnode: 189, 189, 189
  - Indir nodes: 188, 188, 188

  - Valid: 6
  - Dirty: 0
  - Prefree: 0
  - Free: 186 (186)

GC calls: 0 (BG: 0)
  - data segments : 0
  - node segments : 0
Try to move 0 blocks
  - data blocks : 0
  - node blocks : 0

Extent Hit Ratio: 0 / 0

Balancing F2FS Async:
  - nodes    1 in    2
  - dents    1 in dirs:   1
  - meta    0 in   21
  - NATs     2 > 29120
  - SITs:     0
  - free_nids:  2270

Distribution of User Blocks: [ valid | invalid | free ]
  [|---|-----------------------------------------------]

SSR: 0 blocks in 0 segments
LFS: 0 blocks in 0 segments

BDF: 100, avg. vblocks: 0

Memory: 154 KB = static: 60 + cached: 94


On 06/18/2013 12:44 PM, Oded Gabbay wrote:
Hi,

I would like to share additional information from a pr_err I added to the update_sit_entry function.

The following is the printout from the terminal. This is only the end of the printout. It started at "segoff = 3584" and went up sequentially to 9215, then jumped to 99329, followed by 99328, which is the entry that caused the crash.
I got exactly the same results each time I repeated the experiment.

I put the pr_err after the line: "offset = GET_SEGOFF_FROM_SEG0(sbi, blkaddr) & (sbi->blocks_per_seg - 1);"
where segoff = GET_SEGOFF_FROM_SEG0(sbi, blkaddr)

Hope this helps.

REMOVE_ME update_sit_entry(202): offset = 0, segoff = 3584, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 1, segoff = 3585, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 2, segoff = 3586, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 3, segoff = 3587, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 4, segoff = 3588, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 5, segoff = 3589, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 6, segoff = 3590, sbi->blocks_per_seg = 512
:::
REMOVE_ME update_sit_entry(202): offset = 502, segoff = 9206, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 503, segoff = 9207, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 504, segoff = 9208, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 505, segoff = 9209, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 506, segoff = 9210, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 507, segoff = 9211, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 508, segoff = 9212, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 509, segoff = 9213, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 510, segoff = 9214, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 511, segoff = 9215, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 1, segoff = 99329, sbi->blocks_per_seg = 512
REMOVE_ME update_sit_entry(202): offset = 0, segoff = 99328, sbi->blocks_per_seg = 512

------------[ cut here ]------------
kernel BUG at /home/ogabbay/views/r5.xcj/software/os-software/powerpc/usr/src/linux-3.9.6-adva/fs/f2fs/segment.c:217!
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT SMP NR_CPUS=2 P2020 FSP150
Modules linked in: mdio(O) hardware_version(PO) clipresent(PO) monotonic(O) restartcause(PO) panic_buffer(O)
NIP: c026c038 LR: c026bf0c CTR: 00000000
REGS: c5665a60 TRAP: 0700   Tainted: P           O  (3.9.6-dev_ogabbay-109564*)
MSR: 00029000 <CE,EE,ME>  CR: 24f82c48  XER: 20000000
TASK = ef943e80[1774] 'flush-7:0' THREAD: c5664000 CPU: 1
GPR00: 00000000 c5665b10 ef943e80 000000a6 00021000 00000000 f1e26054 725f7365
GPR08: 00000000 c5a13e40 00000040 00000040 00000067 00000000 c5665c64 00000000
GPR16: c1419240 00080000 00000000 00000000 000000bb ef8ce100 00000000 ef8ce134
GPR24: ffffffff ffffff99 000000bb ef8ce100 00000080 f1e57760 ffffffff c561e800
NIP [c026c038] update_sit_entry+0x234/0x23c
LR [c026bf0c] update_sit_entry+0x108/0x23c
Call Trace:
[c5665b10] [c026bed4] update_sit_entry+0xd0/0x23c (unreliable)
[c5665b40] [c026d1e8] do_write_page+0x198/0x660
[c5665b80] [c026d840] write_data_page+0xa4/0xb8
[c5665bc0] [c0265118] do_write_data_page+0x1e8/0x20c
[c5665c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
[c5665c40] [c0263ad8] __f2fs_writepage+0x24/0x80
[c5665c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
[c5665d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
[c5665d30] [c00b1d3c] do_writepages+0x30/0x64
[c5665d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
[c5665d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
[c5665dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
[c5665e00] [c01054cc] wb_writeback+0x204/0x20c
[c5665e50] [c0105844] wb_do_writeback+0x144/0x20c
[c5665eb0] [c0105980] bdi_writeback_thread+0x74/0x144
[c5665ee0] [c0059dc4] kthread+0xa8/0xac
[c5665f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
--- Exception: 0 at   (null)
    LR =   (null)
Instruction dump:
4bffff80 813d0004 5780e8fe 7f9ce0f8 39400001 579c077e 7d6900ae 7d5ce030
7d6ae078 7d68e039 7d4901ae 4082ff44 <0fe00000> 0fe00000 9421ffe0 7c0802a6
---[ end trace 707fc0870875373e ]---

Oded
On 06/17/2013 05:21 PM, Oded Gabbay wrote:
Hi,

I also suspect the endian conversion issue.
Attached is a gzipped file containing an image of a disk freshly formatted with f2fs on the PowerPC machine. I did it this way so that I would have a small file to send you.
I did the following to create it:

[cj:~] # dd if=/dev/zero of=/tmp/test_file bs=4096 count=102400
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 1.05112 s, 399 MB/s
[cj:~] # losetup /dev/loop0 /tmp/test_file
[cj:~] # mkfs.f2fs -l label /dev/loop0

        F2FS-tools: mkfs.f2fs Ver: 1.1.0 (2013-06-13)

Info: Label = label
Info: sector size = 512
Info: total sectors = 819200 (in 512bytes)
Info: zone aligned segment0 blkaddr: 512
Info: format successful
[cj:~] # losetup -d /dev/loop0
[cj:/tmp] # gzip test_file

I then ran my test application and got the same kernel BUG message.
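
Since an endian conversion problem is the suspect, here is a small C sketch of what such a bug looks like in general. It assumes a 32-bit on-disk field stored in little-endian byte order (F2FS keeps its on-disk structures little-endian). The helper names are hypothetical and are not taken from f2fs-tools or the kernel.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian field byte by byte.
 * This gives the same result on little- and big-endian hosts. */
static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0]
	     | (uint32_t)p[1] << 8
	     | (uint32_t)p[2] << 16
	     | (uint32_t)p[3] << 24;
}

/* Value a big-endian host ends up with if it loads the same four
 * bytes as a native integer and skips the conversion. */
static uint32_t get_be32(const unsigned char *p)
{
	return (uint32_t)p[3]
	     | (uint32_t)p[2] << 8
	     | (uint32_t)p[1] << 16
	     | (uint32_t)p[0] << 24;
}
```

For example, the value 512 is stored as the bytes 00 02 00 00. get_le32() returns 512 for them, while get_be32() returns 131072, which is the kind of wrong address or count that a missing conversion would produce on PowerPC.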

Oded

On 06/17/2013 03:11 PM, Jaegeuk Kim wrote:
Hi,

Thank you for the report. :)

Can you send me the disk image right after formatting f2fs?
As with your previous patch, I strongly suspect an endian conversion bug.

Otherwise, I recommend that you test with the latest tree from:
http://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git

Thanks,

2013-06-16 (Sun), 14:51 +0300, Oded Gabbay:
Hi,

I'm working on a custom board with a PowerPC processor (Freescale
P2020).
On the board there is an SD card, which is connected to a USB3 chip
(from TI), which is connected to the PCI-e controller of the CPU.
I'm running with Linux kernel 3.9.6, with our custom rootFS.

I formatted an SD card using the mkfs.f2fs utility (after fixing some
big-endian issues; I sent a patch for those a few days ago).
I then mounted the SD card, using "mount -o
noatime,nodiratime,rw,nosuid,nodev,relatime,active_logs=6,uhelper=udisks2,background_gc_off /dev/sda /mnt/sd1"
Then, I started a small user-space test application which opens a file
under the mount point and starts to "fwrite" into the file.
After 2-3 seconds, the kernel hits a BUG and the system restarts.
When the system is up and I try to re-mount the SD card, I get the
following error message:

F2FS-fs (sda): Failed to get valid F2FS checkpoint
mount: you must specify the filesystem type

The only way to recover is to reformat the card using mkfs.f2fs.

I took the f2fs patch that Jaegeuk Kim sent to Linus for 3.10 (here -
https://lkml.org/lkml/2013/5/8/122) and applied it cleanly to 3.9.6
I repeated the procedure but got the same result.

The BUG comes from these lines in segment.c:
        if (!f2fs_clear_bit(offset, se->cur_valid_map))
            BUG();
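
For context, as I read it this check fires when the bit that is about to be cleared was not set in the first place, i.e. a block is being invalidated that the segment's validity bitmap does not consider valid. Below is a minimal user-space C model of that "clear the bit and return its old value" pattern. It is a sketch, not the f2fs implementation: the function names are hypothetical, and the bit order within a byte (most significant bit first) is an assumption.

```c
#include <assert.h>
#include <string.h>

#define BLOCKS_PER_SEG 512u                 /* from the prints in this thread */
#define MAP_BYTES      (BLOCKS_PER_SEG / 8) /* one validity bit per block */

/* Hypothetical: set bit nr, return 1 if it was already set, else 0. */
static int test_and_set_bit_sketch(unsigned int nr, unsigned char *map)
{
	unsigned char mask = (unsigned char)(1u << (7 - (nr & 7)));
	int was_set;

	assert(nr < BLOCKS_PER_SEG);
	was_set = (map[nr >> 3] & mask) != 0;
	map[nr >> 3] |= mask;
	return was_set;
}

/* Hypothetical: clear bit nr, return 1 if it was set before, else 0. */
static int test_and_clear_bit_sketch(unsigned int nr, unsigned char *map)
{
	unsigned char mask = (unsigned char)(1u << (7 - (nr & 7)));
	int was_set;

	assert(nr < BLOCKS_PER_SEG);
	was_set = (map[nr >> 3] & mask) != 0;
	map[nr >> 3] &= (unsigned char)~mask;
	return was_set;
}

/* Returns 1 if invalidating the same block twice is noticed. */
static int double_invalidate_detected(void)
{
	unsigned char map[MAP_BYTES];
	int first, second;

	memset(map, 0, sizeof(map));
	test_and_set_bit_sketch(1, map);            /* block 1 becomes valid */
	first = test_and_clear_bit_sketch(1, map);  /* 1: the bit was set */
	second = test_and_clear_bit_sketch(1, map); /* 0: the check above would BUG() */
	return first == 1 && second == 0;
}
```

In this model the second clear returns 0, which corresponds to the condition that triggers BUG() in the lines above. It does not say why the bitmap and the block address disagree on my board.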

Some additional information:

1. I tried using F2FS in ArchLinux, kernel 3.9.5, on an x86 machine,
with the same SD card and the same USB3-to-PCIe chip and it worked
flawlessly there.
2. I can work with other filesystems on the SD card on our custom
board, such as Ext3, Ext4 and vfat, so this is not a H/W issue.

Could you please help me pinpoint/debug the problem?

Here is the complete kernel BUG print:

kernel BUG at .../linux-3.9.6-adva/fs/f2fs/segment.c:214!
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT SMP NR_CPUS=2 P2020 FSP150
Modules linked in: mdio(O) hardware_version(PO) clipresent(PO) monotonic(O) restartcause(PO) panic_buffer(O)
NIP: c026a7e0 LR: c026a660 CTR: 00000000
REGS: ee761a60 TRAP: 0700   Tainted: P           O  (3.9.6-dev_ogabbay-109482*)
MSR: 00029000 <CE,EE,ME>  CR: 24a52588  XER: 20000000
TASK = efb444c0[1755] 'flush-8:0' THREAD: ee760000 CPU: 1
GPR00: 00000000 ee761b10 efb444c0 0000004c 00000000 00000000 01dc4900 eb0fa700
GPR08: 00000000 eb24cb00 00000040 00000040 00000038 00000000 ee761c64 00000000
GPR16: c0aeea80 00080000 00000000 00000000 0000ed31 eb0fa700 00000000 eb0fa734
GPR24: eb0fa700 00000080 f2030620 ffffffff ffffffc8 0000ed31 ffffffff c55a1000
NIP [c026a7e0] update_sit_entry+0x240/0x248
LR [c026a660] update_sit_entry+0xc0/0x248
Call Trace:
[ee761b10] [c55a1000] 0xc55a1000 (unreliable)
[ee761b40] [c026d1f4] do_write_page+0x198/0x660
[ee761b80] [c026d84c] write_data_page+0xa4/0xb8
[ee761bc0] [c0265118] do_write_data_page+0x1e8/0x20c
[ee761c20] [c02653dc] f2fs_write_data_page+0x2a0/0x2c0
[ee761c40] [c0263ad8] __f2fs_writepage+0x24/0x80
[ee761c50] [c00b05dc] write_cache_pages+0x1d0/0x35c
[ee761d00] [c0263cf4] f2fs_write_data_pages+0xf4/0xfc
[ee761d30] [c00b1d3c] do_writepages+0x30/0x64
[ee761d40] [c0103fbc] __writeback_single_inode+0x34/0x10c
[ee761d60] [c0104ef8] writeback_sb_inodes+0x204/0x370
[ee761dd0] [c01050f4] __writeback_inodes_wb+0x90/0xd4
[ee761e00] [c01054cc] wb_writeback+0x204/0x20c
[ee761e50] [c0105844] wb_do_writeback+0x144/0x20c
[ee761eb0] [c0105980] bdi_writeback_thread+0x74/0x144
[ee761ee0] [c0059dc4] kthread+0xa8/0xac
[ee761f40] [c000f014] ret_from_kernel_thread+0x64/0x6c
Instruction dump:
4bffff2c 813a0004 5720e8fe 7f39c8f8 39400001 5739077e 7d6900ae 7d59c830
7d6ac878 7d68c839 7d4901ae 4082fef0 <0fe00000> 0fe00000 9421ffe0 7c0802a6

-- 
Best regards,
Oded Gabbay
Principal Engineer Advanced Packet Technologies
ADVA Optical Networking Israel Ltd.
P.O. Box 2552
2 Hatidhar St.
Raanana 4366504, Israel
Tel: +(972)-9-7750130
Fax: +(972)-9-7462092
Mobile: +(972)-54-6543998
E-mail: ogabbay@advaoptical.com
 
