From: Philippe De Muyter <phdm@macq.eu>
To: linux-f2fs-devel@lists.sourceforge.net
Subject: [f2fs-dev] concern about f2fs and fsck.f2fs
Date: Wed, 14 Sep 2022 00:49:08 +0200
Message-ID: <20220913224908.GA25100@172.21.0.10>

Hello,

I use the f2fs version from Linux 4.1.15 for the root partition of our product.
At that time, fsck.f2fs did nothing.  But now I feel the need to run fsck.f2fs
in an initramfs at startup.

I have cloned f2fs-tools, compiled version 1.14 of fsck.f2fs, and used it
with option '-y' to check and automatically fix my root partition.

When I ran it on a previously fsck'd partition, after installing new files
and shutting the system down properly, it gave me thousands of messages
like these:

 [FIX] (fsck_chk_inode_blk:1141)  --> Regular: 0x13cc1 reset i_gc_failures from 0x1 to 0x00
 ...
 [FIX] (fsck_chk_inode_blk:1141)  --> Regular: 0x81b4 reset i_gc_failures from 0x1 to 0x00

and finished with :

 [FSCK] Check node 76051 / 76058 (100.00%)
 
 [FSCK] Max image size: 3600 MB, Free space: 887 MB
 [FSCK] Unreachable nat entries                        [Ok..] [0x0]
 [FSCK] SIT valid block bitmap checking                [Ok..]
 [FSCK] Hard link checking for regular file            [Ok..] [0x956]
 [FSCK] valid_block_count matching with CP             [Ok..] [0x96a9c]
 [FSCK] valid_node_count matching with CP (de lookup)  [Ok..] [0x1291a]
 [FSCK] valid_node_count matching with CP (nat lookup) [Ok..] [0x1291a]
 [FSCK] valid_inode_count matched with CP              [Ok..] [0x1280b]
 [FSCK] free segment_count matched with CP             [Ok..] [0x217]
 [FSCK] next block offset is free                      [Ok..]
 [FSCK] fixing SIT types
 [FSCK] other corrupted bugs                           [Ok..]
 Info: Duplicate valid checkpoint to mirror position 1024 -> 512
 Info: Write valid nat_bits in checkpoint
 Info: Write valid nat_bits in checkpoint
 
 Done: 288.437391 secs
 
But, afterwards, when the f2fs driver in the kernel worked with the fixed file
system, it complained, only once, with :

 [ 2349.673407] ------------[ cut here ]------------
 [ 2349.676777] WARNING: CPU: 0 PID: 2359 at fs/f2fs/node.c:1863 flush_nat_entries+0x734/0x7c4()
 [ 2349.683996] Modules linked in:
 [ 2349.685796] CPU: 0 PID: 2359 Comm: python3 Not tainted 4.1.15-02177-gcef0cbe-dirty #166
 [ 2349.692527] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
 [ 2349.697780] [<80015f58>] (unwind_backtrace) from [<80012020>] (show_stack+0x10/0x14)
 [ 2349.704273] [<80012020>] (show_stack) from [<80732e64>] (dump_stack+0x68/0xb8)
 [ 2349.710205] [<80732e64>] (dump_stack) from [<8002b694>] (warn_slowpath_common+0x74/0xac)
 [ 2349.717031] [<8002b694>] (warn_slowpath_common) from [<8002b6e8>] (warn_slowpath_null+0x1c/0x24)
 [ 2349.724545] [<8002b6e8>] (warn_slowpath_null) from [<8024f8dc>] (flush_nat_entries+0x734/0x7c4)
 [ 2349.731950] [<8024f8dc>] (flush_nat_entries) from [<8024456c>] (write_checkpoint+0x208/0xe68)
 [ 2349.739207] [<8024456c>] (write_checkpoint) from [<802400c4>] (f2fs_sync_fs+0x50/0x70)
 [ 2349.745859] [<802400c4>] (f2fs_sync_fs) from [<8010436c>] (sync_fs_one_sb+0x28/0x2c)
 [ 2349.752315] [<8010436c>] (sync_fs_one_sb) from [<800df9e0>] (iterate_supers+0xac/0xd4)
 [ 2349.758960] [<800df9e0>] (iterate_supers) from [<80104414>] (sys_sync+0x48/0x98)
 [ 2349.765094] [<80104414>] (sys_sync) from [<8000f440>] (ret_fast_syscall+0x0/0x3c)
 [ 2349.771296] ---[ end trace 5d91f10cd7a61715 ]---

fs/f2fs/node.c:1863 is here :

        /* flush dirty nats in nat entry set */
        list_for_each_entry_safe(ne, cur, &set->entry_list, list) {
                struct f2fs_nat_entry *raw_ne;
                nid_t nid = nat_get_nid(ne);
                int offset;

                if (nat_get_blkaddr(ne) == NEW_ADDR)
                        continue;

                if (to_journal) {
                        offset = lookup_journal_in_cursum(sum,
                                                        NAT_JOURNAL, nid, 1);
HERE >>>>>>>>>>         f2fs_bug_on(sbi, offset < 0);
                        raw_ne = &nat_in_journal(sum, offset);
                        nid_in_journal(sum, offset) = cpu_to_le32(nid);
                } else {
                        raw_ne = &nat_blk->entries[nid - start_nid];
                }
                raw_nat_from_node_info(raw_ne, &ne->ni);

                down_write(&NM_I(sbi)->nat_tree_lock);
                nat_reset_flag(ne);
                __clear_nat_cache_dirty(NM_I(sbi), ne);
                up_write(&NM_I(sbi)->nat_tree_lock);

                if (nat_get_blkaddr(ne) == NULL_ADDR)
                        add_free_nid(sbi, nid, false);
        }

Is that 'offset < 0' (probably '-1' as an error return) really unexpected?
Is it caused by the fsck run on the filesystem?
Why does the driver continue with the negative offset instead of taking
an error path?
Crashing the kernel is not an option as this runs on unattended devices,
and consequently F2FS_CHECK_FS is not set.
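
To make the "warn but continue" question concrete, here is a minimal
user-space sketch. It rests on two assumptions that are not stated in the
code quoted above: that without F2FS_CHECK_FS, f2fs_bug_on() only prints a
warning and marks the filesystem as needing fsck, and that the journal
lookup returns -1 when the nid is absent and no free slot remains. Every
model_* name and JOURNAL_SLOTS is hypothetical and is not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define JOURNAL_SLOTS 4 /* hypothetical journal capacity, for illustration */

struct model_sbi {
    bool need_fsck; /* stands in for a "filesystem needs fsck" flag */
};

struct model_journal {
    unsigned int nids[JOURNAL_SLOTS];
    int used;
};

/* Assumed non-CHECK_FS behaviour: warn, mark the filesystem, keep running. */
#define model_bug_on(sbi, cond)                          \
    do {                                                 \
        if (cond) {                                      \
            fprintf(stderr, "WARNING: %s\n", #cond);     \
            (sbi)->need_fsck = true;                     \
        }                                                \
    } while (0)

/* Return the slot holding nid, a fresh slot if alloc is set, or -1 if full. */
static int model_lookup(struct model_journal *j, unsigned int nid, bool alloc)
{
    for (int i = 0; i < j->used; i++)
        if (j->nids[i] == nid)
            return i;
    if (alloc && j->used < JOURNAL_SLOTS)
        return j->used++;
    return -1;
}

/* Guarded flush of one entry: bail out instead of indexing with offset < 0. */
static int model_flush_one(struct model_sbi *sbi, struct model_journal *j,
                           unsigned int nid)
{
    int offset;

    assert(sbi != NULL && j != NULL);
    offset = model_lookup(j, nid, true);
    model_bug_on(sbi, offset < 0);
    if (offset < 0)
        return -1; /* hypothetical error path, absent from the quoted code */
    j->nids[offset] = nid;
    return 0;
}
```

In this model, a fifth distinct nid flushed into a full journal sets
need_fsck and returns -1 without touching the array. The quoted kernel code
has no such return after the check, so it would go on to use the negative
offset as an index.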

Afterwards, the kernel continued to run but issued messages like these:

 attempt to access beyond end of device
 mmcblk0p2: rw=16384, want=5798631936, limit=7372800

with various rw and want values.
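
For scale, the numbers in that message can be converted as sketched below,
assuming (this is not stated in the log itself) that 'want' and 'limit' are
counted in 512-byte sectors. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL /* assumption: the message counts 512-byte sectors */

/* Convert a sector count to whole MiB, rounding down. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
    assert(sectors <= UINT64_MAX / SECTOR_SIZE); /* guard against overflow */
    return sectors * SECTOR_SIZE / (1024ULL * 1024ULL);
}

/* True when a request ending at sector 'want' lies past the device end. */
static int beyond_end(uint64_t want, uint64_t limit)
{
    return want > limit;
}
```

Under that assumption, limit=7372800 corresponds to 3600 MiB, which matches
the 'Max image size: 3600 MB' reported by fsck.f2fs, while want=5798631936
lies at roughly 2.7 TiB, far outside the partition. That is more consistent
with a garbage block address than with a slightly oversized request.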

After a second clean shutdown and reboot, the initramfs ran fsck.f2fs again, and
this time I had far fewer 'reset i_gc_failures' messages, but many 'Set node summary'
and some 'Set data summary 0x6b1' messages, and a summary with 'Fail' reports.

 [FIX] (is_valid_ssa_node_blk: 201)  --> Set node summary 0x67a -> [0x9428] [0xd165c]
 [FIX] (is_valid_ssa_node_blk: 201)  --> Set node summary 0x67a -> [0x9429] [0xd165d]
 [FIX] (is_valid_ssa_node_blk: 201)  --> Set node summary 0x67a -> [0x95aa] [0xd168a]
 [FIX] (is_valid_ssa_node_blk: 201)  --> Set node summary 0x67a -> [0x95ab] [0xd168b]
 [FSCK] Check node 76051 / 76058 (100.00%)
 
 NID[0x9f4d] is unreachable, blkaddr:0xcd353
 [FSCK] Max image size: 3600 MB, Free space: 887 MB
 [FSCK] Unreachable nat entries                        [Fail] [0x1]
 [FSCK] SIT valid block bitmap checking                [Ok..]
 [FSCK] Hard link checking for regular file            [Ok..] [0x956]
 [FSCK] valid_block_count matching with CP             [Ok..] [0x96a9c]
 [FSCK] valid_node_count matching with CP (de lookup)  [Ok..] [0x1291a]
 [FSCK] valid_node_count matching with CP (nat lookup) [Fail] [0x1291b]
 [FSCK] valid_inode_count matched with CP              [Ok..] [0x1280b]
 [FSCK] free segment_count matched with CP             [Ok..] [0x23a]
 [FSCK] next block offset is free                      [Ok..]
 [FSCK] fixing SIT types
 [FSCK] other corrupted bugs                           [Ok..]
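
The two 'Fail' lines in this summary appear to describe the same single
node. A small check of the counters, copied from the output above (the
macro and function names are hypothetical):

```c
#include <assert.h>

/* Counters copied from the second fsck.f2fs summary. */
#define NODES_DE_LOOKUP  0x1291aUL /* valid_node_count (de lookup)  */
#define NODES_NAT_LOOKUP 0x1291bUL /* valid_node_count (nat lookup) */
#define UNREACHABLE_NAT  0x1UL     /* Unreachable nat entries       */

/* Number of NAT entries that no directory-entry walk reached. */
static unsigned long nat_surplus(unsigned long nat, unsigned long de)
{
    assert(nat >= de); /* the sketch only handles a NAT-side surplus */
    return nat - de;
}
```

The NAT lookup counts one node more than the directory walk, which equals
the one unreachable entry (NID 0x9f4d), and 0x1291a is 76058 in decimal,
the total shown in the 'Check node' progress line.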

Since then, while running Linux, I haven't received any warning about 'offset < 0',
nor about 'attempt to access beyond end of device'.

But now I feel uneasy about my filesystem.

Should I use an older version of fsck.f2fs (and if so, which one), or 1.15?

Is there a patch that makes the kernel do more than emit a warning and then
continue with that negative offset?

Best regards

Philippe


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
