* [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
@ 2023-05-24 2:42 Chunhai Guo via Linux-f2fs-devel
2023-05-25 1:23 ` Chao Yu
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Chunhai Guo via Linux-f2fs-devel @ 2023-05-24 2:42 UTC (permalink / raw)
To: jaegeuk; +Cc: Chunhai Guo, linux-f2fs-devel, frank.li
find_fsync_inode() detect the looped node chain by comparing the loop
counter with free blocks. While it may take tens of seconds to quit when
the free blocks are large enough. We can use Floyd's cycle detection
algorithm to make the detection more efficient, and fix the issue by
filling a NULL address in the last node of the chain.
Below is the log we encounter on a 256GB UFS storage and it takes about
25 seconds to detect looped node chain. After changing the algorithm, it
takes about 20ms to finish the same job.
[ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
[ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
update superblock
[ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
encrypt verity extra_attr project_quota quota_ino casefold
[ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
00000000000000000000000000000000
[ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
MB)
[ 35.852827] fsck.f2fs: detect looped node chain,
blkaddr:1114802, next:1114803
[ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
failed
[ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
---
v7 -> v8 : Reformat the code to reduce indention.
v6 -> v7 : Correct logic error to handle is_detecting return by
find_node_blk_fast().
v5 -> v6 : Simplify the code by removing unnecessary retry logic.
v4 -> v5 : Use IS_INODE() to make the code more clear.
v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
it only if c.fix_on is 1.
v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
v1 -> v2 : Fix looped node chain directly after it is detected.
---
fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 112 insertions(+), 15 deletions(-)
diff --git a/fsck/mount.c b/fsck/mount.c
index 4c7488840c7c..9d6a222a055c 100644
--- a/fsck/mount.c
+++ b/fsck/mount.c
@@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
del_fsync_inode(entry);
}
+static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
+ struct f2fs_node *node_blk_fast, bool *is_detecting)
+{
+ int i, err;
+
+ for (i = 0; i < 2; i++) {
+ if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
+ *is_detecting = false;
+ return 0;
+ }
+
+ err = dev_read_block(node_blk_fast, *blkaddr_fast);
+ if (err)
+ return err;
+
+ if (!is_recoverable_dnode(sbi, node_blk_fast)) {
+ *is_detecting = false;
+ return 0;
+ }
+
+ *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
+ }
+
+ return 0;
+}
+
+static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
+ block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
+ block_t blkaddr, struct f2fs_node *node_blk)
+{
+ block_t blkaddr_entry, blkaddr_tmp;
+ int err;
+
+ /* find the entry point of the looped node chain */
+ while (blkaddr_fast != blkaddr) {
+ err = dev_read_block(node_blk_fast, blkaddr_fast);
+ if (err)
+ return err;
+ blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
+
+ err = dev_read_block(node_blk, blkaddr);
+ if (err)
+ return err;
+ blkaddr = next_blkaddr_of_node(node_blk);
+ }
+ blkaddr_entry = blkaddr;
+
+ /* find the last node of the chain */
+ do {
+ blkaddr_tmp = blkaddr;
+ err = dev_read_block(node_blk, blkaddr);
+ if (err)
+ return err;
+ blkaddr = next_blkaddr_of_node(node_blk);
+ } while (blkaddr != blkaddr_entry);
+
+ /* fix the blkaddr of last node with NULL_ADDR. */
+ node_blk->footer.next_blkaddr = NULL_ADDR;
+ if (IS_INODE(node_blk))
+ err = write_inode(node_blk, blkaddr_tmp);
+ else
+ err = dev_write_block(node_blk, blkaddr_tmp);
+ if (!err)
+ FIX_MSG("Fix looped node chain on blkaddr %u\n",
+ blkaddr_tmp);
+ return err;
+}
+
static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
{
struct curseg_info *curseg;
- struct f2fs_node *node_blk;
- block_t blkaddr;
- unsigned int loop_cnt = 0;
- unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
- sbi->total_valid_block_count;
+ struct f2fs_node *node_blk, *node_blk_fast;
+ block_t blkaddr, blkaddr_fast;
+ bool is_detecting = true;
int err = 0;
+ node_blk = calloc(F2FS_BLKSIZE, 1);
+ node_blk_fast = calloc(F2FS_BLKSIZE, 1);
+ ASSERT(node_blk && node_blk_fast);
+
/* get node pages in the current segment */
curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
-
- node_blk = calloc(F2FS_BLKSIZE, 1);
- ASSERT(node_blk);
+ blkaddr_fast = blkaddr;
while (1) {
struct fsync_inode_entry *entry;
@@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
entry->last_dentry = blkaddr;
next:
- /* sanity check in order to detect looped node chain */
- if (++loop_cnt >= free_blocks ||
- blkaddr == next_blkaddr_of_node(node_blk)) {
- MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
- blkaddr,
- next_blkaddr_of_node(node_blk));
+ blkaddr = next_blkaddr_of_node(node_blk);
+
+ /* Below we will detect looped node chain with Floyd's cycle
+ * detection algorithm.
+ */
+ if (!is_detecting)
+ continue;
+
+ err = find_node_blk_fast(sbi, &blkaddr_fast,
+ node_blk_fast, &is_detecting);
+ if (err)
+ break;
+
+ if (!is_detecting)
+ continue;
+
+ if (blkaddr_fast != blkaddr)
+ continue;
+
+ ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
+ blkaddr);
+
+ if (!c.fix_on) {
err = -1;
break;
}
- blkaddr = next_blkaddr_of_node(node_blk);
+ err = loop_node_chain_fix(sbi,
+ NEXT_FREE_BLKADDR(sbi, curseg),
+ node_blk_fast, blkaddr, node_blk);
+ if (err)
+ break;
+
+ /* Since we call get_fsync_inode() to ensure there are no
+ * duplicate inodes in the inode_list even if there are
+ * duplicate blkaddr, we can continue running after fixing the
+ * looped node chain.
+ */
+ is_detecting = false;
}
+ free(node_blk_fast);
free(node_blk);
return err;
}
--
2.25.1
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-05-24 2:42 [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently Chunhai Guo via Linux-f2fs-devel
@ 2023-05-25 1:23 ` Chao Yu
2023-05-25 22:36 ` Jaegeuk Kim
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Chao Yu @ 2023-05-25 1:23 UTC (permalink / raw)
To: Chunhai Guo, jaegeuk; +Cc: linux-f2fs-devel, frank.li
On 2023/5/24 10:42, Chunhai Guo wrote:
> find_fsync_inode() detect the looped node chain by comparing the loop
> counter with free blocks. While it may take tens of seconds to quit when
> the free blocks are large enough. We can use Floyd's cycle detection
> algorithm to make the detection more efficient, and fix the issue by
> filling a NULL address in the last node of the chain.
>
> Below is the log we encounter on a 256GB UFS storage and it takes about
> 25 seconds to detect looped node chain. After changing the algorithm, it
> takes about 20ms to finish the same job.
>
> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
> update superblock
> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
> encrypt verity extra_attr project_quota quota_ino casefold
> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
> 00000000000000000000000000000000
> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
> MB)
> [ 35.852827] fsck.f2fs: detect looped node chain,
> blkaddr:1114802, next:1114803
> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
> failed
> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>
> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-05-24 2:42 [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently Chunhai Guo via Linux-f2fs-devel
2023-05-25 1:23 ` Chao Yu
@ 2023-05-25 22:36 ` Jaegeuk Kim
2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
2023-06-23 1:52 ` Chao Yu
3 siblings, 0 replies; 11+ messages in thread
From: Jaegeuk Kim @ 2023-05-25 22:36 UTC (permalink / raw)
To: Chunhai Guo; +Cc: linux-f2fs-devel, frank.li
On 05/24, Chunhai Guo wrote:
> find_fsync_inode() detect the looped node chain by comparing the loop
> counter with free blocks. While it may take tens of seconds to quit when
> the free blocks are large enough. We can use Floyd's cycle detection
> algorithm to make the detection more efficient, and fix the issue by
> filling a NULL address in the last node of the chain.
>
> Below is the log we encounter on a 256GB UFS storage and it takes about
> 25 seconds to detect looped node chain. After changing the algorithm, it
> takes about 20ms to finish the same job.
>
> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
> update superblock
> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
> encrypt verity extra_attr project_quota quota_ino casefold
> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
> 00000000000000000000000000000000
> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
> MB)
> [ 35.852827] fsck.f2fs: detect looped node chain,
> blkaddr:1114802, next:1114803
> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
> failed
> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>
> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
> ---
> v7 -> v8 : Reformat the code to reduce indention.
> v6 -> v7 : Correct logic error to handle is_detecting return by
> find_node_blk_fast().
> v5 -> v6 : Simplify the code by removing unnecessary retry logic.
> v4 -> v5 : Use IS_INODE() to make the code more clear.
> v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
> it only if c.fix_on is 1.
> v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
> v1 -> v2 : Fix looped node chain directly after it is detected.
> ---
> fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 112 insertions(+), 15 deletions(-)
>
> diff --git a/fsck/mount.c b/fsck/mount.c
> index 4c7488840c7c..9d6a222a055c 100644
> --- a/fsck/mount.c
> +++ b/fsck/mount.c
> @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
> del_fsync_inode(entry);
> }
>
> +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
> + struct f2fs_node *node_blk_fast, bool *is_detecting)
> +{
> + int i, err;
> +
> + for (i = 0; i < 2; i++) {
> + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
> + *is_detecting = false;
> + return 0;
> + }
> +
> + err = dev_read_block(node_blk_fast, *blkaddr_fast);
> + if (err)
> + return err;
> +
> + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
> + *is_detecting = false;
> + return 0;
> + }
> +
> + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> + }
> +
> + return 0;
> +}
> +
> +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
Applied after removing sbi pointer which is unneeded.
> + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
> + block_t blkaddr, struct f2fs_node *node_blk)
> +{
> + block_t blkaddr_entry, blkaddr_tmp;
> + int err;
> +
> + /* find the entry point of the looped node chain */
> + while (blkaddr_fast != blkaddr) {
> + err = dev_read_block(node_blk_fast, blkaddr_fast);
> + if (err)
> + return err;
> + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> +
> + err = dev_read_block(node_blk, blkaddr);
> + if (err)
> + return err;
> + blkaddr = next_blkaddr_of_node(node_blk);
> + }
> + blkaddr_entry = blkaddr;
> +
> + /* find the last node of the chain */
> + do {
> + blkaddr_tmp = blkaddr;
> + err = dev_read_block(node_blk, blkaddr);
> + if (err)
> + return err;
> + blkaddr = next_blkaddr_of_node(node_blk);
> + } while (blkaddr != blkaddr_entry);
> +
> + /* fix the blkaddr of last node with NULL_ADDR. */
> + node_blk->footer.next_blkaddr = NULL_ADDR;
> + if (IS_INODE(node_blk))
> + err = write_inode(node_blk, blkaddr_tmp);
> + else
> + err = dev_write_block(node_blk, blkaddr_tmp);
> + if (!err)
> + FIX_MSG("Fix looped node chain on blkaddr %u\n",
> + blkaddr_tmp);
> + return err;
> +}
> +
> static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> {
> struct curseg_info *curseg;
> - struct f2fs_node *node_blk;
> - block_t blkaddr;
> - unsigned int loop_cnt = 0;
> - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
> - sbi->total_valid_block_count;
> + struct f2fs_node *node_blk, *node_blk_fast;
> + block_t blkaddr, blkaddr_fast;
> + bool is_detecting = true;
> int err = 0;
>
> + node_blk = calloc(F2FS_BLKSIZE, 1);
> + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
> + ASSERT(node_blk && node_blk_fast);
> +
> /* get node pages in the current segment */
> curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
> blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
> -
> - node_blk = calloc(F2FS_BLKSIZE, 1);
> - ASSERT(node_blk);
> + blkaddr_fast = blkaddr;
>
> while (1) {
> struct fsync_inode_entry *entry;
> @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
> entry->last_dentry = blkaddr;
> next:
> - /* sanity check in order to detect looped node chain */
> - if (++loop_cnt >= free_blocks ||
> - blkaddr == next_blkaddr_of_node(node_blk)) {
> - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
> - blkaddr,
> - next_blkaddr_of_node(node_blk));
> + blkaddr = next_blkaddr_of_node(node_blk);
> +
> + /* Below we will detect looped node chain with Floyd's cycle
> + * detection algorithm.
> + */
> + if (!is_detecting)
> + continue;
> +
> + err = find_node_blk_fast(sbi, &blkaddr_fast,
> + node_blk_fast, &is_detecting);
> + if (err)
> + break;
> +
> + if (!is_detecting)
> + continue;
> +
> + if (blkaddr_fast != blkaddr)
> + continue;
> +
> + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
> + blkaddr);
> +
> + if (!c.fix_on) {
> err = -1;
> break;
> }
>
> - blkaddr = next_blkaddr_of_node(node_blk);
> + err = loop_node_chain_fix(sbi,
> + NEXT_FREE_BLKADDR(sbi, curseg),
> + node_blk_fast, blkaddr, node_blk);
> + if (err)
> + break;
> +
> + /* Since we call get_fsync_inode() to ensure there are no
> + * duplicate inodes in the inode_list even if there are
> + * duplicate blkaddr, we can continue running after fixing the
> + * looped node chain.
> + */
> + is_detecting = false;
> }
>
> + free(node_blk_fast);
> free(node_blk);
> return err;
> }
> --
> 2.25.1
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-05-24 2:42 [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently Chunhai Guo via Linux-f2fs-devel
2023-05-25 1:23 ` Chao Yu
2023-05-25 22:36 ` Jaegeuk Kim
@ 2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
2023-06-14 16:15 ` Jaegeuk Kim
2023-06-17 2:20 ` Chao Yu
2023-06-23 1:52 ` Chao Yu
3 siblings, 2 replies; 11+ messages in thread
From: Chunhai Guo via Linux-f2fs-devel @ 2023-06-14 9:27 UTC (permalink / raw)
To: jaegeuk@kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
Hi Jaegeuk,
Could you please help to confirm if this patch has been merged? I cannot
see the patch in the dev-test or dev branch.
Thanks.
On 2023/5/24 10:42, 郭纯海 wrote:
> find_fsync_inode() detect the looped node chain by comparing the loop
> counter with free blocks. While it may take tens of seconds to quit when
> the free blocks are large enough. We can use Floyd's cycle detection
> algorithm to make the detection more efficient, and fix the issue by
> filling a NULL address in the last node of the chain.
>
> Below is the log we encounter on a 256GB UFS storage and it takes about
> 25 seconds to detect looped node chain. After changing the algorithm, it
> takes about 20ms to finish the same job.
>
> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
> update superblock
> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
> encrypt verity extra_attr project_quota quota_ino casefold
> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
> 00000000000000000000000000000000
> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
> MB)
> [ 35.852827] fsck.f2fs: detect looped node chain,
> blkaddr:1114802, next:1114803
> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
> failed
> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>
> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
> ---
> v7 -> v8 : Reformat the code to reduce indention.
> v6 -> v7 : Correct logic error to handle is_detecting return by
> find_node_blk_fast().
> v5 -> v6 : Simplify the code by removing unnecessary retry logic.
> v4 -> v5 : Use IS_INODE() to make the code more clear.
> v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
> it only if c.fix_on is 1.
> v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
> v1 -> v2 : Fix looped node chain directly after it is detected.
> ---
> fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 112 insertions(+), 15 deletions(-)
>
> diff --git a/fsck/mount.c b/fsck/mount.c
> index 4c7488840c7c..9d6a222a055c 100644
> --- a/fsck/mount.c
> +++ b/fsck/mount.c
> @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
> del_fsync_inode(entry);
> }
>
> +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
> + struct f2fs_node *node_blk_fast, bool *is_detecting)
> +{
> + int i, err;
> +
> + for (i = 0; i < 2; i++) {
> + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
> + *is_detecting = false;
> + return 0;
> + }
> +
> + err = dev_read_block(node_blk_fast, *blkaddr_fast);
> + if (err)
> + return err;
> +
> + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
> + *is_detecting = false;
> + return 0;
> + }
> +
> + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> + }
> +
> + return 0;
> +}
> +
> +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
> + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
> + block_t blkaddr, struct f2fs_node *node_blk)
> +{
> + block_t blkaddr_entry, blkaddr_tmp;
> + int err;
> +
> + /* find the entry point of the looped node chain */
> + while (blkaddr_fast != blkaddr) {
> + err = dev_read_block(node_blk_fast, blkaddr_fast);
> + if (err)
> + return err;
> + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> +
> + err = dev_read_block(node_blk, blkaddr);
> + if (err)
> + return err;
> + blkaddr = next_blkaddr_of_node(node_blk);
> + }
> + blkaddr_entry = blkaddr;
> +
> + /* find the last node of the chain */
> + do {
> + blkaddr_tmp = blkaddr;
> + err = dev_read_block(node_blk, blkaddr);
> + if (err)
> + return err;
> + blkaddr = next_blkaddr_of_node(node_blk);
> + } while (blkaddr != blkaddr_entry);
> +
> + /* fix the blkaddr of last node with NULL_ADDR. */
> + node_blk->footer.next_blkaddr = NULL_ADDR;
> + if (IS_INODE(node_blk))
> + err = write_inode(node_blk, blkaddr_tmp);
> + else
> + err = dev_write_block(node_blk, blkaddr_tmp);
> + if (!err)
> + FIX_MSG("Fix looped node chain on blkaddr %u\n",
> + blkaddr_tmp);
> + return err;
> +}
> +
> static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> {
> struct curseg_info *curseg;
> - struct f2fs_node *node_blk;
> - block_t blkaddr;
> - unsigned int loop_cnt = 0;
> - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
> - sbi->total_valid_block_count;
> + struct f2fs_node *node_blk, *node_blk_fast;
> + block_t blkaddr, blkaddr_fast;
> + bool is_detecting = true;
> int err = 0;
>
> + node_blk = calloc(F2FS_BLKSIZE, 1);
> + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
> + ASSERT(node_blk && node_blk_fast);
> +
> /* get node pages in the current segment */
> curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
> blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
> -
> - node_blk = calloc(F2FS_BLKSIZE, 1);
> - ASSERT(node_blk);
> + blkaddr_fast = blkaddr;
>
> while (1) {
> struct fsync_inode_entry *entry;
> @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
> entry->last_dentry = blkaddr;
> next:
> - /* sanity check in order to detect looped node chain */
> - if (++loop_cnt >= free_blocks ||
> - blkaddr == next_blkaddr_of_node(node_blk)) {
> - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
> - blkaddr,
> - next_blkaddr_of_node(node_blk));
> + blkaddr = next_blkaddr_of_node(node_blk);
> +
> + /* Below we will detect looped node chain with Floyd's cycle
> + * detection algorithm.
> + */
> + if (!is_detecting)
> + continue;
> +
> + err = find_node_blk_fast(sbi, &blkaddr_fast,
> + node_blk_fast, &is_detecting);
> + if (err)
> + break;
> +
> + if (!is_detecting)
> + continue;
> +
> + if (blkaddr_fast != blkaddr)
> + continue;
> +
> + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
> + blkaddr);
> +
> + if (!c.fix_on) {
> err = -1;
> break;
> }
>
> - blkaddr = next_blkaddr_of_node(node_blk);
> + err = loop_node_chain_fix(sbi,
> + NEXT_FREE_BLKADDR(sbi, curseg),
> + node_blk_fast, blkaddr, node_blk);
> + if (err)
> + break;
> +
> + /* Since we call get_fsync_inode() to ensure there are no
> + * duplicate inodes in the inode_list even if there are
> + * duplicate blkaddr, we can continue running after fixing the
> + * looped node chain.
> + */
> + is_detecting = false;
> }
>
> + free(node_blk_fast);
> free(node_blk);
> return err;
> }
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
@ 2023-06-14 16:15 ` Jaegeuk Kim
2023-06-15 2:12 ` Chunhai Guo via Linux-f2fs-devel
2023-06-17 2:20 ` Chao Yu
1 sibling, 1 reply; 11+ messages in thread
From: Jaegeuk Kim @ 2023-06-14 16:15 UTC (permalink / raw)
To: Chunhai Guo
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
On 06/14, Chunhai Guo wrote:
> Hi Jaegeuk,
>
> Could you please help to confirm if this patch has been merged? I cannot see
> the patch in the dev-test or dev branch.
Thanks. Somehow it was dropped. I start to test again.
>
> Thanks.
>
> On 2023/5/24 10:42, 郭纯海 wrote:
> > find_fsync_inode() detect the looped node chain by comparing the loop
> > counter with free blocks. While it may take tens of seconds to quit when
> > the free blocks are large enough. We can use Floyd's cycle detection
> > algorithm to make the detection more efficient, and fix the issue by
> > filling a NULL address in the last node of the chain.
> >
> > Below is the log we encounter on a 256GB UFS storage and it takes about
> > 25 seconds to detect looped node chain. After changing the algorithm, it
> > takes about 20ms to finish the same job.
> >
> > [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
> > [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
> > update superblock
> > [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
> > encrypt verity extra_attr project_quota quota_ino casefold
> > [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
> > 00000000000000000000000000000000
> > [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
> > MB)
> > [ 35.852827] fsck.f2fs: detect looped node chain,
> > blkaddr:1114802, next:1114803
> > [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
> > failed
> > [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
> >
> > Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
> > ---
> > v7 -> v8 : Reformat the code to reduce indention.
> > v6 -> v7 : Correct logic error to handle is_detecting return by
> > find_node_blk_fast().
> > v5 -> v6 : Simplify the code by removing unnecessary retry logic.
> > v4 -> v5 : Use IS_INODE() to make the code more clear.
> > v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
> > it only if c.fix_on is 1.
> > v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
> > v1 -> v2 : Fix looped node chain directly after it is detected.
> > ---
> > fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 112 insertions(+), 15 deletions(-)
> >
> > diff --git a/fsck/mount.c b/fsck/mount.c
> > index 4c7488840c7c..9d6a222a055c 100644
> > --- a/fsck/mount.c
> > +++ b/fsck/mount.c
> > @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
> > del_fsync_inode(entry);
> > }
> > +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
> > + struct f2fs_node *node_blk_fast, bool *is_detecting)
> > +{
> > + int i, err;
> > +
> > + for (i = 0; i < 2; i++) {
> > + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
> > + *is_detecting = false;
> > + return 0;
> > + }
> > +
> > + err = dev_read_block(node_blk_fast, *blkaddr_fast);
> > + if (err)
> > + return err;
> > +
> > + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
> > + *is_detecting = false;
> > + return 0;
> > + }
> > +
> > + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
> > + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
> > + block_t blkaddr, struct f2fs_node *node_blk)
> > +{
> > + block_t blkaddr_entry, blkaddr_tmp;
> > + int err;
> > +
> > + /* find the entry point of the looped node chain */
> > + while (blkaddr_fast != blkaddr) {
> > + err = dev_read_block(node_blk_fast, blkaddr_fast);
> > + if (err)
> > + return err;
> > + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
> > +
> > + err = dev_read_block(node_blk, blkaddr);
> > + if (err)
> > + return err;
> > + blkaddr = next_blkaddr_of_node(node_blk);
> > + }
> > + blkaddr_entry = blkaddr;
> > +
> > + /* find the last node of the chain */
> > + do {
> > + blkaddr_tmp = blkaddr;
> > + err = dev_read_block(node_blk, blkaddr);
> > + if (err)
> > + return err;
> > + blkaddr = next_blkaddr_of_node(node_blk);
> > + } while (blkaddr != blkaddr_entry);
> > +
> > + /* fix the blkaddr of last node with NULL_ADDR. */
> > + node_blk->footer.next_blkaddr = NULL_ADDR;
> > + if (IS_INODE(node_blk))
> > + err = write_inode(node_blk, blkaddr_tmp);
> > + else
> > + err = dev_write_block(node_blk, blkaddr_tmp);
> > + if (!err)
> > + FIX_MSG("Fix looped node chain on blkaddr %u\n",
> > + blkaddr_tmp);
> > + return err;
> > +}
> > +
> > static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> > {
> > struct curseg_info *curseg;
> > - struct f2fs_node *node_blk;
> > - block_t blkaddr;
> > - unsigned int loop_cnt = 0;
> > - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
> > - sbi->total_valid_block_count;
> > + struct f2fs_node *node_blk, *node_blk_fast;
> > + block_t blkaddr, blkaddr_fast;
> > + bool is_detecting = true;
> > int err = 0;
> > + node_blk = calloc(F2FS_BLKSIZE, 1);
> > + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
> > + ASSERT(node_blk && node_blk_fast);
> > +
> > /* get node pages in the current segment */
> > curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
> > blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
> > -
> > - node_blk = calloc(F2FS_BLKSIZE, 1);
> > - ASSERT(node_blk);
> > + blkaddr_fast = blkaddr;
> > while (1) {
> > struct fsync_inode_entry *entry;
> > @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
> > if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
> > entry->last_dentry = blkaddr;
> > next:
> > - /* sanity check in order to detect looped node chain */
> > - if (++loop_cnt >= free_blocks ||
> > - blkaddr == next_blkaddr_of_node(node_blk)) {
> > - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
> > - blkaddr,
> > - next_blkaddr_of_node(node_blk));
> > + blkaddr = next_blkaddr_of_node(node_blk);
> > +
> > + /* Below we will detect looped node chain with Floyd's cycle
> > + * detection algorithm.
> > + */
> > + if (!is_detecting)
> > + continue;
> > +
> > + err = find_node_blk_fast(sbi, &blkaddr_fast,
> > + node_blk_fast, &is_detecting);
> > + if (err)
> > + break;
> > +
> > + if (!is_detecting)
> > + continue;
> > +
> > + if (blkaddr_fast != blkaddr)
> > + continue;
> > +
> > + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
> > + blkaddr);
> > +
> > + if (!c.fix_on) {
> > err = -1;
> > break;
> > }
> > - blkaddr = next_blkaddr_of_node(node_blk);
> > + err = loop_node_chain_fix(sbi,
> > + NEXT_FREE_BLKADDR(sbi, curseg),
> > + node_blk_fast, blkaddr, node_blk);
> > + if (err)
> > + break;
> > +
> > + /* Since we call get_fsync_inode() to ensure there are no
> > + * duplicate inodes in the inode_list even if there are
> > + * duplicate blkaddr, we can continue running after fixing the
> > + * looped node chain.
> > + */
> > + is_detecting = false;
> > }
> > + free(node_blk_fast);
> > free(node_blk);
> > return err;
> > }
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-14 16:15 ` Jaegeuk Kim
@ 2023-06-15 2:12 ` Chunhai Guo via Linux-f2fs-devel
0 siblings, 0 replies; 11+ messages in thread
From: Chunhai Guo via Linux-f2fs-devel @ 2023-06-15 2:12 UTC (permalink / raw)
To: Jaegeuk Kim
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
Got it. Thanks.
On 2023/6/15 0:15, Jaegeuk Kim wrote:
> On 06/14, Chunhai Guo wrote:
>> Hi Jaegeuk,
>>
>> Could you please help to confirm if this patch has been merged? I cannot see
>> the patch in the dev-test or dev branch.
>
> Thanks. Somehow it was dropped. I start to test again.
>
>>
>> Thanks.
>>
>> On 2023/5/24 10:42, 郭纯海 wrote:
>>> find_fsync_inode() detect the looped node chain by comparing the loop
>>> counter with free blocks. While it may take tens of seconds to quit when
>>> the free blocks are large enough. We can use Floyd's cycle detection
>>> algorithm to make the detection more efficient, and fix the issue by
>>> filling a NULL address in the last node of the chain.
>>>
>>> Below is the log we encounter on a 256GB UFS storage and it takes about
>>> 25 seconds to detect looped node chain. After changing the algorithm, it
>>> takes about 20ms to finish the same job.
>>>
>>> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
>>> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
>>> update superblock
>>> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
>>> encrypt verity extra_attr project_quota quota_ino casefold
>>> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
>>> 00000000000000000000000000000000
>>> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
>>> MB)
>>> [ 35.852827] fsck.f2fs: detect looped node chain,
>>> blkaddr:1114802, next:1114803
>>> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
>>> failed
>>> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>>>
>>> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
>>> ---
>>> v7 -> v8 : Reformat the code to reduce indention.
>>> v6 -> v7 : Correct logic error to handle is_detecting return by
>>> find_node_blk_fast().
>>> v5 -> v6 : Simplify the code by removing unnecessary retry logic.
>>> v4 -> v5 : Use IS_INODE() to make the code more clear.
>>> v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
>>> it only if c.fix_on is 1.
>>> v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
>>> v1 -> v2 : Fix looped node chain directly after it is detected.
>>> ---
>>> fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
>>> 1 file changed, 112 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/fsck/mount.c b/fsck/mount.c
>>> index 4c7488840c7c..9d6a222a055c 100644
>>> --- a/fsck/mount.c
>>> +++ b/fsck/mount.c
>>> @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
>>> del_fsync_inode(entry);
>>> }
>>> +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
>>> + struct f2fs_node *node_blk_fast, bool *is_detecting)
>>> +{
>>> + int i, err;
>>> +
>>> + for (i = 0; i < 2; i++) {
>>> + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
>>> + *is_detecting = false;
>>> + return 0;
>>> + }
>>> +
>>> + err = dev_read_block(node_blk_fast, *blkaddr_fast);
>>> + if (err)
>>> + return err;
>>> +
>>> + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
>>> + *is_detecting = false;
>>> + return 0;
>>> + }
>>> +
>>> + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
>>> + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
>>> + block_t blkaddr, struct f2fs_node *node_blk)
>>> +{
>>> + block_t blkaddr_entry, blkaddr_tmp;
>>> + int err;
>>> +
>>> + /* find the entry point of the looped node chain */
>>> + while (blkaddr_fast != blkaddr) {
>>> + err = dev_read_block(node_blk_fast, blkaddr_fast);
>>> + if (err)
>>> + return err;
>>> + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>>> +
>>> + err = dev_read_block(node_blk, blkaddr);
>>> + if (err)
>>> + return err;
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> + }
>>> + blkaddr_entry = blkaddr;
>>> +
>>> + /* find the last node of the chain */
>>> + do {
>>> + blkaddr_tmp = blkaddr;
>>> + err = dev_read_block(node_blk, blkaddr);
>>> + if (err)
>>> + return err;
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> + } while (blkaddr != blkaddr_entry);
>>> +
>>> + /* fix the blkaddr of last node with NULL_ADDR. */
>>> + node_blk->footer.next_blkaddr = NULL_ADDR;
>>> + if (IS_INODE(node_blk))
>>> + err = write_inode(node_blk, blkaddr_tmp);
>>> + else
>>> + err = dev_write_block(node_blk, blkaddr_tmp);
>>> + if (!err)
>>> + FIX_MSG("Fix looped node chain on blkaddr %u\n",
>>> + blkaddr_tmp);
>>> + return err;
>>> +}
>>> +
>>> static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>>> {
>>> struct curseg_info *curseg;
>>> - struct f2fs_node *node_blk;
>>> - block_t blkaddr;
>>> - unsigned int loop_cnt = 0;
>>> - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
>>> - sbi->total_valid_block_count;
>>> + struct f2fs_node *node_blk, *node_blk_fast;
>>> + block_t blkaddr, blkaddr_fast;
>>> + bool is_detecting = true;
>>> int err = 0;
>>> + node_blk = calloc(F2FS_BLKSIZE, 1);
>>> + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
>>> + ASSERT(node_blk && node_blk_fast);
>>> +
>>> /* get node pages in the current segment */
>>> curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
>>> blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
>>> -
>>> - node_blk = calloc(F2FS_BLKSIZE, 1);
>>> - ASSERT(node_blk);
>>> + blkaddr_fast = blkaddr;
>>> while (1) {
>>> struct fsync_inode_entry *entry;
>>> @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>>> if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
>>> entry->last_dentry = blkaddr;
>>> next:
>>> - /* sanity check in order to detect looped node chain */
>>> - if (++loop_cnt >= free_blocks ||
>>> - blkaddr == next_blkaddr_of_node(node_blk)) {
>>> - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
>>> - blkaddr,
>>> - next_blkaddr_of_node(node_blk));
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> +
>>> + /* Below we will detect looped node chain with Floyd's cycle
>>> + * detection algorithm.
>>> + */
>>> + if (!is_detecting)
>>> + continue;
>>> +
>>> + err = find_node_blk_fast(sbi, &blkaddr_fast,
>>> + node_blk_fast, &is_detecting);
>>> + if (err)
>>> + break;
>>> +
>>> + if (!is_detecting)
>>> + continue;
>>> +
>>> + if (blkaddr_fast != blkaddr)
>>> + continue;
>>> +
>>> + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
>>> + blkaddr);
>>> +
>>> + if (!c.fix_on) {
>>> err = -1;
>>> break;
>>> }
>>> - blkaddr = next_blkaddr_of_node(node_blk);
>>> + err = loop_node_chain_fix(sbi,
>>> + NEXT_FREE_BLKADDR(sbi, curseg),
>>> + node_blk_fast, blkaddr, node_blk);
>>> + if (err)
>>> + break;
>>> +
>>> + /* Since we call get_fsync_inode() to ensure there are no
>>> + * duplicate inodes in the inode_list even if there are
>>> + * duplicate blkaddr, we can continue running after fixing the
>>> + * looped node chain.
>>> + */
>>> + is_detecting = false;
>>> }
>>> + free(node_blk_fast);
>>> free(node_blk);
>>> return err;
>>> }
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
2023-06-14 16:15 ` Jaegeuk Kim
@ 2023-06-17 2:20 ` Chao Yu
2023-06-21 9:18 ` Chunhai Guo via Linux-f2fs-devel
1 sibling, 1 reply; 11+ messages in thread
From: Chao Yu @ 2023-06-17 2:20 UTC (permalink / raw)
To: Chunhai Guo, jaegeuk@kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
On 2023/6/14 17:27, Chunhai Guo wrote:
> Hi Jaegeuk,
>
> Could you please help to confirm if this patch has been merged? I cannot
> see the patch in the dev-test or dev branch.
>
> Thanks.
>
> On 2023/5/24 10:42, 郭纯海 wrote:
>> find_fsync_inode() detect the looped node chain by comparing the loop
>> counter with free blocks. While it may take tens of seconds to quit when
>> the free blocks are large enough. We can use Floyd's cycle detection
>> algorithm to make the detection more efficient, and fix the issue by
>> filling a NULL address in the last node of the chain.
>>
>> Below is the log we encounter on a 256GB UFS storage and it takes about
>> 25 seconds to detect looped node chain. After changing the algorithm, it
>> takes about 20ms to finish the same job.
>>
>> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
>> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
>> update superblock
>> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
>> encrypt verity extra_attr project_quota quota_ino casefold
>> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
>> 00000000000000000000000000000000
>> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
>> MB)
>> [ 35.852827] fsck.f2fs: detect looped node chain,
>> blkaddr:1114802, next:1114803
>> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
>> failed
>> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>>
>> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
>> ---
>> v7 -> v8 : Reformat the code to reduce indention.
>> v6 -> v7 : Correct logic error to handle is_detecting return by
>> find_node_blk_fast().
>> v5 -> v6 : Simplify the code by removing unnecessary retry logic.
>> v4 -> v5 : Use IS_INODE() to make the code more clear.
>> v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
>> it only if c.fix_on is 1.
>> v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
>> v1 -> v2 : Fix looped node chain directly after it is detected.
>> ---
>> fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
>> 1 file changed, 112 insertions(+), 15 deletions(-)
>>
>> diff --git a/fsck/mount.c b/fsck/mount.c
>> index 4c7488840c7c..9d6a222a055c 100644
>> --- a/fsck/mount.c
>> +++ b/fsck/mount.c
>> @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
>> del_fsync_inode(entry);
>> }
>>
>> +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
>> + struct f2fs_node *node_blk_fast, bool *is_detecting)
>> +{
>> + int i, err;
>> +
>> + for (i = 0; i < 2; i++) {
>> + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
>> + *is_detecting = false;
>> + return 0;
>> + }
>> +
>> + err = dev_read_block(node_blk_fast, *blkaddr_fast);
>> + if (err)
>> + return err;
>> +
>> + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
>> + *is_detecting = false;
>> + return 0;
>> + }
>> +
>> + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
>> + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
>> + block_t blkaddr, struct f2fs_node *node_blk)
>> +{
>> + block_t blkaddr_entry, blkaddr_tmp;
>> + int err;
>> +
>> + /* find the entry point of the looped node chain */
>> + while (blkaddr_fast != blkaddr) {
>> + err = dev_read_block(node_blk_fast, blkaddr_fast);
>> + if (err)
>> + return err;
>> + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>> +
>> + err = dev_read_block(node_blk, blkaddr);
>> + if (err)
>> + return err;
>> + blkaddr = next_blkaddr_of_node(node_blk);
>> + }
>> + blkaddr_entry = blkaddr;
>> +
>> + /* find the last node of the chain */
>> + do {
>> + blkaddr_tmp = blkaddr;
>> + err = dev_read_block(node_blk, blkaddr);
>> + if (err)
>> + return err;
>> + blkaddr = next_blkaddr_of_node(node_blk);
>> + } while (blkaddr != blkaddr_entry);
>> +
>> + /* fix the blkaddr of last node with NULL_ADDR. */
>> + node_blk->footer.next_blkaddr = NULL_ADDR;
>> + if (IS_INODE(node_blk))
>> + err = write_inode(node_blk, blkaddr_tmp);
>> + else
>> + err = dev_write_block(node_blk, blkaddr_tmp);
>> + if (!err)
>> + FIX_MSG("Fix looped node chain on blkaddr %u\n",
>> + blkaddr_tmp);
>> + return err;
>> +}
>> +
>> static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>> {
>> struct curseg_info *curseg;
>> - struct f2fs_node *node_blk;
>> - block_t blkaddr;
>> - unsigned int loop_cnt = 0;
>> - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
>> - sbi->total_valid_block_count;
>> + struct f2fs_node *node_blk, *node_blk_fast;
>> + block_t blkaddr, blkaddr_fast;
>> + bool is_detecting = true;
>> int err = 0;
>>
>> + node_blk = calloc(F2FS_BLKSIZE, 1);
>> + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
>> + ASSERT(node_blk && node_blk_fast);
>> +
>> /* get node pages in the current segment */
>> curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
>> blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
>> -
>> - node_blk = calloc(F2FS_BLKSIZE, 1);
>> - ASSERT(node_blk);
>> + blkaddr_fast = blkaddr;
>>
>> while (1) {
>> struct fsync_inode_entry *entry;
>> @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>> if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
>> entry->last_dentry = blkaddr;
>> next:
>> - /* sanity check in order to detect looped node chain */
>> - if (++loop_cnt >= free_blocks ||
>> - blkaddr == next_blkaddr_of_node(node_blk)) {
>> - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
>> - blkaddr,
>> - next_blkaddr_of_node(node_blk));
>> + blkaddr = next_blkaddr_of_node(node_blk);
>> +
>> + /* Below we will detect looped node chain with Floyd's cycle
>> + * detection algorithm.
>> + */
>> + if (!is_detecting)
>> + continue;
>> +
>> + err = find_node_blk_fast(sbi, &blkaddr_fast,
>> + node_blk_fast, &is_detecting);
>> + if (err)
>> + break;
>> +
>> + if (!is_detecting)
>> + continue;
>> +
>> + if (blkaddr_fast != blkaddr)
>> + continue;
>> +
>> + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
>> + blkaddr);
>> +
>> + if (!c.fix_on) {
>> err = -1;
>> break;
>> }
>>
>> - blkaddr = next_blkaddr_of_node(node_blk);
>> + err = loop_node_chain_fix(sbi,
>> + NEXT_FREE_BLKADDR(sbi, curseg),
>> + node_blk_fast, blkaddr, node_blk);
>> + if (err)
>> + break;
>> +
>> + /* Since we call get_fsync_inode() to ensure there are no
>> + * duplicate inodes in the inode_list even if there are
>> + * duplicate blkaddr, we can continue running after fixing the
>> + * looped node chain.
>> + */
>> + is_detecting = false;
Hi Chunhai,
What about cleaning up above logic like we did in kernel side?
Thanks,
>> }
>>
>> + free(node_blk_fast);
>> free(node_blk);
>> return err;
>> }
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-17 2:20 ` Chao Yu
@ 2023-06-21 9:18 ` Chunhai Guo via Linux-f2fs-devel
0 siblings, 0 replies; 11+ messages in thread
From: Chunhai Guo via Linux-f2fs-devel @ 2023-06-21 9:18 UTC (permalink / raw)
To: Chao Yu, jaegeuk@kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
Hi Chao,
Sorry for replying late, I have send patch "fsck.f2fs: refactor looped
node chain detetected logic for cleanup" as you suggested.
Thanks.
On 2023/6/17 10:20, Chao Yu wrote:
> On 2023/6/14 17:27, Chunhai Guo wrote:
>> Hi Jaegeuk,
>>
>> Could you please help to confirm if this patch has been merged? I cannot
>> see the patch in the dev-test or dev branch.
>>
>> Thanks.
>>
>> On 2023/5/24 10:42, 郭纯海 wrote:
>>> find_fsync_inode() detect the looped node chain by comparing the loop
>>> counter with free blocks. While it may take tens of seconds to quit when
>>> the free blocks are large enough. We can use Floyd's cycle detection
>>> algorithm to make the detection more efficient, and fix the issue by
>>> filling a NULL address in the last node of the chain.
>>>
>>> Below is the log we encounter on a 256GB UFS storage and it takes about
>>> 25 seconds to detect looped node chain. After changing the algorithm, it
>>> takes about 20ms to finish the same job.
>>>
>>> [ 10.822904] fsck.f2fs: Info: version timestamp cur: 17, prev: 430
>>> [ 10.822949] fsck.f2fs: [update_superblock: 762] Info: Done to
>>> update superblock
>>> [ 10.822953] fsck.f2fs: Info: superblock features = 1499 :
>>> encrypt verity extra_attr project_quota quota_ino casefold
>>> [ 10.822956] fsck.f2fs: Info: superblock encrypt level = 0, salt =
>>> 00000000000000000000000000000000
>>> [ 10.822960] fsck.f2fs: Info: total FS sectors = 59249811 (231444
>>> MB)
>>> [ 35.852827] fsck.f2fs: detect looped node chain,
>>> blkaddr:1114802, next:1114803
>>> [ 35.852842] fsck.f2fs: [f2fs_do_mount:3846] record_fsync_data
>>> failed
>>> [ 35.856106] fsck.f2fs: fsck.f2fs terminated by exit(255)
>>>
>>> Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
>>> ---
>>> v7 -> v8 : Reformat the code to reduce indention.
>>> v6 -> v7 : Correct logic error to handle is_detecting return by
>>> find_node_blk_fast().
>>> v5 -> v6 : Simplify the code by removing unnecessary retry logic.
>>> v4 -> v5 : Use IS_INODE() to make the code more clear.
>>> v3 -> v4 : Set c.bug_on with ASSERT_MSG() when issue is detected and fix
>>> it only if c.fix_on is 1.
>>> v2 -> v3 : Write inode with write_inode() to avoid chksum being broken.
>>> v1 -> v2 : Fix looped node chain directly after it is detected.
>>> ---
>>> fsck/mount.c | 127 +++++++++++++++++++++++++++++++++++++++++++++------
>>> 1 file changed, 112 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/fsck/mount.c b/fsck/mount.c
>>> index 4c7488840c7c..9d6a222a055c 100644
>>> --- a/fsck/mount.c
>>> +++ b/fsck/mount.c
>>> @@ -3518,22 +3518,90 @@ static void destroy_fsync_dnodes(struct list_head *head)
>>> del_fsync_inode(entry);
>>> }
>>>
>>> +static int find_node_blk_fast(struct f2fs_sb_info *sbi, block_t *blkaddr_fast,
>>> + struct f2fs_node *node_blk_fast, bool *is_detecting)
>>> +{
>>> + int i, err;
>>> +
>>> + for (i = 0; i < 2; i++) {
>>> + if (!f2fs_is_valid_blkaddr(sbi, *blkaddr_fast, META_POR)) {
>>> + *is_detecting = false;
>>> + return 0;
>>> + }
>>> +
>>> + err = dev_read_block(node_blk_fast, *blkaddr_fast);
>>> + if (err)
>>> + return err;
>>> +
>>> + if (!is_recoverable_dnode(sbi, node_blk_fast)) {
>>> + *is_detecting = false;
>>> + return 0;
>>> + }
>>> +
>>> + *blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int loop_node_chain_fix(struct f2fs_sb_info *sbi,
>>> + block_t blkaddr_fast, struct f2fs_node *node_blk_fast,
>>> + block_t blkaddr, struct f2fs_node *node_blk)
>>> +{
>>> + block_t blkaddr_entry, blkaddr_tmp;
>>> + int err;
>>> +
>>> + /* find the entry point of the looped node chain */
>>> + while (blkaddr_fast != blkaddr) {
>>> + err = dev_read_block(node_blk_fast, blkaddr_fast);
>>> + if (err)
>>> + return err;
>>> + blkaddr_fast = next_blkaddr_of_node(node_blk_fast);
>>> +
>>> + err = dev_read_block(node_blk, blkaddr);
>>> + if (err)
>>> + return err;
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> + }
>>> + blkaddr_entry = blkaddr;
>>> +
>>> + /* find the last node of the chain */
>>> + do {
>>> + blkaddr_tmp = blkaddr;
>>> + err = dev_read_block(node_blk, blkaddr);
>>> + if (err)
>>> + return err;
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> + } while (blkaddr != blkaddr_entry);
>>> +
>>> + /* fix the blkaddr of last node with NULL_ADDR. */
>>> + node_blk->footer.next_blkaddr = NULL_ADDR;
>>> + if (IS_INODE(node_blk))
>>> + err = write_inode(node_blk, blkaddr_tmp);
>>> + else
>>> + err = dev_write_block(node_blk, blkaddr_tmp);
>>> + if (!err)
>>> + FIX_MSG("Fix looped node chain on blkaddr %u\n",
>>> + blkaddr_tmp);
>>> + return err;
>>> +}
>>> +
>>> static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>>> {
>>> struct curseg_info *curseg;
>>> - struct f2fs_node *node_blk;
>>> - block_t blkaddr;
>>> - unsigned int loop_cnt = 0;
>>> - unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
>>> - sbi->total_valid_block_count;
>>> + struct f2fs_node *node_blk, *node_blk_fast;
>>> + block_t blkaddr, blkaddr_fast;
>>> + bool is_detecting = true;
>>> int err = 0;
>>>
>>> + node_blk = calloc(F2FS_BLKSIZE, 1);
>>> + node_blk_fast = calloc(F2FS_BLKSIZE, 1);
>>> + ASSERT(node_blk && node_blk_fast);
>>> +
>>> /* get node pages in the current segment */
>>> curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
>>> blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
>>> -
>>> - node_blk = calloc(F2FS_BLKSIZE, 1);
>>> - ASSERT(node_blk);
>>> + blkaddr_fast = blkaddr;
>>>
>>> while (1) {
>>> struct fsync_inode_entry *entry;
>>> @@ -3564,19 +3632,48 @@ static int find_fsync_inode(struct f2fs_sb_info *sbi, struct list_head *head)
>>> if (IS_INODE(node_blk) && is_dent_dnode(node_blk))
>>> entry->last_dentry = blkaddr;
>>> next:
>>> - /* sanity check in order to detect looped node chain */
>>> - if (++loop_cnt >= free_blocks ||
>>> - blkaddr == next_blkaddr_of_node(node_blk)) {
>>> - MSG(0, "\tdetect looped node chain, blkaddr:%u, next:%u\n",
>>> - blkaddr,
>>> - next_blkaddr_of_node(node_blk));
>>> + blkaddr = next_blkaddr_of_node(node_blk);
>>> +
>>> + /* Below we will detect looped node chain with Floyd's cycle
>>> + * detection algorithm.
>>> + */
>>> + if (!is_detecting)
>>> + continue;
>>> +
>>> + err = find_node_blk_fast(sbi, &blkaddr_fast,
>>> + node_blk_fast, &is_detecting);
>>> + if (err)
>>> + break;
>>> +
>>> + if (!is_detecting)
>>> + continue;
>>> +
>>> + if (blkaddr_fast != blkaddr)
>>> + continue;
>>> +
>>> + ASSERT_MSG("\tdetect looped node chain, blkaddr:%u\n",
>>> + blkaddr);
>>> +
>>> + if (!c.fix_on) {
>>> err = -1;
>>> break;
>>> }
>>>
>>> - blkaddr = next_blkaddr_of_node(node_blk);
>>> + err = loop_node_chain_fix(sbi,
>>> + NEXT_FREE_BLKADDR(sbi, curseg),
>>> + node_blk_fast, blkaddr, node_blk);
>>> + if (err)
>>> + break;
>>> +
>>> + /* Since we call get_fsync_inode() to ensure there are no
>>> + * duplicate inodes in the inode_list even if there are
>>> + * duplicate blkaddr, we can continue running after fixing the
>>> + * looped node chain.
>>> + */
>>> + is_detecting = false;
>
> Hi Chunhai,
>
> What about cleaning up above logic like we did in kernel side?
>
> Thanks,
>
>>> }
>>>
>>> + free(node_blk_fast);
>>> free(node_blk);
>>> return err;
>>> }
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-05-24 2:42 [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently Chunhai Guo via Linux-f2fs-devel
` (2 preceding siblings ...)
2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
@ 2023-06-23 1:52 ` Chao Yu
2023-06-23 18:50 ` Jaegeuk Kim
3 siblings, 1 reply; 11+ messages in thread
From: Chao Yu @ 2023-06-23 1:52 UTC (permalink / raw)
To: Chunhai Guo, jaegeuk; +Cc: linux-f2fs-devel, frank.li
On 2023/5/24 10:42, Chunhai Guo wrote:
> + if (!c.fix_on) {
> err = -1;
> break;
> }
One more comment.
I guess we'd better skip find_fsync_inode() and continue fsck rather
than exiting directly if c.fix_on is not set, thoughts?
Thanks,
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-23 1:52 ` Chao Yu
@ 2023-06-23 18:50 ` Jaegeuk Kim
2023-06-25 7:14 ` Chunhai Guo via Linux-f2fs-devel
0 siblings, 1 reply; 11+ messages in thread
From: Jaegeuk Kim @ 2023-06-23 18:50 UTC (permalink / raw)
To: Chao Yu; +Cc: Chunhai Guo, linux-f2fs-devel, frank.li
On 06/23, Chao Yu wrote:
> On 2023/5/24 10:42, Chunhai Guo wrote:
> > + if (!c.fix_on) {
> > err = -1;
> > break;
> > }
>
> One more comment.
>
> I guess we'd better skip find_fsync_inode() and continue fsck rather
> than exiting directly if c.fix_on is not set, thoughts?
Chunhai, could you please post a final version of this patch?
>
> Thanks,
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently
2023-06-23 18:50 ` Jaegeuk Kim
@ 2023-06-25 7:14 ` Chunhai Guo via Linux-f2fs-devel
0 siblings, 0 replies; 11+ messages in thread
From: Chunhai Guo via Linux-f2fs-devel @ 2023-06-25 7:14 UTC (permalink / raw)
To: Jaegeuk Kim, Chao Yu
Cc: linux-f2fs-devel@lists.sourceforge.net, 李扬韬
Hi Jaegeuk & Chao,
OK. I will post a final version including the modification.
Thanks.
On 2023/6/24 2:50, Jaegeuk Kim wrote:
> On 06/23, Chao Yu wrote:
>> On 2023/5/24 10:42, Chunhai Guo wrote:
>>> + if (!c.fix_on) {
>>> err = -1;
>>> break;
>>> }
>>
>> One more comment.
>>
>> I guess we'd better skip find_fsync_inode() and continue fsck rather
>> than exiting directly if c.fix_on is not set, thoughts?
>
> Chunhai, could you please post a final version of this patch?
>
>>
>> Thanks,
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-06-25 7:14 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-24 2:42 [f2fs-dev] [PATCH v8] fsck.f2fs: Detect and fix looped node chain efficiently Chunhai Guo via Linux-f2fs-devel
2023-05-25 1:23 ` Chao Yu
2023-05-25 22:36 ` Jaegeuk Kim
2023-06-14 9:27 ` Chunhai Guo via Linux-f2fs-devel
2023-06-14 16:15 ` Jaegeuk Kim
2023-06-15 2:12 ` Chunhai Guo via Linux-f2fs-devel
2023-06-17 2:20 ` Chao Yu
2023-06-21 9:18 ` Chunhai Guo via Linux-f2fs-devel
2023-06-23 1:52 ` Chao Yu
2023-06-23 18:50 ` Jaegeuk Kim
2023-06-25 7:14 ` Chunhai Guo via Linux-f2fs-devel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).