[PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency
@ 2024-03-20  3:54 Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 1/7] btrfs: scrub: fix incorrectly reported logical/physical address Qu Wenruo
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v2:
- Remove the patch that reworks the output message format
  That change still needs quite some debates, as it's a big trade-off
  between easy to read for a single line and easy to batch floods of
  multiple lines.
  (Considering my personal experience, I would even recommend to go CSV
   like output, as all the real-world reports are floods of such error
   messages, never a single one)

- Split the ratelimit work into its own patch
  The last one

- Address the feedbacks from the mailing list

[BACKGROUND]
During some support sessions, I found older kernels are always report
very unuseful scrub error messages like:

 BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2823, gen 0
 BTRFS error (device dm-0): unable to fixup (regular) error at logical 1563504640 on dev /dev/mapper/sys-rootlv
 BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2824, gen 0
 BTRFS error (device dm-0): unable to fixup (regular) error at logical 1579016192 on dev /dev/mapper/sys-rootlv

There are two problems:
- No proper details about the corruption
  No metadata or data indication, nor the filename or the tree id.
  Even the involved kernel (and newer kernels after the scrub rework)
  has the ability to do backref walk and find out the file path or the
  tree backref.

  My guess is, for data sometimes the corrupted sector is no longer
  referred by any data extent.

- Too noisy and useless error message from
  btrfs_dev_stat_inc_and_print()
  I'd argue we should not output any error message just for
  btrfs_dev_stat_inc_and_print().

After the series, the error message would look like this:

 BTRFS info (device dm-3): scrub: started on devid 1
 BTRFS warning (device dm-3): checksum error at logical 13647872 on dev /dev/mapper/test-scratch1, physical 13647872, root 5, inode 257, offset 16384, path: file1
 BTRFS info (device dm-3): scrub: finished on devid 1 with status: 0

This involves the following enhancement:

- Single line
  And we ensure we output at least one line for the error we hit.
  No more unrelated btrfs_dev_stat_inc_and_print() output.

- Proper fall backs
  We have the following different fall back options
  * Repaired
    Just a line of something repaired for logical/physical address.

  * Detailed data info
    Including the following elements (if possible), and if higher
    priority ones can not be fetched, it would be skipped and try
    lower priority items:
    + file path (can have multiple ones)
    + root/inode number and offset
    + logical/physical address (always output)

  * Detaile metadata info
    The priority list is:
    + root ref/level
    + logical/physical address (always output)

  For the worst case of data corruption, we would still have some like:

   BTRFS warning (device dm-2): checksum error at data, logical 13647872 on dev /dev/mapper/test-scratch1 physical 13647872

  And similar ones for metadata:

   BTRFS warning (device dm-2): checksum error at metadata, logical 13647872 on dev /dev/mapper/test-scratch1 physical 13647872

[PATCHSET STRUCTURE]
The first patch is fixing a regression in the error message, which leads
to bad logical/physical bytenr.

The second one would reduce the log level for
btrfs_dev_stat_inc_and_print().

Patch 3~4 are cleanups to remove unnecessary parameters.

Patch 5 remove the unnecessary output for nlink and reduce one inode
item lookup.

Patch 6 ensures we get at least one error message output before
ratelimit, and if we failed to output anything, output a fallback error
message.

The final patch would rework how we do rate limits, and use the same way
(btrfs_*_rl() helpers) to do rate limits.

Qu Wenruo (7):
  btrfs: scrub: fix incorrectly reported logical/physical address
  btrfs: reduce the log level for btrfs_dev_stat_inc_and_print()
  btrfs: scrub: remove unused is_super parameter from
    scrub_print_common_warning()
  btrfs: scrub: remove unnecessary dev/physical lookup for 
    scrub_stripe_report_errors()
  btrfs: scrub: simplify the inode iteration output
  btrfs: scrub: ensure we output at least one error message for
    unrepaired corruption
  btrfs: scrub: use generic ratelimit helpers to output error messages

 fs/btrfs/scrub.c   | 142 ++++++++++++---------------------------------
 fs/btrfs/volumes.c |   2 +-
 2 files changed, 39 insertions(+), 105 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/7] btrfs: scrub: fix incorrectly reported logical/physical address
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print() Qu Wenruo
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable, Anand Jain, Filipe Manana

[BUG]
Scrub is not reporting the correct logical/physical address, it can be
verified by the following script:

 # mkfs.btrfs -f $dev1
 # mount $dev1 $mnt
 # xfs_io -f -c "pwrite -S 0xaa 0 128k" $mnt/file1
 # umount $mnt
 # xfs_io -f -c "pwrite -S 0xff 13647872 4k" $dev1
 # mount $dev1 $mnt
 # btrfs scrub start -fB $mnt
 # umount $mnt

Note above 13647872 is the physical address for logical 13631488 + 4K.

Scrub would report the following error:

 BTRFS error (device dm-2): unable to fixup (regular) error at logical 13631488 on dev /dev/mapper/test-scratch1 physical 13631488
 BTRFS warning (device dm-2): checksum error at logical 13631488 on dev /dev/mapper/test-scratch1, physical 13631488, root 5, inode 257, offset 0, length 4096, links 1 (path: file1)

On the other hand, "btrfs check --check-data-csum" is reporting the
correct logical/physical address:

 Checking filesystem on /dev/test/scratch1
 UUID: db2eb621-b09d-4f24-8199-da17dc7b3201
 [5/7] checking csums against data
 mirror 1 bytenr 13647872 csum 0x13fec125 expected csum 0x656bd64e
 ERROR: errors found in csum tree

[CAUSE]
In the function scrub_stripe_report_errors(), we always use the
stripe->logical and its physical address to print the error message, not
taking the sector number into consideration at all.

[FIX]
Fix the error reporting function by calculating logical/physical with
the sector number.

Now the scrub report is correct:

 BTRFS error (device dm-2): unable to fixup (regular) error at logical 13647872 on dev /dev/mapper/test-scratch1 physical 13647872
 BTRFS warning (device dm-2): checksum error at logical 13647872 on dev /dev/mapper/test-scratch1, physical 13647872, root 5, inode 257, offset 16384, length 4096, links 1 (path: file1)

Fixes: 0096580713ff ("btrfs: scrub: introduce error reporting functionality for scrub_stripe")
CC: stable@vger.kernel.org #6.4+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index fa25004ab04e..72aa74310612 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -870,7 +870,7 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 				      DEFAULT_RATELIMIT_BURST);
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
 	struct btrfs_device *dev = NULL;
-	u64 physical = 0;
+	u64 stripe_physical = stripe->physical;
 	int nr_data_sectors = 0;
 	int nr_meta_sectors = 0;
 	int nr_nodatacsum_sectors = 0;
@@ -903,13 +903,17 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 		 */
 		if (ret < 0)
 			goto skip;
-		physical = bioc->stripes[stripe_index].physical;
+		stripe_physical = bioc->stripes[stripe_index].physical;
 		dev = bioc->stripes[stripe_index].dev;
 		btrfs_put_bioc(bioc);
 	}
 
 skip:
 	for_each_set_bit(sector_nr, &stripe->extent_sector_bitmap, stripe->nr_sectors) {
+		const u64 logical = stripe->logical +
+				    (sector_nr << fs_info->sectorsize_bits);
+		const u64 physical = stripe_physical +
+				     (sector_nr << fs_info->sectorsize_bits);
 		bool repaired = false;
 
 		if (stripe->sectors[sector_nr].is_metadata) {
@@ -938,12 +942,12 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 			if (dev) {
 				btrfs_err_rl_in_rcu(fs_info,
 			"fixed up error at logical %llu on dev %s physical %llu",
-					    stripe->logical, btrfs_dev_name(dev),
+					    logical, btrfs_dev_name(dev),
 					    physical);
 			} else {
 				btrfs_err_rl_in_rcu(fs_info,
 			"fixed up error at logical %llu on mirror %u",
-					    stripe->logical, stripe->mirror_num);
+					    logical, stripe->mirror_num);
 			}
 			continue;
 		}
@@ -952,26 +956,26 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 		if (dev) {
 			btrfs_err_rl_in_rcu(fs_info,
 	"unable to fixup (regular) error at logical %llu on dev %s physical %llu",
-					    stripe->logical, btrfs_dev_name(dev),
+					    logical, btrfs_dev_name(dev),
 					    physical);
 		} else {
 			btrfs_err_rl_in_rcu(fs_info,
 	"unable to fixup (regular) error at logical %llu on mirror %u",
-					    stripe->logical, stripe->mirror_num);
+					    logical, stripe->mirror_num);
 		}
 
 		if (test_bit(sector_nr, &stripe->io_error_bitmap))
 			if (__ratelimit(&rs) && dev)
 				scrub_print_common_warning("i/o error", dev, false,
-						     stripe->logical, physical);
+						     logical, physical);
 		if (test_bit(sector_nr, &stripe->csum_error_bitmap))
 			if (__ratelimit(&rs) && dev)
 				scrub_print_common_warning("checksum error", dev, false,
-						     stripe->logical, physical);
+						     logical, physical);
 		if (test_bit(sector_nr, &stripe->meta_error_bitmap))
 			if (__ratelimit(&rs) && dev)
 				scrub_print_common_warning("header error", dev, false,
-						     stripe->logical, physical);
+						     logical, physical);
 	}
 
 	spin_lock(&sctx->stat_lock);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print()
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 1/7] btrfs: scrub: fix incorrectly reported logical/physical address Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-04-04 20:26   ` David Sterba
  2024-03-20  3:54 ` [PATCH v2 3/7] btrfs: scrub: remove unused is_super parameter from scrub_print_common_warning() Qu Wenruo
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable, Filipe Manana

Currently when we increase the device statistics, it would always lead
to an error message in the kernel log.

However this output is mostly duplicated with the existing ones:

- For scrub operations
  We always have the following messages:
  * "fixed up error at logical %llu"
  * "unable to fixup (regular) error at logical %llu"

  So no matter if the corruption is repaired or not, it scrub would
  output an error message to indicate the problem.

- For non-scrub read operations
  We also have the following messages:
  * "csum failed root %lld inode %llu off %llu" for data csum mismatch
  * "bad (tree block start|fsid|tree block level)" for metadata
  * "read error corrected: ino %llu off %llu" for repaired data/metadata

So the error message from btrfs_dev_stat_inc_and_print() is duplicated.

The real usage for the btrfs device statistics is for some user space
daemon to check if there is any new errors, acting like some checks on
SMART, thus we don't really need/want those messages in dmesg.

This patch would reduce the log level to debug (disabled by default) for
btrfs_dev_stat_inc_and_print().
For users really want to utilize btrfs devices statistics, they should
go check "btrfs device stats" periodically, and we should focus the
kernel error messages to more important things.

CC: stable@vger.kernel.org
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/volumes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e49935a54da0..126145950ed3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7828,7 +7828,7 @@ void btrfs_dev_stat_inc_and_print(struct btrfs_device *dev, int index)

 	if (!dev->dev_stats_valid)
 		return;
-	btrfs_err_rl_in_rcu(dev->fs_info,
+	btrfs_debug_rl_in_rcu(dev->fs_info,
 		"bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
 			   btrfs_dev_name(dev),
 			   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/7] btrfs: scrub: remove unused is_super parameter from scrub_print_common_warning()
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 1/7] btrfs: scrub: fix incorrectly reported logical/physical address Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print() Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for scrub_stripe_report_errors() Qu Wenruo
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

Since commit 2a2dc22f7e9d ("btrfs: scrub: use dedicated super block
verification function to scrub one super block"), the super block
scrubbing is handled in a dedicated helper, thus
scrub_print_common_warning() no longer needs to print warning for super
block errors.

Just remove the parameter.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 72aa74310612..a5a9fef2bdb2 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -476,7 +476,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 }
 
 static void scrub_print_common_warning(const char *errstr, struct btrfs_device *dev,
-				       bool is_super, u64 logical, u64 physical)
+				       u64 logical, u64 physical)
 {
 	struct btrfs_fs_info *fs_info = dev->fs_info;
 	struct btrfs_path *path;
@@ -488,12 +488,6 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 	u32 item_size;
 	int ret;
 
-	/* Super block error, no need to search extent tree. */
-	if (is_super) {
-		btrfs_warn_in_rcu(fs_info, "%s on device %s, physical %llu",
-				  errstr, btrfs_dev_name(dev), physical);
-		return;
-	}
 	path = btrfs_alloc_path();
 	if (!path)
 		return;
@@ -966,15 +960,15 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 
 		if (test_bit(sector_nr, &stripe->io_error_bitmap))
 			if (__ratelimit(&rs) && dev)
-				scrub_print_common_warning("i/o error", dev, false,
+				scrub_print_common_warning("i/o error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->csum_error_bitmap))
 			if (__ratelimit(&rs) && dev)
-				scrub_print_common_warning("checksum error", dev, false,
+				scrub_print_common_warning("checksum error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->meta_error_bitmap))
 			if (__ratelimit(&rs) && dev)
-				scrub_print_common_warning("header error", dev, false,
+				scrub_print_common_warning("header error", dev,
 						     logical, physical);
 	}
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for  scrub_stripe_report_errors()
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
                   ` (2 preceding siblings ...)
  2024-03-20  3:54 ` [PATCH v2 3/7] btrfs: scrub: remove unused is_super parameter from scrub_print_common_warning() Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20 13:09   ` Filipe Manana
  2024-03-20  3:54 ` [PATCH v2 5/7] btrfs: scrub: simplify the inode iteration output Qu Wenruo
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs

The @stripe passed into scrub_stripe_report_errors() either has
stripe->dev and stripe->physical properly populated (regular data
stripes) or stripe->flags would have SCRUB_STRIPE_FLAG_NO_REPORT
(RAID56 P/Q stripes).

Thus there is no need to go with btrfs_map_block() to get the
dev/physical.

Just add an extra ASSERT() to make sure we get stripe->dev populated
correctly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 54 +++++++-----------------------------------------
 1 file changed, 7 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index a5a9fef2bdb2..ff952dd2b510 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -863,7 +863,7 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
 				      DEFAULT_RATELIMIT_BURST);
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
-	struct btrfs_device *dev = NULL;
+	struct btrfs_device *dev = stripe->dev;
 	u64 stripe_physical = stripe->physical;
 	int nr_data_sectors = 0;
 	int nr_meta_sectors = 0;
@@ -874,35 +874,7 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 	if (test_bit(SCRUB_STRIPE_FLAG_NO_REPORT, &stripe->state))
 		return;
 
-	/*
-	 * Init needed infos for error reporting.
-	 *
-	 * Although our scrub_stripe infrastructure is mostly based on btrfs_submit_bio()
-	 * thus no need for dev/physical, error reporting still needs dev and physical.
-	 */
-	if (!bitmap_empty(&stripe->init_error_bitmap, stripe->nr_sectors)) {
-		u64 mapped_len = fs_info->sectorsize;
-		struct btrfs_io_context *bioc = NULL;
-		int stripe_index = stripe->mirror_num - 1;
-		int ret;
-
-		/* For scrub, our mirror_num should always start at 1. */
-		ASSERT(stripe->mirror_num >= 1);
-		ret = btrfs_map_block(fs_info, BTRFS_MAP_GET_READ_MIRRORS,
-				      stripe->logical, &mapped_len, &bioc,
-				      NULL, NULL);
-		/*
-		 * If we failed, dev will be NULL, and later detailed reports
-		 * will just be skipped.
-		 */
-		if (ret < 0)
-			goto skip;
-		stripe_physical = bioc->stripes[stripe_index].physical;
-		dev = bioc->stripes[stripe_index].dev;
-		btrfs_put_bioc(bioc);
-	}
-
-skip:
+	ASSERT(dev);
 	for_each_set_bit(sector_nr, &stripe->extent_sector_bitmap, stripe->nr_sectors) {
 		const u64 logical = stripe->logical +
 				    (sector_nr << fs_info->sectorsize_bits);
@@ -933,41 +905,29 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 		 * output the message of repaired message.
 		 */
 		if (repaired) {
-			if (dev) {
-				btrfs_err_rl_in_rcu(fs_info,
+			btrfs_err_rl_in_rcu(fs_info,
 			"fixed up error at logical %llu on dev %s physical %llu",
 					    logical, btrfs_dev_name(dev),
 					    physical);
-			} else {
-				btrfs_err_rl_in_rcu(fs_info,
-			"fixed up error at logical %llu on mirror %u",
-					    logical, stripe->mirror_num);
-			}
 			continue;
 		}
 
 		/* The remaining are all for unrepaired. */
-		if (dev) {
-			btrfs_err_rl_in_rcu(fs_info,
+		btrfs_err_rl_in_rcu(fs_info,
 	"unable to fixup (regular) error at logical %llu on dev %s physical %llu",
 					    logical, btrfs_dev_name(dev),
 					    physical);
-		} else {
-			btrfs_err_rl_in_rcu(fs_info,
-	"unable to fixup (regular) error at logical %llu on mirror %u",
-					    logical, stripe->mirror_num);
-		}
 
 		if (test_bit(sector_nr, &stripe->io_error_bitmap))
-			if (__ratelimit(&rs) && dev)
+			if (__ratelimit(&rs))
 				scrub_print_common_warning("i/o error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->csum_error_bitmap))
-			if (__ratelimit(&rs) && dev)
+			if (__ratelimit(&rs))
 				scrub_print_common_warning("checksum error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->meta_error_bitmap))
-			if (__ratelimit(&rs) && dev)
+			if (__ratelimit(&rs))
 				scrub_print_common_warning("header error", dev,
 						     logical, physical);
 	}
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 5/7] btrfs: scrub: simplify the inode iteration output
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
                   ` (3 preceding siblings ...)
  2024-03-20  3:54 ` [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for scrub_stripe_report_errors() Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption Qu Wenruo
  2024-03-20  3:54 ` [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages Qu Wenruo
  6 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Filipe Manana

The following two output are not really needed:

- nlinks
  Normally file inodes should have nlinks as 1, for those inodes have
  multiple hard links, the inode/root number is still enough to pin it
  down to certain inode.

- size
  The size is always fixed to sector size.

By removing the nlinks output, we can reduce one inode item lookup.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ff952dd2b510..85f321e3e37c 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -388,17 +388,13 @@ static noinline_for_stack struct scrub_ctx *scrub_setup_ctx(
 static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 				     u64 root, void *warn_ctx)
 {
-	u32 nlink;
 	int ret;
 	int i;
 	unsigned nofs_flag;
-	struct extent_buffer *eb;
-	struct btrfs_inode_item *inode_item;
 	struct scrub_warning *swarn = warn_ctx;
 	struct btrfs_fs_info *fs_info = swarn->dev->fs_info;
 	struct inode_fs_paths *ipath = NULL;
 	struct btrfs_root *local_root;
-	struct btrfs_key key;
 
 	local_root = btrfs_get_fs_root(fs_info, root, true);
 	if (IS_ERR(local_root)) {
@@ -406,26 +402,6 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 		goto err;
 	}
 
-	/*
-	 * this makes the path point to (inum INODE_ITEM ioff)
-	 */
-	key.objectid = inum;
-	key.type = BTRFS_INODE_ITEM_KEY;
-	key.offset = 0;
-
-	ret = btrfs_search_slot(NULL, local_root, &key, swarn->path, 0, 0);
-	if (ret) {
-		btrfs_put_root(local_root);
-		btrfs_release_path(swarn->path);
-		goto err;
-	}
-
-	eb = swarn->path->nodes[0];
-	inode_item = btrfs_item_ptr(eb, swarn->path->slots[0],
-					struct btrfs_inode_item);
-	nlink = btrfs_inode_nlink(eb, inode_item);
-	btrfs_release_path(swarn->path);
-
 	/*
 	 * init_path might indirectly call vmalloc, or use GFP_KERNEL. Scrub
 	 * uses GFP_NOFS in this context, so we keep it consistent but it does
@@ -451,12 +427,11 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 	 */
 	for (i = 0; i < ipath->fspath->elem_cnt; ++i)
 		btrfs_warn_in_rcu(fs_info,
-"%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, length %u, links %u (path: %s)",
+"%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, path: %s",
 				  swarn->errstr, swarn->logical,
 				  btrfs_dev_name(swarn->dev),
 				  swarn->physical,
 				  root, inum, offset,
-				  fs_info->sectorsize, nlink,
 				  (char *)(unsigned long)ipath->fspath->val[i]);
 
 	btrfs_put_root(local_root);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
                   ` (4 preceding siblings ...)
  2024-03-20  3:54 ` [PATCH v2 5/7] btrfs: scrub: simplify the inode iteration output Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20 13:30   ` Filipe Manana
  2024-03-20  3:54 ` [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages Qu Wenruo
  6 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs

For btrfs scrub error messages, I have hit a lot of support cases on
older kernels where no filename is outputted at all, with only error
messages like:

 BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2823, gen 0
 BTRFS error (device dm-0): unable to fixup (regular) error at logical 1563504640 on dev /dev/mapper/sys-rootlv
 BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2824, gen 0
 BTRFS error (device dm-0): unable to fixup (regular) error at logical 1579016192 on dev /dev/mapper/sys-rootlv
 BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2825, gen 0

The "unable to fixup (regular) error" line shows we hit an unrepairable
error, then normally we would do data/metadata backref walk to grab the
correct info.

But we can hit cases like the inode is already orphan (unlinked from any
parent inode), or even the subvolume is orphan (unlinked and waiting to
be deleted).

In that case we're not sure what's the proper way to continue (Is it
some critical data/metadata? Would it prevent the system from booting?)

To improve the situation, this patch would:

- Ensure we at least output one message for each unrepairable error
  If by somehow we didn't output any error message for the error, we
  always fallback to the basic logical/physical error output, with its
  type (data or metadata).

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 85f321e3e37c..0d2b042d75c2 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -226,6 +226,7 @@ struct scrub_warning {
 	u64			physical;
 	u64			logical;
 	struct btrfs_device	*dev;
+	bool			message_printed;
 };
 
 static void release_scrub_stripe(struct scrub_stripe *stripe)
@@ -425,7 +426,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 	 * we deliberately ignore the bit ipath might have been too small to
 	 * hold all of the paths here
 	 */
-	for (i = 0; i < ipath->fspath->elem_cnt; ++i)
+	for (i = 0; i < ipath->fspath->elem_cnt; ++i) {
 		btrfs_warn_in_rcu(fs_info,
 "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, path: %s",
 				  swarn->errstr, swarn->logical,
@@ -433,6 +434,8 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 				  swarn->physical,
 				  root, inum, offset,
 				  (char *)(unsigned long)ipath->fspath->val[i]);
+		swarn->message_printed = true;
+	}
 
 	btrfs_put_root(local_root);
 	free_ipath(ipath);
@@ -445,7 +448,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 			  btrfs_dev_name(swarn->dev),
 			  swarn->physical,
 			  root, inum, offset, ret);
-
+	swarn->message_printed = true;
 	free_ipath(ipath);
 	return 0;
 }
@@ -471,6 +474,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 	swarn.logical = logical;
 	swarn.errstr = errstr;
 	swarn.dev = NULL;
+	swarn.message_printed = false;
 
 	ret = extent_from_logical(fs_info, swarn.logical, path, &found_key,
 				  &flags);
@@ -492,12 +496,8 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 			ret = tree_backref_for_extent(&ptr, eb, &found_key, ei,
 						      item_size, &ref_root,
 						      &ref_level);
-			if (ret < 0) {
-				btrfs_warn(fs_info,
-				"failed to resolve tree backref for logical %llu: %d",
-						  swarn.logical, ret);
+			if (ret < 0)
 				break;
-			}
 			if (ret > 0)
 				break;
 			btrfs_warn_in_rcu(fs_info,
@@ -505,7 +505,13 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 				errstr, swarn.logical, btrfs_dev_name(dev),
 				swarn.physical, (ref_level ? "node" : "leaf"),
 				ref_level, ref_root);
+			swarn.message_printed = true;
 		}
+		if (!swarn.message_printed)
+			btrfs_warn_in_rcu(fs_info,
+			"%s at metadata, logical %llu on dev %s physical %llu",
+					  errstr, swarn.logical,
+					  btrfs_dev_name(dev), swarn.physical);
 		btrfs_release_path(path);
 	} else {
 		struct btrfs_backref_walk_ctx ctx = { 0 };
@@ -520,6 +526,11 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 		swarn.dev = dev;
 
 		iterate_extent_inodes(&ctx, true, scrub_print_warning_inode, &swarn);
+		if (!swarn.message_printed)
+			btrfs_warn_in_rcu(fs_info,
+	"%s at data, filename unresolved, logical %llu on dev %s physical %llu",
+					  errstr, swarn.logical,
+					  btrfs_dev_name(dev), swarn.physical);
 	}
 
 out:
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages
  2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
                   ` (5 preceding siblings ...)
  2024-03-20  3:54 ` [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption Qu Wenruo
@ 2024-03-20  3:54 ` Qu Wenruo
  2024-03-20 13:33   ` Filipe Manana
  6 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2024-03-20  3:54 UTC (permalink / raw)
  To: linux-btrfs

Currently scrub goes different ways to rate limits its error messages:

- regular btrfs_err_rl_in_rcu() helper for repaired sectors and
  the initial message for unrepaired sectors

- Manually rate limits scrub_print_common_warning()

I'd say the different rate limits could lead to cases where we only got
the "unable to fixup (regular) error" messages but the detailed report
about that corruption is ratelimited.

To make the whole rate limit works more consistently, change the rate
limit by:

- Always using btrfs_*_rl() helpers

- Remove the initial "unable to fixup (regular) error" message
  Since we're ensured to have at least one error message for each
  unrepaired sector (before rate limit), there is no need for
  a duplicated line.

  And if we hit rate limits, we will rate limit all the error messages
  together, not treating different error messages differently.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 0d2b042d75c2..f942c9e3f121 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -427,7 +427,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 	 * hold all of the paths here
 	 */
 	for (i = 0; i < ipath->fspath->elem_cnt; ++i) {
-		btrfs_warn_in_rcu(fs_info,
+		btrfs_warn_rl_in_rcu(fs_info,
 "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, path: %s",
 				  swarn->errstr, swarn->logical,
 				  btrfs_dev_name(swarn->dev),
@@ -442,7 +442,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
 	return 0;
 
 err:
-	btrfs_warn_in_rcu(fs_info,
+	btrfs_warn_rl_in_rcu(fs_info,
 			  "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu: path resolving failed with ret=%d",
 			  swarn->errstr, swarn->logical,
 			  btrfs_dev_name(swarn->dev),
@@ -500,7 +500,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 				break;
 			if (ret > 0)
 				break;
-			btrfs_warn_in_rcu(fs_info,
+			btrfs_warn_rl_in_rcu(fs_info,
 "%s at logical %llu on dev %s, physical %llu: metadata %s (level %d) in tree %llu",
 				errstr, swarn.logical, btrfs_dev_name(dev),
 				swarn.physical, (ref_level ? "node" : "leaf"),
@@ -508,7 +508,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 			swarn.message_printed = true;
 		}
 		if (!swarn.message_printed)
-			btrfs_warn_in_rcu(fs_info,
+			btrfs_warn_rl_in_rcu(fs_info,
 			"%s at metadata, logical %llu on dev %s physical %llu",
 					  errstr, swarn.logical,
 					  btrfs_dev_name(dev), swarn.physical);
@@ -527,7 +527,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
 
 		iterate_extent_inodes(&ctx, true, scrub_print_warning_inode, &swarn);
 		if (!swarn.message_printed)
-			btrfs_warn_in_rcu(fs_info,
+			btrfs_warn_rl_in_rcu(fs_info,
 	"%s at data, filename unresolved, logical %llu on dev %s physical %llu",
 					  errstr, swarn.logical,
 					  btrfs_dev_name(dev), swarn.physical);
@@ -846,8 +846,6 @@ static void scrub_stripe_submit_repair_read(struct scrub_stripe *stripe,
 static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 				       struct scrub_stripe *stripe)
 {
-	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
-				      DEFAULT_RATELIMIT_BURST);
 	struct btrfs_fs_info *fs_info = sctx->fs_info;
 	struct btrfs_device *dev = stripe->dev;
 	u64 stripe_physical = stripe->physical;
@@ -899,22 +897,14 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
 		}
 
 		/* The remaining are all for unrepaired. */
-		btrfs_err_rl_in_rcu(fs_info,
-	"unable to fixup (regular) error at logical %llu on dev %s physical %llu",
-					    logical, btrfs_dev_name(dev),
-					    physical);
-
 		if (test_bit(sector_nr, &stripe->io_error_bitmap))
-			if (__ratelimit(&rs))
-				scrub_print_common_warning("i/o error", dev,
+			scrub_print_common_warning("i/o error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->csum_error_bitmap))
-			if (__ratelimit(&rs))
-				scrub_print_common_warning("checksum error", dev,
+			scrub_print_common_warning("checksum error", dev,
 						     logical, physical);
 		if (test_bit(sector_nr, &stripe->meta_error_bitmap))
-			if (__ratelimit(&rs))
-				scrub_print_common_warning("header error", dev,
+			scrub_print_common_warning("header error", dev,
 						     logical, physical);
 	}
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for scrub_stripe_report_errors()
  2024-03-20  3:54 ` [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for scrub_stripe_report_errors() Qu Wenruo
@ 2024-03-20 13:09   ` Filipe Manana
  0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2024-03-20 13:09 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Mar 20, 2024 at 3:55 AM Qu Wenruo <wqu@suse.com> wrote:
>
> The @stripe passed into scrub_stripe_report_errors() either has
> stripe->dev and stripe->physical properly populated (regular data
> stripes) or stripe->flags would have SCRUB_STRIPE_FLAG_NO_REPORT
> (RAID56 P/Q stripes).
>
> Thus there is no need to go with btrfs_map_block() to get the
> dev/physical.
>
> Just add an extra ASSERT() to make sure we get stripe->dev populated
> correctly.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good now, thanks.

> ---
>  fs/btrfs/scrub.c | 54 +++++++-----------------------------------------
>  1 file changed, 7 insertions(+), 47 deletions(-)
>
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index a5a9fef2bdb2..ff952dd2b510 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -863,7 +863,7 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
>         static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
>                                       DEFAULT_RATELIMIT_BURST);
>         struct btrfs_fs_info *fs_info = sctx->fs_info;
> -       struct btrfs_device *dev = NULL;
> +       struct btrfs_device *dev = stripe->dev;
>         u64 stripe_physical = stripe->physical;
>         int nr_data_sectors = 0;
>         int nr_meta_sectors = 0;
> @@ -874,35 +874,7 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
>         if (test_bit(SCRUB_STRIPE_FLAG_NO_REPORT, &stripe->state))
>                 return;
>
> -       /*
> -        * Init needed infos for error reporting.
> -        *
> -        * Although our scrub_stripe infrastructure is mostly based on btrfs_submit_bio()
> -        * thus no need for dev/physical, error reporting still needs dev and physical.
> -        */
> -       if (!bitmap_empty(&stripe->init_error_bitmap, stripe->nr_sectors)) {
> -               u64 mapped_len = fs_info->sectorsize;
> -               struct btrfs_io_context *bioc = NULL;
> -               int stripe_index = stripe->mirror_num - 1;
> -               int ret;
> -
> -               /* For scrub, our mirror_num should always start at 1. */
> -               ASSERT(stripe->mirror_num >= 1);
> -               ret = btrfs_map_block(fs_info, BTRFS_MAP_GET_READ_MIRRORS,
> -                                     stripe->logical, &mapped_len, &bioc,
> -                                     NULL, NULL);
> -               /*
> -                * If we failed, dev will be NULL, and later detailed reports
> -                * will just be skipped.
> -                */
> -               if (ret < 0)
> -                       goto skip;
> -               stripe_physical = bioc->stripes[stripe_index].physical;
> -               dev = bioc->stripes[stripe_index].dev;
> -               btrfs_put_bioc(bioc);
> -       }
> -
> -skip:
> +       ASSERT(dev);
>         for_each_set_bit(sector_nr, &stripe->extent_sector_bitmap, stripe->nr_sectors) {
>                 const u64 logical = stripe->logical +
>                                     (sector_nr << fs_info->sectorsize_bits);
> @@ -933,41 +905,29 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
>                  * output the message of repaired message.
>                  */
>                 if (repaired) {
> -                       if (dev) {
> -                               btrfs_err_rl_in_rcu(fs_info,
> +                       btrfs_err_rl_in_rcu(fs_info,
>                         "fixed up error at logical %llu on dev %s physical %llu",
>                                             logical, btrfs_dev_name(dev),
>                                             physical);
> -                       } else {
> -                               btrfs_err_rl_in_rcu(fs_info,
> -                       "fixed up error at logical %llu on mirror %u",
> -                                           logical, stripe->mirror_num);
> -                       }
>                         continue;
>                 }
>
>                 /* The remaining are all for unrepaired. */
> -               if (dev) {
> -                       btrfs_err_rl_in_rcu(fs_info,
> +               btrfs_err_rl_in_rcu(fs_info,
>         "unable to fixup (regular) error at logical %llu on dev %s physical %llu",
>                                             logical, btrfs_dev_name(dev),
>                                             physical);
> -               } else {
> -                       btrfs_err_rl_in_rcu(fs_info,
> -       "unable to fixup (regular) error at logical %llu on mirror %u",
> -                                           logical, stripe->mirror_num);
> -               }
>
>                 if (test_bit(sector_nr, &stripe->io_error_bitmap))
> -                       if (__ratelimit(&rs) && dev)
> +                       if (__ratelimit(&rs))
>                                 scrub_print_common_warning("i/o error", dev,
>                                                      logical, physical);
>                 if (test_bit(sector_nr, &stripe->csum_error_bitmap))
> -                       if (__ratelimit(&rs) && dev)
> +                       if (__ratelimit(&rs))
>                                 scrub_print_common_warning("checksum error", dev,
>                                                      logical, physical);
>                 if (test_bit(sector_nr, &stripe->meta_error_bitmap))
> -                       if (__ratelimit(&rs) && dev)
> +                       if (__ratelimit(&rs))
>                                 scrub_print_common_warning("header error", dev,
>                                                      logical, physical);
>         }
> --
> 2.44.0
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption
  2024-03-20  3:54 ` [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption Qu Wenruo
@ 2024-03-20 13:30   ` Filipe Manana
  0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2024-03-20 13:30 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Mar 20, 2024 at 3:55 AM Qu Wenruo <wqu@suse.com> wrote:
>
> For btrfs scrub error messages, I have hit a lot of support cases on
> older kernels where no filename is outputted at all, with only error
> messages like:
>
>  BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2823, gen 0
>  BTRFS error (device dm-0): unable to fixup (regular) error at logical 1563504640 on dev /dev/mapper/sys-rootlv
>  BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2824, gen 0
>  BTRFS error (device dm-0): unable to fixup (regular) error at logical 1579016192 on dev /dev/mapper/sys-rootlv
>  BTRFS error (device dm-0): bdev /dev/mapper/sys-rootlv errs: wr 0, rd 0, flush 0, corrupt 2825, gen 0
>
> The "unable to fixup (regular) error" line shows we hit an unrepairable
> error, then normally we would do data/metadata backref walk to grab the
> correct info.
>
> But we can hit cases like the inode is already orphan (unlinked from any
> parent inode), or even the subvolume is orphan (unlinked and waiting to
> be deleted).
>
> In that case we're not sure what's the proper way to continue (Is it
> some critical data/metadata? Would it prevent the system from booting?)
>
> To improve the situation, this patch would:
>
> - Ensure we at least output one message for each unrepairable error
>   If by somehow we didn't output any error message for the error, we
>   always fallback to the basic logical/physical error output, with its
>   type (data or metadata).
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, just a minor comment below.

> ---
>  fs/btrfs/scrub.c | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 85f321e3e37c..0d2b042d75c2 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -226,6 +226,7 @@ struct scrub_warning {
>         u64                     physical;
>         u64                     logical;
>         struct btrfs_device     *dev;
> +       bool                    message_printed;
>  };
>
>  static void release_scrub_stripe(struct scrub_stripe *stripe)
> @@ -425,7 +426,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
>          * we deliberately ignore the bit ipath might have been too small to
>          * hold all of the paths here
>          */
> -       for (i = 0; i < ipath->fspath->elem_cnt; ++i)
> +       for (i = 0; i < ipath->fspath->elem_cnt; ++i) {
>                 btrfs_warn_in_rcu(fs_info,
>  "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, path: %s",
>                                   swarn->errstr, swarn->logical,
> @@ -433,6 +434,8 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
>                                   swarn->physical,
>                                   root, inum, offset,
>                                   (char *)(unsigned long)ipath->fspath->val[i]);
> +               swarn->message_printed = true;
> +       }
>
>         btrfs_put_root(local_root);
>         free_ipath(ipath);
> @@ -445,7 +448,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
>                           btrfs_dev_name(swarn->dev),
>                           swarn->physical,
>                           root, inum, offset, ret);
> -
> +       swarn->message_printed = true;
>         free_ipath(ipath);
>         return 0;
>  }
> @@ -471,6 +474,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>         swarn.logical = logical;
>         swarn.errstr = errstr;
>         swarn.dev = NULL;
> +       swarn.message_printed = false;
>
>         ret = extent_from_logical(fs_info, swarn.logical, path, &found_key,
>                                   &flags);
> @@ -492,12 +496,8 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>                         ret = tree_backref_for_extent(&ptr, eb, &found_key, ei,
>                                                       item_size, &ref_root,
>                                                       &ref_level);
> -                       if (ret < 0) {
> -                               btrfs_warn(fs_info,
> -                               "failed to resolve tree backref for logical %llu: %d",
> -                                                 swarn.logical, ret);
> +                       if (ret < 0)
>                                 break;
> -                       }
>                         if (ret > 0)
>                                 break;
>                         btrfs_warn_in_rcu(fs_info,
> @@ -505,7 +505,13 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>                                 errstr, swarn.logical, btrfs_dev_name(dev),
>                                 swarn.physical, (ref_level ? "node" : "leaf"),
>                                 ref_level, ref_root);
> +                       swarn.message_printed = true;
>                 }
> +               if (!swarn.message_printed)
> +                       btrfs_warn_in_rcu(fs_info,
> +                       "%s at metadata, logical %llu on dev %s physical %llu",
> +                                         errstr, swarn.logical,
> +                                         btrfs_dev_name(dev), swarn.physical);

This could be moved below the btrfs_release_path() call, as there's no
need to be holding a leaf locked while printing the message. It's an
error path and should not happen normally, but it's a good practice to
release any locks as soon as they aren't needed anymore.

Anyway, it's optional, and feel free to skip or do that adjustment
when committing the patch.

Thanks.

>                 btrfs_release_path(path);
>         } else {
>                 struct btrfs_backref_walk_ctx ctx = { 0 };
> @@ -520,6 +526,11 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>                 swarn.dev = dev;
>
>                 iterate_extent_inodes(&ctx, true, scrub_print_warning_inode, &swarn);
> +               if (!swarn.message_printed)
> +                       btrfs_warn_in_rcu(fs_info,
> +       "%s at data, filename unresolved, logical %llu on dev %s physical %llu",
> +                                         errstr, swarn.logical,
> +                                         btrfs_dev_name(dev), swarn.physical);
>         }
>
>  out:
> --
> 2.44.0
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages
  2024-03-20  3:54 ` [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages Qu Wenruo
@ 2024-03-20 13:33   ` Filipe Manana
  0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2024-03-20 13:33 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Mar 20, 2024 at 3:55 AM Qu Wenruo <wqu@suse.com> wrote:
>
> Currently scrub goes different ways to rate limits its error messages:
>
> - regular btrfs_err_rl_in_rcu() helper for repaired sectors and
>   the initial message for unrepaired sectors
>
> - Manually rate limits scrub_print_common_warning()
>
> I'd say the different rate limits could lead to cases where we only got
> the "unable to fixup (regular) error" messages but the detailed report
> about that corruption is ratelimited.
>
> To make the whole rate limit works more consistently, change the rate
> limit by:
>
> - Always using btrfs_*_rl() helpers
>
> - Remove the initial "unable to fixup (regular) error" message
>   Since we're ensured to have at least one error message for each
>   unrepaired sector (before rate limit), there is no need for
>   a duplicated line.
>
>   And if we hit rate limits, we will rate limit all the error messages
>   together, not treating different error messages differently.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/scrub.c | 26 ++++++++------------------
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 0d2b042d75c2..f942c9e3f121 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -427,7 +427,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
>          * hold all of the paths here
>          */
>         for (i = 0; i < ipath->fspath->elem_cnt; ++i) {
> -               btrfs_warn_in_rcu(fs_info,
> +               btrfs_warn_rl_in_rcu(fs_info,
>  "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu, path: %s",
>                                   swarn->errstr, swarn->logical,
>                                   btrfs_dev_name(swarn->dev),
> @@ -442,7 +442,7 @@ static int scrub_print_warning_inode(u64 inum, u64 offset, u64 num_bytes,
>         return 0;
>
>  err:
> -       btrfs_warn_in_rcu(fs_info,
> +       btrfs_warn_rl_in_rcu(fs_info,
>                           "%s at logical %llu on dev %s, physical %llu, root %llu, inode %llu, offset %llu: path resolving failed with ret=%d",
>                           swarn->errstr, swarn->logical,
>                           btrfs_dev_name(swarn->dev),
> @@ -500,7 +500,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>                                 break;
>                         if (ret > 0)
>                                 break;
> -                       btrfs_warn_in_rcu(fs_info,
> +                       btrfs_warn_rl_in_rcu(fs_info,
>  "%s at logical %llu on dev %s, physical %llu: metadata %s (level %d) in tree %llu",
>                                 errstr, swarn.logical, btrfs_dev_name(dev),
>                                 swarn.physical, (ref_level ? "node" : "leaf"),
> @@ -508,7 +508,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>                         swarn.message_printed = true;
>                 }
>                 if (!swarn.message_printed)
> -                       btrfs_warn_in_rcu(fs_info,
> +                       btrfs_warn_rl_in_rcu(fs_info,
>                         "%s at metadata, logical %llu on dev %s physical %llu",
>                                           errstr, swarn.logical,
>                                           btrfs_dev_name(dev), swarn.physical);
> @@ -527,7 +527,7 @@ static void scrub_print_common_warning(const char *errstr, struct btrfs_device *
>
>                 iterate_extent_inodes(&ctx, true, scrub_print_warning_inode, &swarn);
>                 if (!swarn.message_printed)
> -                       btrfs_warn_in_rcu(fs_info,
> +                       btrfs_warn_rl_in_rcu(fs_info,
>         "%s at data, filename unresolved, logical %llu on dev %s physical %llu",
>                                           errstr, swarn.logical,
>                                           btrfs_dev_name(dev), swarn.physical);
> @@ -846,8 +846,6 @@ static void scrub_stripe_submit_repair_read(struct scrub_stripe *stripe,
>  static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
>                                        struct scrub_stripe *stripe)
>  {
> -       static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> -                                     DEFAULT_RATELIMIT_BURST);
>         struct btrfs_fs_info *fs_info = sctx->fs_info;
>         struct btrfs_device *dev = stripe->dev;
>         u64 stripe_physical = stripe->physical;
> @@ -899,22 +897,14 @@ static void scrub_stripe_report_errors(struct scrub_ctx *sctx,
>                 }
>
>                 /* The remaining are all for unrepaired. */
> -               btrfs_err_rl_in_rcu(fs_info,
> -       "unable to fixup (regular) error at logical %llu on dev %s physical %llu",
> -                                           logical, btrfs_dev_name(dev),
> -                                           physical);
> -
>                 if (test_bit(sector_nr, &stripe->io_error_bitmap))
> -                       if (__ratelimit(&rs))
> -                               scrub_print_common_warning("i/o error", dev,
> +                       scrub_print_common_warning("i/o error", dev,
>                                                      logical, physical);
>                 if (test_bit(sector_nr, &stripe->csum_error_bitmap))
> -                       if (__ratelimit(&rs))
> -                               scrub_print_common_warning("checksum error", dev,
> +                       scrub_print_common_warning("checksum error", dev,
>                                                      logical, physical);
>                 if (test_bit(sector_nr, &stripe->meta_error_bitmap))
> -                       if (__ratelimit(&rs))
> -                               scrub_print_common_warning("header error", dev,
> +                       scrub_print_common_warning("header error", dev,
>                                                      logical, physical);
>         }
>
> --
> 2.44.0
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print()
  2024-03-20  3:54 ` [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print() Qu Wenruo
@ 2024-04-04 20:26   ` David Sterba
  2024-04-04 22:02     ` Qu Wenruo
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2024-04-04 20:26 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, stable, Filipe Manana

On Wed, Mar 20, 2024 at 02:24:52PM +1030, Qu Wenruo wrote:
> Currently when we increase the device statistics, it would always lead
> to an error message in the kernel log.
> 
> However this output is mostly duplicated with the existing ones:
> 
> - For scrub operations
>   We always have the following messages:
>   * "fixed up error at logical %llu"
>   * "unable to fixup (regular) error at logical %llu"
> 
>   So no matter if the corruption is repaired or not, it scrub would
>   output an error message to indicate the problem.
> 
> - For non-scrub read operations
>   We also have the following messages:
>   * "csum failed root %lld inode %llu off %llu" for data csum mismatch
>   * "bad (tree block start|fsid|tree block level)" for metadata
>   * "read error corrected: ino %llu off %llu" for repaired data/metadata
> 
> So the error message from btrfs_dev_stat_inc_and_print() is duplicated.
> 
> The real usage for the btrfs device statistics is for some user space
> daemon to check if there is any new errors, acting like some checks on
> SMART, thus we don't really need/want those messages in dmesg.
> 
> This patch would reduce the log level to debug (disabled by default) for
> btrfs_dev_stat_inc_and_print().
> For users really want to utilize btrfs devices statistics, they should
> go check "btrfs device stats" periodically, and we should focus the
> kernel error messages to more important things.

I kind if disagree with each point.

The message is meant to be logged as it will happen in production and
outside of development, so the debug level does not make sense.

The stats message is not duplicated for the individual causes, it
additionally tracks the whole state.

Logging important messages to system log is a common thing and we do that
a lot, this makes debugging and anlyzing things easier. We can't
expect that there would always be a daemon collecting the stats, there's
not standardized or recommended tool for that. A quick look to dmesg can
show that something is wrong.

What we can do: reduce the number messages so the whole stats are
printed once per transaction if there is a change.

We can also tune which events also print the stats, for example flush
errors are more interesting than read/write, comparing the number of
events that can happen in a batch.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print()
  2024-04-04 20:26   ` David Sterba
@ 2024-04-04 22:02     ` Qu Wenruo
  0 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2024-04-04 22:02 UTC (permalink / raw)
  To: dsterba, Qu Wenruo; +Cc: linux-btrfs, stable, Filipe Manana



在 2024/4/5 06:56, David Sterba 写道:
> On Wed, Mar 20, 2024 at 02:24:52PM +1030, Qu Wenruo wrote:
>> Currently when we increase the device statistics, it would always lead
>> to an error message in the kernel log.
>>
>> However this output is mostly duplicated with the existing ones:
>>
>> - For scrub operations
>>    We always have the following messages:
>>    * "fixed up error at logical %llu"
>>    * "unable to fixup (regular) error at logical %llu"
>>
>>    So no matter if the corruption is repaired or not, it scrub would
>>    output an error message to indicate the problem.
>>
>> - For non-scrub read operations
>>    We also have the following messages:
>>    * "csum failed root %lld inode %llu off %llu" for data csum mismatch
>>    * "bad (tree block start|fsid|tree block level)" for metadata
>>    * "read error corrected: ino %llu off %llu" for repaired data/metadata
>>
>> So the error message from btrfs_dev_stat_inc_and_print() is duplicated.
>>
>> The real usage for the btrfs device statistics is for some user space
>> daemon to check if there is any new errors, acting like some checks on
>> SMART, thus we don't really need/want those messages in dmesg.
>>
>> This patch would reduce the log level to debug (disabled by default) for
>> btrfs_dev_stat_inc_and_print().
>> For users really want to utilize btrfs devices statistics, they should
>> go check "btrfs device stats" periodically, and we should focus the
>> kernel error messages to more important things.
>
> I kind if disagree with each point.
>
> The message is meant to be logged as it will happen in production and
> outside of development, so the debug level does not make sense.
>
> The stats message is not duplicated for the individual causes, it
> additionally tracks the whole state.

I'd disagree with this.

We already have mount time output, and detailed causes are way more
useful and just a duplicated message repeating itself.

>
> Logging important messages to system log is a common thing and we do that
> a lot, this makes debugging and anlyzing things easier. We can't
> expect that there would always be a daemon collecting the stats, there's
> not standardized or recommended tool for that. A quick look to dmesg can
> show that something is wrong.

Then try supporting cases with all these duplicated and useless
messages, you'll hardly agree that they provide any usefulness.

>
> What we can do: reduce the number messages so the whole stats are
> printed once per transaction if there is a change.
>
> We can also tune which events also print the stats, for example flush
> errors are more interesting than read/write, comparing the number of
> events that can happen in a batch.

My another point is, if it's an important error, we should output it
with detailed reason/cause/extra info immediately.

And that's already the case for regular/scrub read errors.

For critical operations like flush, we should output extra error
messages, other than relying on that generic and vague error.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-04-04 22:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-20  3:54 [PATCH v2 0/7] btrfs: scrub: refine the error messages output frequency Qu Wenruo
2024-03-20  3:54 ` [PATCH v2 1/7] btrfs: scrub: fix incorrectly reported logical/physical address Qu Wenruo
2024-03-20  3:54 ` [PATCH v2 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print() Qu Wenruo
2024-04-04 20:26   ` David Sterba
2024-04-04 22:02     ` Qu Wenruo
2024-03-20  3:54 ` [PATCH v2 3/7] btrfs: scrub: remove unused is_super parameter from scrub_print_common_warning() Qu Wenruo
2024-03-20  3:54 ` [PATCH v2 4/7] btrfs: scrub: remove unnecessary dev/physical lookup for scrub_stripe_report_errors() Qu Wenruo
2024-03-20 13:09   ` Filipe Manana
2024-03-20  3:54 ` [PATCH v2 5/7] btrfs: scrub: simplify the inode iteration output Qu Wenruo
2024-03-20  3:54 ` [PATCH v2 6/7] btrfs: scrub: ensure we output at least one error message for unrepaired corruption Qu Wenruo
2024-03-20 13:30   ` Filipe Manana
2024-03-20  3:54 ` [PATCH v2 7/7] btrfs: scrub: use generic ratelimit helpers to output error messages Qu Wenruo
2024-03-20 13:33   ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).