linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: fix false warning in bdev_count_inflight_rw()
@ 2025-06-26  8:39 Yu Kuai
  2025-06-26  8:54 ` Damien Le Moal
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Yu Kuai @ 2025-06-26  8:39 UTC (permalink / raw)
  To: axboe, hare, hch, yukuai3, john.g.garry
  Cc: linux-block, linux-kernel, yi.zhang, calvin, david, yukuai1,
	yi.zhang, yangerkun, johnny.chenyi

From: Yu Kuai <yukuai3@huawei.com>

While bdev_count_inflight is interating all cpus, if some IOs are issued
from traversed cpu and then completed from the cpu that is not traversed
yet:

cpu0
		cpu1
		bdev_count_inflight
		 //for_each_possible_cpu
		 // cpu0 is 0
		 infliht += 0
// issue a io
blk_account_io_start
// cpu0 inflight ++

				cpu2
				// the io is done
				blk_account_io_done
				// cpu2 inflight --
		 // cpu 1 is 0
		 inflight += 0
		 // cpu2 is -1
		 inflight += -1
		 ...

In this case, the total inflight will be -1, causing lots of false
warning. Fix the problem by removing the warning.

Noted there is still a valid warning for nvme-mpath(From Yi) that is not
fixed yet.

Fixes: f5482ee5edb9 ("block: WARN if bdev inflight counter is negative")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#mae89155a5006463d0a21a4a2c35ae0034b26a339
Reported-and-tested-by: Calvin Owens <calvin@wbinvd.org>
Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#m1d935a00070bf95055d0ac84e6075158b08acaef
Reported-by: Dave Chinner <david@fromorbit.com>
Closes: https://lore.kernel.org/linux-block/aFuypjqCXo9-5_En@dread.disaster.area/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/genhd.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 8171a6bc3210..680fa717082f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -141,9 +141,14 @@ static void bdev_count_inflight_rw(struct block_device *part,
 		}
 	}
 
-	if (WARN_ON_ONCE((int)inflight[READ] < 0))
+	/*
+	 * While iterating all cpus, some IOs might issued from traversed cpu
+	 * and then completed from the cpu that is not traversed yet, causing
+	 * the inflight number to be negative.
+	 */
+	if ((int)inflight[READ] < 0)
 		inflight[READ] = 0;
-	if (WARN_ON_ONCE((int)inflight[WRITE] < 0))
+	if ((int)inflight[WRITE] < 0)
 		inflight[WRITE] = 0;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix false warning in bdev_count_inflight_rw()
  2025-06-26  8:39 [PATCH] block: fix false warning in bdev_count_inflight_rw() Yu Kuai
@ 2025-06-26  8:54 ` Damien Le Moal
  2025-06-26 11:32   ` Yu Kuai
  2025-06-26  9:34 ` John Garry
  2025-06-26 10:11 ` Christoph Hellwig
  2 siblings, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2025-06-26  8:54 UTC (permalink / raw)
  To: Yu Kuai, axboe, hare, hch, yukuai3, john.g.garry
  Cc: linux-block, linux-kernel, yi.zhang, calvin, david, yi.zhang,
	yangerkun, johnny.chenyi

On 6/26/25 17:39, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> While bdev_count_inflight is interating all cpus, if some IOs are issued
> from traversed cpu and then completed from the cpu that is not traversed
> yet:
> 
> cpu0
> 		cpu1
> 		bdev_count_inflight
> 		 //for_each_possible_cpu
> 		 // cpu0 is 0
> 		 infliht += 0
> // issue a io
> blk_account_io_start
> // cpu0 inflight ++
> 
> 				cpu2
> 				// the io is done
> 				blk_account_io_done
> 				// cpu2 inflight --
> 		 // cpu 1 is 0
> 		 inflight += 0
> 		 // cpu2 is -1
> 		 inflight += -1
> 		 ...
> 
> In this case, the total inflight will be -1, causing lots of false
> warning. Fix the problem by removing the warning.
> 
> Noted there is still a valid warning for nvme-mpath(From Yi) that is not
> fixed yet.
> 
> Fixes: f5482ee5edb9 ("block: WARN if bdev inflight counter is negative")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#mae89155a5006463d0a21a4a2c35ae0034b26a339
> Reported-and-tested-by: Calvin Owens <calvin@wbinvd.org>
> Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#m1d935a00070bf95055d0ac84e6075158b08acaef
> Reported-by: Dave Chinner <david@fromorbit.com>
> Closes: https://lore.kernel.org/linux-block/aFuypjqCXo9-5_En@dread.disaster.area/
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  block/genhd.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 8171a6bc3210..680fa717082f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -141,9 +141,14 @@ static void bdev_count_inflight_rw(struct block_device *part,
>  		}
>  	}
>  
> -	if (WARN_ON_ONCE((int)inflight[READ] < 0))
> +	/*
> +	 * While iterating all cpus, some IOs might issued from traversed cpu
> +	 * and then completed from the cpu that is not traversed yet, causing
> +	 * the inflight number to be negative.

Nit (grammar):

	 * While iterating all CPUs, some IOs may be issued from a CPU already
	 * traversed and complete on a CPU that has not yet been traversed,
	 * causing the inflight number to be negative.

> +	 */
> +	if ((int)inflight[READ] < 0)
>  		inflight[READ] = 0;
> -	if (WARN_ON_ONCE((int)inflight[WRITE] < 0))
> +	if ((int)inflight[WRITE] < 0)
>  		inflight[WRITE] = 0;
>  }
>  


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix false warning in bdev_count_inflight_rw()
  2025-06-26  8:39 [PATCH] block: fix false warning in bdev_count_inflight_rw() Yu Kuai
  2025-06-26  8:54 ` Damien Le Moal
@ 2025-06-26  9:34 ` John Garry
  2025-06-26 11:37   ` Yu Kuai
  2025-06-26 10:11 ` Christoph Hellwig
  2 siblings, 1 reply; 6+ messages in thread
From: John Garry @ 2025-06-26  9:34 UTC (permalink / raw)
  To: Yu Kuai, axboe, hare, hch, yukuai3
  Cc: linux-block, linux-kernel, yi.zhang, calvin, david, yi.zhang,
	yangerkun, johnny.chenyi

On 26/06/2025 09:39, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> While bdev_count_inflight is interating all cpus, if some IOs are issued
> from traversed cpu and then completed from the cpu that is not traversed
> yet:
> 
> cpu0
> 		cpu1
> 		bdev_count_inflight
> 		 //for_each_possible_cpu
> 		 // cpu0 is 0
> 		 infliht += 0
> // issue a io
> blk_account_io_start
> // cpu0 inflight ++
> 
> 				cpu2
> 				// the io is done
> 				blk_account_io_done
> 				// cpu2 inflight --
> 		 // cpu 1 is 0
> 		 inflight += 0
> 		 // cpu2 is -1
> 		 inflight += -1
> 		 ...
> 
> In this case, the total inflight will be -1, causing lots of false
> warning. Fix the problem by removing the warning.

Is it even safe to even use this function when not used for just 
informative purposes? I mean, for example, it is used by md code to 
check for idle state - could that check return an invalid result (and 
cause harm)?

> 
> Noted there is still a valid warning for nvme-mpath(From Yi) that is not
> fixed yet.
> 
> Fixes: f5482ee5edb9 ("block: WARN if bdev inflight counter is negative")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Closes: https://urldefense.com/v3/__https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/*mae89155a5006463d0a21a4a2c35ae0034b26a339__;Iw!!ACWV5N9M2RV99hQ!LLzonI0PgLV8uruViz5LkA_QGoFQSsfBMNDhb45qsRoJqxuTMcO_2BxJXhMOADfnwncgrR3o99lVDCnq75I7_UI$
> Reported-and-tested-by: Calvin Owens <calvin@wbinvd.org>
> Closes: https://urldefense.com/v3/__https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/*m1d935a00070bf95055d0ac84e6075158b08acaef__;Iw!!ACWV5N9M2RV99hQ!LLzonI0PgLV8uruViz5LkA_QGoFQSsfBMNDhb45qsRoJqxuTMcO_2BxJXhMOADfnwncgrR3o99lVDCnqYruhFG0$
> Reported-by: Dave Chinner <david@fromorbit.com>
> Closes: https://urldefense.com/v3/__https://lore.kernel.org/linux-block/aFuypjqCXo9-5_En@dread.disaster.area/__;!!ACWV5N9M2RV99hQ!LLzonI0PgLV8uruViz5LkA_QGoFQSsfBMNDhb45qsRoJqxuTMcO_2BxJXhMOADfnwncgrR3o99lVDCnqj32KGls$
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>   block/genhd.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 8171a6bc3210..680fa717082f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -141,9 +141,14 @@ static void bdev_count_inflight_rw(struct block_device *part,
>   		}
>   	}
>   
> -	if (WARN_ON_ONCE((int)inflight[READ] < 0))
> +	/*
> +	 * While iterating all cpus, some IOs might issued from traversed cpu
> +	 * and then completed from the cpu that is not traversed yet, causing
> +	 * the inflight number to be negative.
> +	 */
> +	if ((int)inflight[READ] < 0)
>   		inflight[READ] = 0;
> -	if (WARN_ON_ONCE((int)inflight[WRITE] < 0))
> +	if ((int)inflight[WRITE] < 0)
>   		inflight[WRITE] = 0;
>   }
>   


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix false warning in bdev_count_inflight_rw()
  2025-06-26  8:39 [PATCH] block: fix false warning in bdev_count_inflight_rw() Yu Kuai
  2025-06-26  8:54 ` Damien Le Moal
  2025-06-26  9:34 ` John Garry
@ 2025-06-26 10:11 ` Christoph Hellwig
  2 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2025-06-26 10:11 UTC (permalink / raw)
  To: Yu Kuai
  Cc: axboe, hare, hch, yukuai3, john.g.garry, linux-block,
	linux-kernel, yi.zhang, calvin, david, yi.zhang, yangerkun,
	johnny.chenyi

I think you want to make the inflight array a signed type instead,
so that if the earlier CPUs have negative counts due to migration
that gets even out later on.  Which should also make the
counter not trigger normally.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix false warning in bdev_count_inflight_rw()
  2025-06-26  8:54 ` Damien Le Moal
@ 2025-06-26 11:32   ` Yu Kuai
  0 siblings, 0 replies; 6+ messages in thread
From: Yu Kuai @ 2025-06-26 11:32 UTC (permalink / raw)
  To: Damien Le Moal, Yu Kuai, axboe, hare, hch, john.g.garry
  Cc: linux-block, linux-kernel, yi.zhang, calvin, david, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)

Hi,

在 2025/06/26 16:54, Damien Le Moal 写道:
> Nit (grammar):
> 
> 	 * While iterating all CPUs, some IOs may be issued from a CPU already
> 	 * traversed and complete on a CPU that has not yet been traversed,
> 	 * causing the inflight number to be negative.

Thanks, will change to this in v2.
Kuai


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix false warning in bdev_count_inflight_rw()
  2025-06-26  9:34 ` John Garry
@ 2025-06-26 11:37   ` Yu Kuai
  0 siblings, 0 replies; 6+ messages in thread
From: Yu Kuai @ 2025-06-26 11:37 UTC (permalink / raw)
  To: John Garry, Yu Kuai, axboe, hare, hch
  Cc: linux-block, linux-kernel, yi.zhang, calvin, david, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)

Hi,

在 2025/06/26 17:34, John Garry 写道:
> Is it even safe to even use this function when not used for just 
> informative purposes? I mean, for example, it is used by md code to 
> check for idle state - could that check return an invalid result (and 
> cause harm)?

For md code, I think it's ok, md still use completed IO for idle state.

More importantly, the io_ticks can be wrong due to inflight to be zero,
however, it's inaccurate for a long time and there is no much we can do
:(

Thanks,
Kuai


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-06-26 11:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-26  8:39 [PATCH] block: fix false warning in bdev_count_inflight_rw() Yu Kuai
2025-06-26  8:54 ` Damien Le Moal
2025-06-26 11:32   ` Yu Kuai
2025-06-26  9:34 ` John Garry
2025-06-26 11:37   ` Yu Kuai
2025-06-26 10:11 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).