linux-block.vger.kernel.org archive mirror
* [PATCH v2 0/3] block: scsi: introduce and use BLK_STS_OFFLINE
From: Song Liu @ 2022-02-03 19:28 UTC
  To: linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, hare, Song Liu

Changes v1 => v2:
1. Add patch 2/3 to change user visible return value to -ENODEV. (Hannes)
2. In the commit log, explain the reason to keep EIO in 1/3.

We have a use case where HDDs are regularly powered on/off to preserve power.
When a drive is being removed, we often see errors like

   [  172.803279] I/O error, dev sda, sector 3137184

These messages are confusing for automations that grep dmesg, as they look
very similar to real HDD errors.

Solve this issue with a new block status BLK_STS_OFFLINE. After the change,
the error message looks like

   [  172.803279] device offline error, dev sda, sector 3137184

so that the automations won't confuse them with real I/O errors.
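
Illustration (not part of the series): a minimal userspace sketch of how a
log-scanning tool might tell the two messages apart once this series is
applied. The enum and function names are hypothetical, and the matched
substrings are taken only from the two sample lines above; real kernels
append more fields after the sector number.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical categories for a monitoring tool; not part of the kernel. */
enum blk_log_kind {
	BLK_LOG_OTHER,		/* neither of the two messages below */
	BLK_LOG_IO_ERROR,	/* "I/O error, dev ..." */
	BLK_LOG_DEV_OFFLINE,	/* "device offline error, dev ..." */
};

/* Classify one dmesg line by the message prefix shown in the cover letter. */
static enum blk_log_kind classify_blk_log_line(const char *line)
{
	assert(line != NULL);

	if (strstr(line, "device offline error, dev "))
		return BLK_LOG_DEV_OFFLINE;
	if (strstr(line, "I/O error, dev "))
		return BLK_LOG_IO_ERROR;
	return BLK_LOG_OTHER;
}
```

Feeding the two sample lines above to classify_blk_log_line() yields
BLK_LOG_IO_ERROR and BLK_LOG_DEV_OFFLINE respectively.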

Song Liu (3):
  block: introduce BLK_STS_OFFLINE
  block: return -ENODEV for BLK_STS_OFFLINE
  scsi: use BLK_STS_OFFLINE for not fully online devices

 block/blk-core.c          | 1 +
 drivers/scsi/scsi_lib.c   | 2 +-
 include/linux/blk_types.h | 7 +++++++
 3 files changed, 9 insertions(+), 1 deletion(-)

--
2.30.2


* [PATCH v2 1/3] block: introduce BLK_STS_OFFLINE
From: Song Liu @ 2022-02-03 19:28 UTC
  To: linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, hare, Song Liu

Currently, drivers report BLK_STS_IOERR for devices that are not fully
online or are being removed. This behavior could confuse users, as these
are not really I/O errors from the device.

Solve this issue with a new status BLK_STS_OFFLINE, which reports "device
offline error" in dmesg instead of "I/O error".

-EIO is intentionally kept so that the user-visible return value does not
change.

Signed-off-by: Song Liu <song@kernel.org>
---
 block/blk-core.c          | 1 +
 include/linux/blk_types.h | 7 +++++++
 2 files changed, 8 insertions(+)
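
Illustration (not part of the patch): a simplified userspace model of the
table lookup this patch extends. All model_* names are hypothetical; the
numeric values for IOERR (10) and OFFLINE (17) mirror blk_types.h at the
time of this patch, and the formatted string covers only the message prefix
shown in the cover letter. The fallbacks for unknown values are choices made
for this sketch, not a statement about the kernel's behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins for a few blk_status_t values (model only). */
enum model_status {
	MODEL_STS_OK      = 0,
	MODEL_STS_IOERR   = 10,
	MODEL_STS_OFFLINE = 17,
};

/* Each status maps to an errno and a short name used in the log message. */
static const struct {
	int err;
	const char *name;
} model_errors[] = {
	[MODEL_STS_OK]      = { 0,    "" },
	[MODEL_STS_IOERR]   = { -EIO, "I/O" },
	[MODEL_STS_OFFLINE] = { -EIO, "device offline" },	/* as in 1/3 */
};

#define MODEL_NR_ERRORS (sizeof(model_errors) / sizeof(model_errors[0]))

/* Unknown or unpopulated entries fall back to -EIO in this model. */
static int model_status_to_errno(unsigned int status)
{
	if (status >= MODEL_NR_ERRORS || !model_errors[status].name)
		return -EIO;
	return model_errors[status].err;
}

/* Build the "<name> error, dev <dev>, sector <n>" prefix into buf. */
static int model_format_error(char *buf, size_t len, unsigned int status,
			      const char *dev, unsigned long long sector)
{
	const char *name = "unknown";	/* sketch-only fallback */

	if (status < MODEL_NR_ERRORS && model_errors[status].name)
		name = model_errors[status].name;
	return snprintf(buf, len, "%s error, dev %s, sector %llu",
			name, dev, sector);
}
```

With this model, the same errno is returned for MODEL_STS_IOERR and
MODEL_STS_OFFLINE, while the formatted message differs, which is the
behavior this patch describes.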

diff --git a/block/blk-core.c b/block/blk-core.c
index 61f6a0dc4511..24035dd2eef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -164,6 +164,7 @@ static const struct {
 	[BLK_STS_RESOURCE]	= { -ENOMEM,	"kernel resource" },
 	[BLK_STS_DEV_RESOURCE]	= { -EBUSY,	"device resource" },
 	[BLK_STS_AGAIN]		= { -EAGAIN,	"nonblocking retry" },
+	[BLK_STS_OFFLINE]	= { -EIO,	"device offline" },
 
 	/* device mapper special case, should not leak out: */
 	[BLK_STS_DM_REQUEUE]	= { -EREMCHG, "dm internal retry" },
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index fe065c394fff..5561e58d158a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
 
+/*
+ * BLK_STS_OFFLINE is returned from the driver when the target device is offline
+ * or is being taken offline. This could help differentiate the case where a
+ * device is intentionally being shut down from a real I/O error.
+ */
+#define BLK_STS_OFFLINE		((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
-- 
2.30.2



* [PATCH v2 2/3] block: return -ENODEV for BLK_STS_OFFLINE
From: Song Liu @ 2022-02-03 19:28 UTC
  To: linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, hare, Song Liu

Change the user-visible return value for BLK_STS_OFFLINE to -ENODEV, which
is more descriptive than the existing -EIO.

Signed-off-by: Song Liu <song@kernel.org>
---
 block/blk-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
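
Illustration (not part of the patch): a minimal sketch of how a userspace
caller might act on the new return value. The enum and function names are
hypothetical. Whether -ENODEV actually reaches a given caller depends on the
I/O path it uses, so treat the mapping below as an assumption rather than a
guarantee.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical buckets for a failed read or write on a block device. */
enum io_failure_kind {
	IO_FAIL_NONE,		/* no error */
	IO_FAIL_DEVICE_OFFLINE,	/* ENODEV: device offline or going away */
	IO_FAIL_IO_ERROR,	/* EIO: treat as a possible media error */
	IO_FAIL_OTHER,		/* anything else */
};

/* Map a positive errno value, as saved after a failed syscall, to a bucket. */
static enum io_failure_kind classify_io_errno(int err)
{
	assert(err >= 0);

	switch (err) {
	case 0:
		return IO_FAIL_NONE;
	case ENODEV:
		return IO_FAIL_DEVICE_OFFLINE;
	case EIO:
		return IO_FAIL_IO_ERROR;
	default:
		return IO_FAIL_OTHER;
	}
}
```

A caller would save errno right after the failing call and pass it in, for
example to skip drive-health alerts for IO_FAIL_DEVICE_OFFLINE.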

diff --git a/block/blk-core.c b/block/blk-core.c
index 24035dd2eef1..be8812f5489d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -164,7 +164,7 @@ static const struct {
 	[BLK_STS_RESOURCE]	= { -ENOMEM,	"kernel resource" },
 	[BLK_STS_DEV_RESOURCE]	= { -EBUSY,	"device resource" },
 	[BLK_STS_AGAIN]		= { -EAGAIN,	"nonblocking retry" },
-	[BLK_STS_OFFLINE]	= { -EIO,	"device offline" },
+	[BLK_STS_OFFLINE]	= { -ENODEV,	"device offline" },
 
 	/* device mapper special case, should not leak out: */
 	[BLK_STS_DM_REQUEUE]	= { -EREMCHG, "dm internal retry" },
-- 
2.30.2



* [PATCH v2 3/3] scsi: use BLK_STS_OFFLINE for not fully online devices
From: Song Liu @ 2022-02-03 19:28 UTC
  To: linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, hare, Song Liu

The new error message for such a case looks like

[  172.809565] device offline error, dev sda, sector 3138208 ...

which will not be confused with a regular I/O error (BLK_STS_IOERR).

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Song Liu <song@kernel.org>
---
 drivers/scsi/scsi_lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..e30bc51578e9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
 		 * power management commands.
 		 */
 		if (req && !(req->rq_flags & RQF_PM))
-			return BLK_STS_IOERR;
+			return BLK_STS_OFFLINE;
 		return BLK_STS_OK;
 	}
 }
-- 
2.30.2



* Re: [PATCH v2 0/3] block: scsi: introduce and use BLK_STS_OFFLINE
From: Martin K. Petersen @ 2022-02-04  3:16 UTC
  To: Song Liu
  Cc: linux-block, linux-scsi, kernel-team, jejb, martin.petersen,
	axboe, hare


Song,

> We have a use case where HDDs are regularly powered on/off to preserve power.
> When a drive is being removed, we often see errors like
>
>    [  172.803279] I/O error, dev sda, sector 3137184
>
> These messages are confusing for automations that grep dmesg, as they look
> very similar to real HDD errors.
>
> Solve this issue with a new block status BLK_STS_OFFLINE. After the change,
> the error message looks like
>
>    [  172.803279] device offline error, dev sda, sector 3137184
>
> so that the automations won't confuse them with real I/O errors.

Looks OK to me.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH v2 0/3] block: scsi: introduce and use BLK_STS_OFFLINE
From: Jens Axboe @ 2022-02-04  4:10 UTC
  To: Song Liu, linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, hare

On Thu, 3 Feb 2022 11:28:24 -0800, Song Liu wrote:
> Changes v1 => v2:
> 1. Add patch 2/3 to change user visible return value to -ENODEV. (Hannes)
> 2. In the commit log, explain the reason to keep EIO in 1/3.
> 
> We have a use case where HDDs are regularly powered on/off to preserve power.
> When a drive is being removed, we often see errors like
> 
> [...]

Applied, thanks!

[1/3] block: introduce BLK_STS_OFFLINE
      commit: 2651bf680bc2ad9a078b7222b0873145ab4ece07
[2/3] block: return -ENODEV for BLK_STS_OFFLINE
      commit: 7d32c027a21ef7aa0a400763397644d44b3576a9
[3/3] scsi: use BLK_STS_OFFLINE for not fully online devices
      commit: 9574d43479e16352e75bc875c9952ed8e129c9b2

Best regards,
-- 
Jens Axboe




* Re: [PATCH v2 1/3] block: introduce BLK_STS_OFFLINE
From: Hannes Reinecke @ 2022-02-04  7:25 UTC
  To: Song Liu, linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe

On 2/3/22 20:28, Song Liu wrote:
> Currently, drivers report BLK_STS_IOERR for devices that are not fully
> online or are being removed. This behavior could confuse users, as these
> are not really I/O errors from the device.
> 
> Solve this issue with a new status BLK_STS_OFFLINE, which reports "device
> offline error" in dmesg instead of "I/O error".
> 
> -EIO is intentionally kept so that the user-visible return value does not
> change.
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>   block/blk-core.c          | 1 +
>   include/linux/blk_types.h | 7 +++++++
>   2 files changed, 8 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH v2 2/3] block: return -ENODEV for BLK_STS_OFFLINE
From: Hannes Reinecke @ 2022-02-04  7:26 UTC
  To: Song Liu, linux-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe

On 2/3/22 20:28, Song Liu wrote:
> Change the user-visible return value for BLK_STS_OFFLINE to -ENODEV, which
> is more descriptive than the existing -EIO.
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>   block/blk-core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
