* [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
@ 2022-02-03 6:40 Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu
We have a use case where HDDs are regularly powered on/off to preserve power.
When a drive is being removed, we often see errors like
[ 172.803279] I/O error, dev sda, sector 3137184
These messages are confusing for automations that grep dmesg, as they look
very similar to real HDD errors.
Solve this issue with a new block state BLK_STS_OFFLINE. After the change,
the error message looks like
[ 172.803279] device offline error, dev sda, sector 3137184
so that the automations won't confuse them with real I/O errors.

Song Liu (2):
  block: introduce BLK_STS_OFFLINE
  scsi: use BLK_STS_OFFLINE for not fully online devices

 block/blk-core.c          | 1 +
 drivers/scsi/scsi_lib.c   | 2 +-
 include/linux/blk_types.h | 7 +++++++
 3 files changed, 9 insertions(+), 1 deletion(-)

--
2.30.2
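
A log-scanning tool could tell the two messages quoted in the cover letter apart roughly as follows. This is a minimal userspace sketch in C, not part of the series: the enum and the function name classify_dmesg_line() are hypothetical, and only the two message prefixes are taken from the examples above.

```c
#include <string.h>

/* Hypothetical categories for a tool that scans dmesg output. */
enum blk_log_kind {
	BLK_LOG_OTHER,		/* not one of the two block error messages */
	BLK_LOG_IO_ERROR,	/* "I/O error, dev ..." */
	BLK_LOG_DEV_OFFLINE,	/* "device offline error, dev ..." */
};

/* Classify one dmesg line by the message text that follows the timestamp. */
enum blk_log_kind classify_dmesg_line(const char *line)
{
	if (strstr(line, "device offline error, dev "))
		return BLK_LOG_DEV_OFFLINE;
	if (strstr(line, "I/O error, dev "))
		return BLK_LOG_IO_ERROR;
	return BLK_LOG_OTHER;
}
```

With the series applied, a tool written this way would no longer count a planned power-off as a disk fault, provided the driver reports the new status for that case.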
^ permalink raw reply [flat|nested] 12+ messages in thread

* [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:40 ` Song Liu
2022-02-03 6:52 ` Song Liu
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
2022-02-03 6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu

Currently, drivers report BLK_STS_IOERR for devices that are not fully
online or being removed. This behavior could cause confusion for users,
as they are not really I/O errors from the device.

Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
offline error" in dmesg instead of "I/O error".

Signed-off-by: Song Liu <song@kernel.org>
---
 block/blk-core.c          | 1 +
 include/linux/blk_types.h | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 61f6a0dc4511..24035dd2eef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -164,6 +164,7 @@ static const struct {
 	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
 	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
 	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
+	[BLK_STS_OFFLINE] = { -EIO, "device offline" },

 	/* device mapper special case, should not leak out: */
 	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index fe065c394fff..5561e58d158a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)

+/*
+ * BLK_STS_OFFLINE is returned from the driver when the target device is offline
+ * or is being taken offline. This could help differentiate the case where a
+ * device is intentionally being shut down from a real I/O error.
+ */
+#define BLK_STS_OFFLINE ((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
--
2.30.2

^ permalink raw reply related [flat|nested] 12+ messages in thread
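
The table entry added by this patch pairs a status with an errno and a short name, and the name is what ends up in front of "error, dev ..." in the log. The userspace sketch below imitates that lookup so the effect can be tried outside the kernel. It is an illustration under assumptions, not the kernel code: the sketch_* names and enum values are hypothetical, and only the two name strings, the -EIO value and the message shape come from this thread.

```c
#include <errno.h>
#include <stdio.h>

/* Userspace stand-ins for two status codes; the values are illustrative. */
enum sketch_status {
	SKETCH_STS_IOERR,
	SKETCH_STS_OFFLINE,
	SKETCH_STS_COUNT,
};

/* One row per status: the errno handed to callers and the name used in logs. */
static const struct {
	int errno_val;
	const char *name;
} sketch_errors[SKETCH_STS_COUNT] = {
	[SKETCH_STS_IOERR]   = { -EIO, "I/O" },
	[SKETCH_STS_OFFLINE] = { -EIO, "device offline" },
};

/* Look up the errno that a caller would see for a given status. */
int sketch_status_to_errno(enum sketch_status sts)
{
	return sketch_errors[sts].errno_val;
}

/* Format a log line in the "<name> error, dev <disk>, sector <n>" shape. */
int sketch_format_error(char *buf, size_t len, enum sketch_status sts,
			const char *disk, unsigned long long sector)
{
	return snprintf(buf, len, "%s error, dev %s, sector %llu",
			sketch_errors[sts].name, disk, sector);
}
```

Because both rows carry -EIO, only the log text changes in this sketch; the value returned to the caller is the same for either status.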
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:52 ` Song Liu
2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:52 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> Currently, drivers report BLK_STS_IOERR for devices that are not fully
> online or being removed. This behavior could cause confusion for users,
> as they are not really I/O errors from the device.
>
> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
> offline error" in dmesg instead of "I/O error".
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  block/blk-core.c          | 1 +
>  include/linux/blk_types.h | 7 +++++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 61f6a0dc4511..24035dd2eef1 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -164,6 +164,7 @@ static const struct {
>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>
>  	/* device mapper special case, should not leak out: */
>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index fe065c394fff..5561e58d158a 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>   */
>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>
> +/*
> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
> + * or is being taken offline. This could help differentiate the case where a
> + * device is intentionally being shut down from a real I/O error.
> + */
> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
> +
>  /**
>   * blk_path_error - returns true if error may be path related
>   * @error: status the request was completed with
> --
> 2.30.2
>

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:52 ` Song Liu
@ 2022-02-03 7:24 ` Hannes Reinecke
2022-02-03 13:47 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03 7:24 UTC (permalink / raw)
To: Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

On 2/3/22 07:52, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
>
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>> online or being removed. This behavior could cause confusion for users,
>> as they are not really I/O errors from the device.
>>
>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>> offline error" in dmesg instead of "I/O error".
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>>  block/blk-core.c          | 1 +
>>  include/linux/blk_types.h | 7 +++++++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 61f6a0dc4511..24035dd2eef1 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -164,6 +164,7 @@ static const struct {
>>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>
>>  	/* device mapper special case, should not leak out: */
>>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index fe065c394fff..5561e58d158a 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>   */
>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>
>> +/*
>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>> + * or is being taken offline. This could help differentiate the case where a
>> + * device is intentionally being shut down from a real I/O error.
>> + */
>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>> +
>>  /**
>>   * blk_path_error - returns true if error may be path related
>>   * @error: status the request was completed with
>> --
>> 2.30.2
>>
Please do not overload EIO here.
EIO already is a catch-all error if we don't know any better, but for
the 'device offline' case we do (or rather should).
Please map it onto 'ENODEV' or 'ENXIO'.

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 7:24 ` Hannes Reinecke
@ 2022-02-03 13:47 ` Jens Axboe
2022-02-03 17:23 ` Song Liu
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 13:47 UTC (permalink / raw)
To: Hannes Reinecke, Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen

On 2/3/22 12:24 AM, Hannes Reinecke wrote:
> On 2/3/22 07:52, Song Liu wrote:
>> CC linux-block (it was a typo in the original email)
>>
>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>
>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>> online or being removed. This behavior could cause confusion for users,
>>> as they are not really I/O errors from the device.
>>>
>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>> offline error" in dmesg instead of "I/O error".
>>>
>>> Signed-off-by: Song Liu <song@kernel.org>
>>> ---
>>>  block/blk-core.c          | 1 +
>>>  include/linux/blk_types.h | 7 +++++++
>>>  2 files changed, 8 insertions(+)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 61f6a0dc4511..24035dd2eef1 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -164,6 +164,7 @@ static const struct {
>>>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>
>>>  	/* device mapper special case, should not leak out: */
>>>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>> index fe065c394fff..5561e58d158a 100644
>>> --- a/include/linux/blk_types.h
>>> +++ b/include/linux/blk_types.h
>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>   */
>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>
>>> +/*
>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>> + * or is being taken offline. This could help differentiate the case where a
>>> + * device is intentionally being shut down from a real I/O error.
>>> + */
>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>> +
>>>  /**
>>>   * blk_path_error - returns true if error may be path related
>>>   * @error: status the request was completed with
>>> --
>>> 2.30.2
>>>
> Please do not overload EIO here.
> EIO already is a catch-all error if we don't know any better, but for
> the 'device offline' case we do (or rather should).
> Please map it onto 'ENODEV' or 'ENXIO'.

It's deliberately EIO as not to force a change in behavior. I don't mind
using something else, but that should be a separate change then.

--
Jens Axboe

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 13:47 ` Jens Axboe
@ 2022-02-03 17:23 ` Song Liu
2022-02-03 18:51 ` Jens Axboe
2022-02-04 7:14 ` Hannes Reinecke
0 siblings, 2 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 17:23 UTC (permalink / raw)
To: Jens Axboe
Cc: Hannes Reinecke, Song Liu, linux-scsi@vger.kernel.org, linux-block@vger.kernel.org, Kernel Team, James E.J. Bottomley, Martin K. Petersen

Hi Hannes and Jens,

> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>> On 2/3/22 07:52, Song Liu wrote:
>>> CC linux-block (it was a typo in the original email)
>>>
>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>
>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>> online or being removed. This behavior could cause confusion for users,
>>>> as they are not really I/O errors from the device.
>>>>
>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>> offline error" in dmesg instead of "I/O error".
>>>>
>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>> ---
>>>>  block/blk-core.c          | 1 +
>>>>  include/linux/blk_types.h | 7 +++++++
>>>>  2 files changed, 8 insertions(+)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>
>>>>  	/* device mapper special case, should not leak out: */
>>>>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>> index fe065c394fff..5561e58d158a 100644
>>>> --- a/include/linux/blk_types.h
>>>> +++ b/include/linux/blk_types.h
>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>   */
>>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>
>>>> +/*
>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>> + * or is being taken offline. This could help differentiate the case where a
>>>> + * device is intentionally being shut down from a real I/O error.
>>>> + */
>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>> +
>>>>  /**
>>>>   * blk_path_error - returns true if error may be path related
>>>>   * @error: status the request was completed with
>>>> --
>>>> 2.30.2
>>>>
>> Please do not overload EIO here.
>> EIO already is a catch-all error if we don't know any better, but for
>> the 'device offline' case we do (or rather should).
>> Please map it onto 'ENODEV' or 'ENXIO'.
>
> It's deliberately EIO as not to force a change in behavior. I don't mind
> using something else, but that should be a separate change then.

Thanks for the feedback. Shall I send v2 with an extra patch that
changes EIO to ENODEV/ENXIO? Or shall we do that in a follow up patch?
Also, any preference between ENODEV and ENXIO?

Thanks,
Song

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 17:23 ` Song Liu
@ 2022-02-03 18:51 ` Jens Axboe
2022-02-04 7:14 ` Hannes Reinecke
1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 18:51 UTC (permalink / raw)
To: Song Liu, Jens Axboe
Cc: Hannes Reinecke, Song Liu, linux-scsi@vger.kernel.org, linux-block@vger.kernel.org, Kernel Team, James E.J. Bottomley, Martin K. Petersen

On 2/3/22 10:23 AM, Song Liu wrote:
> Hi Hannes and Jens,
>
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or being removed. This behavior could cause confusion for users,
>>>>> as they are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>>  block/blk-core.c          | 1 +
>>>>>  include/linux/blk_types.h | 7 +++++++
>>>>>  2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>>>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>>>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>>> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>>
>>>>>  	/* device mapper special case, should not leak out: */
>>>>>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>>   */
>>>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>>> +
>>>>>  /**
>>>>>   * blk_path_error - returns true if error may be path related
>>>>>   * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
>
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow up patch?
> Also, any preference between ENODEV and ENXIO?

Yeah I think so, and perhaps put a mention in this patch on why EIO is
chosen to not change the user visible return value.

--
Jens Axboe

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 17:23 ` Song Liu
2022-02-03 18:51 ` Jens Axboe
@ 2022-02-04 7:14 ` Hannes Reinecke
1 sibling, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-04 7:14 UTC (permalink / raw)
To: Song Liu, Jens Axboe
Cc: Song Liu, linux-scsi@vger.kernel.org, linux-block@vger.kernel.org, Kernel Team, James E.J. Bottomley, Martin K. Petersen

On 2/3/22 18:23, Song Liu wrote:
> Hi Hannes and Jens,
>
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or being removed. This behavior could cause confusion for users,
>>>>> as they are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>>  block/blk-core.c          | 1 +
>>>>>  include/linux/blk_types.h | 7 +++++++
>>>>>  2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>>  	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>>>  	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>>>  	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>>> +	[BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>>
>>>>>  	/* device mapper special case, should not leak out: */
>>>>>  	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>>   */
>>>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>>> +
>>>>>  /**
>>>>>   * blk_path_error - returns true if error may be path related
>>>>>   * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
>
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow up patch?
> Also, any preference between ENODEV and ENXIO?
>
Please make it an additional patch, and use ENODEV as a return value.

For this patch you can add:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply [flat|nested] 12+ messages in thread
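
The errno question in this sub-thread matters mostly to userspace callers that branch on the value returned by a failed read or write. The sketch below shows one way such a caller might react. It is a hypothetical policy for illustration only: the enum, the function name action_for_errno() and the choice of actions are assumptions, and it does not describe any existing tool or the eventual follow-up patch.

```c
#include <errno.h>

/* Hypothetical reactions of a storage-monitoring daemon to a failed I/O. */
enum io_fail_action {
	IO_FAIL_REPORT_MEDIA,	/* treat as a possible hardware fault */
	IO_FAIL_DEVICE_GONE,	/* device absent or offline; do not flag the disk */
	IO_FAIL_OTHER,		/* anything else is handled elsewhere */
};

/* Map a positive errno value to the action the daemon would take. */
enum io_fail_action action_for_errno(int err)
{
	switch (err) {
	case EIO:
		return IO_FAIL_REPORT_MEDIA;
	case ENODEV:
	case ENXIO:
		return IO_FAIL_DEVICE_GONE;
	default:
		return IO_FAIL_OTHER;
	}
}
```

As long as the offline status maps to EIO, a caller like this cannot separate the two cases by errno and has to rely on the log text; a later switch to ENODEV would change which branch is taken, which is the behavior change Jens asks to keep in a separate patch.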
* [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:40 ` Song Liu
2022-02-03 6:53 ` Song Liu
2022-02-03 6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu

The new error message for such a case looks like

[ 172.809565] device offline error, dev sda, sector 3138208 ...

which will not be confused with a regular I/O error (BLK_STS_IOERR).

Signed-off-by: Song Liu <song@kernel.org>
---
 drivers/scsi/scsi_lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..e30bc51578e9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
 		 * power management commands.
 		 */
 		if (req && !(req->rq_flags & RQF_PM))
-			return BLK_STS_IOERR;
+			return BLK_STS_OFFLINE;
 		return BLK_STS_OK;
 	}
 }
--
2.30.2

^ permalink raw reply related [flat|nested] 12+ messages in thread
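
The one-line change sits in a branch that, per the surrounding comment, lets only power management commands through. A reduced userspace model of that branch is sketched below. It assumes a simplified view of the function: the sketch_* types, the fully_online flag and the is_pm field are hypothetical stand-ins for the device state and the RQF_PM request flag, and the other device states handled by the real function are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative result codes standing in for the block status values. */
enum sketch_result { SKETCH_OK, SKETCH_IOERR, SKETCH_OFFLINE };

/* Hypothetical, reduced request: only records whether it is a PM command. */
struct sketch_request {
	bool is_pm;
};

/*
 * Mirrors the shape of the changed branch: when the device is not fully
 * online, only power-management requests are allowed to proceed.
 */
enum sketch_result sketch_state_check(bool fully_online,
				      const struct sketch_request *req)
{
	if (fully_online)
		return SKETCH_OK;
	if (req && !req->is_pm)
		return SKETCH_OFFLINE;	/* SKETCH_IOERR before the patch */
	return SKETCH_OK;
}
```

In this model a regular request to a device that is not fully online is rejected with the offline result, while a power-management request or a NULL request still passes, as in the quoted hunk.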
* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03 6:53 ` Song Liu
2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:53 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> The new error message for such a case looks like
>
> [ 172.809565] device offline error, dev sda, sector 3138208 ...
>
> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  drivers/scsi/scsi_lib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..e30bc51578e9 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
>  		 * power management commands.
>  		 */
>  		if (req && !(req->rq_flags & RQF_PM))
> -			return BLK_STS_IOERR;
> +			return BLK_STS_OFFLINE;
>  		return BLK_STS_OK;
>  	}
>  }
> --
> 2.30.2
>

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:53 ` Song Liu
@ 2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03 7:24 UTC (permalink / raw)
To: Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

On 2/3/22 07:53, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
>
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> The new error message for such a case looks like
>>
>> [ 172.809565] device offline error, dev sda, sector 3138208 ...
>>
>> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>>  drivers/scsi/scsi_lib.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 0a70aa763a96..e30bc51578e9 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
>>  		 * power management commands.
>>  		 */
>>  		if (req && !(req->rq_flags & RQF_PM))
>> -			return BLK_STS_IOERR;
>> +			return BLK_STS_OFFLINE;
>>  		return BLK_STS_OK;
>>  	}
>>  }
>> --
>> 2.30.2
>>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03 6:52 ` Song Liu
2 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:52 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> We have a use case where HDDs are regularly powered on/off to preserve power.
> When a drive is being removed, we often see errors like
>
> [ 172.803279] I/O error, dev sda, sector 3137184
>
> These messages are confusing for automations that grep dmesg, as they look
> very similar to real HDD errors.
>
> Solve this issue with a new block state BLK_STS_OFFLINE. After the change,
> the error message looks like
>
> [ 172.803279] device offline error, dev sda, sector 3137184
>
> so that the automations won't confuse them with real I/O errors.
>
> Song Liu (2):
>   block: introduce BLK_STS_OFFLINE
>   scsi: use BLK_STS_OFFLINE for not fully online devices
>
>  block/blk-core.c          | 1 +
>  drivers/scsi/scsi_lib.c   | 2 +-
>  include/linux/blk_types.h | 7 +++++++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> --
> 2.30.2

^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2022-02-04 7:14 UTC | newest]

Thread overview: 12+ messages
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
2022-02-03 6:52   ` Song Liu
2022-02-03 7:24     ` Hannes Reinecke
2022-02-03 13:47       ` Jens Axboe
2022-02-03 17:23         ` Song Liu
2022-02-03 18:51           ` Jens Axboe
2022-02-04 7:14           ` Hannes Reinecke
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
2022-02-03 6:53   ` Song Liu
2022-02-03 7:24     ` Hannes Reinecke
2022-02-03 6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu