From: Anand Jain <anajain.sg@gmail.com>
To: Boris Burkov <boris@bur.io>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2] btrfs: btrfs_log_dev_io_error() on all bio errors
Date: Tue, 7 Apr 2026 14:36:57 +0800 [thread overview]
Message-ID: <d5b43371-6a42-4dac-95b6-a1bc4fa12424@gmail.com> (raw)
In-Reply-To: <20260406161255.GA1743674@zen.localdomain>
On 7/4/26 00:12, Boris Burkov wrote:
> On Mon, Apr 06, 2026 at 09:56:07PM +0800, Anand Jain wrote:
>>
>>
>> Disagree. We should only track block-device-specific errors;
>> specifically, only the missing MEDIUM ERROR should be added [1].
>>
>> Including all error types would capture transport and fabric-related
>> errors, which do not justify a device replacement or permanent error state.
>>
>> In my experience with storage and transport protocols, transport errors
>> and timeouts are often transient and far more frequent than actual block
>> device media failures. Including them would introduce unnecessary noise
>> into the device statistics.
>
> This is just my mistake. I messed up with git and resent my v1 instead
> of the updated version after Christoph's review here
> https://lore.kernel.org/linux-btrfs/20260327102253.GA18582@lst.de/
>
> I intended to only log bdev error on IOERR, TARGET, MEDIUM, and
> PROTECTION as he suggested.
>
> Sorry for wasting your time with that, Anand.
My bad - I should have checked v1. Thanks.
> Boris
>
>>
>>
>> #define BLK_STS_NOTSUPP ((__force blk_status_t)1)
>> #define BLK_STS_TIMEOUT ((__force blk_status_t)2)
>> #define BLK_STS_NOSPC ((__force blk_status_t)3)
>> #define BLK_STS_TRANSPORT ((__force blk_status_t)4)
>> #define BLK_STS_TARGET ((__force blk_status_t)5)
>> #define BLK_STS_RESV_CONFLICT ((__force blk_status_t)6)
>> #define BLK_STS_MEDIUM ((__force blk_status_t)7)
>> #define BLK_STS_PROTECTION ((__force blk_status_t)8)
>> #define BLK_STS_RESOURCE ((__force blk_status_t)9)
>> #define BLK_STS_IOERR ((__force blk_status_t)10)
>> <snap>
>> #define BLK_STS_OFFLINE ((__force blk_status_t)16)
>> #define BLK_STS_DURATION_LIMIT ((__force blk_status_t)17)
>> #define BLK_STS_INVAL ((__force blk_status_t)19)
>>
>>
>>
>> [1]
>>
>>
>> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
>> index 2a2a21aec817..d7af38e9ce29 100644
>> --- a/fs/btrfs/bio.c
>> +++ b/fs/btrfs/bio.c
>> @@ -352,7 +352,8 @@ static void btrfs_log_dev_io_error(const struct bio
>> *bio, struct btrfs_device *d
>> {
>> if (!dev || !dev->bdev)
>> return;
>> - if (bio->bi_status != BLK_STS_IOERR && bio->bi_status !=
>> BLK_STS_TARGET)
>> + if (bio->bi_status != BLK_STS_IOERR && bio->bi_status !=
>> BLK_STS_TARGET &&
>> + bio->bi_status != BLK_STS_MEDIUM)
>> return;
>>
>> if (btrfs_op(bio) == BTRFS_MAP_WRITE)
>>
>>
>>
>> Thanks, Anand
>>
>> On 4/4/26 23:57, Boris Burkov wrote:
>>> As far as I can tell, we never intentionally constrained ourselves to
>>> these status codes, and it is misleading and surprising to lack the
>>> bdev error logging when we get a different error code from the block
>>> layer. This can lead to jumping to a wrong conclusion like "this
>>> system didn't see any bio failures but aborted with EIO".
>>>
>>> For example on nvme devices, I observe many failures coming back as
>>> BLK_STS_MEDIUM. It is apparent that the nvme driver returns a variety of
>>> BLK_STS_* status values in nvme_error_status().
>>>
>>> So handle the known expected errors and make some noise on the rest
>>> which we expect won't really happen.
>>>
>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>> ---
>>> Changelog:
>>> v2:
>>> - proper bdev err logging for expected block errors
>>> - btrfs_warn_rl for all other errors
>>> ---
>>> fs/btrfs/bio.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
>>> index 2a2a21aec817..08b1d62603d0 100644
>>> --- a/fs/btrfs/bio.c
>>> +++ b/fs/btrfs/bio.c
>>> @@ -352,8 +352,7 @@ static void btrfs_log_dev_io_error(const struct bio *bio, struct btrfs_device *d
>>> {
>>> if (!dev || !dev->bdev)
>>> return;
>>> - if (bio->bi_status != BLK_STS_IOERR && bio->bi_status != BLK_STS_TARGET)
>>> - return;
>>> + ASSERT(bio->bi_status);
>>>
>>> if (btrfs_op(bio) == BTRFS_MAP_WRITE)
>>> btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_WRITE_ERRS);
>>
prev parent reply other threads:[~2026-04-07 6:37 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-04 15:57 [PATCH v2] btrfs: btrfs_log_dev_io_error() on all bio errors Boris Burkov
2026-04-06 13:56 ` Anand Jain
2026-04-06 16:12 ` Boris Burkov
2026-04-07 6:36 ` Anand Jain [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d5b43371-6a42-4dac-95b6-a1bc4fa12424@gmail.com \
--to=anajain.sg@gmail.com \
--cc=boris@bur.io \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox