Re: [PATCH v2] btrfs: btrfs_log_dev_io_error() on all bio errors

public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Anand Jain <anajain.sg@gmail.com>
To: Boris Burkov <boris@bur.io>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2] btrfs: btrfs_log_dev_io_error() on all bio errors
Date: Tue, 7 Apr 2026 14:36:57 +0800	[thread overview]
Message-ID: <d5b43371-6a42-4dac-95b6-a1bc4fa12424@gmail.com> (raw)
In-Reply-To: <20260406161255.GA1743674@zen.localdomain>

On 7/4/26 00:12, Boris Burkov wrote:
> On Mon, Apr 06, 2026 at 09:56:07PM +0800, Anand Jain wrote:
>>
>>
>> Disagree. We should only track block-device-specific errors;
>> specifically, only the missing MEDIUM ERROR should be added [1].
>>
>> Including all error types would capture transport and fabric-related
>> errors, which do not justify a device replacement or permanent error state.
>>
>> In my experience with storage and transport protocols, transport errors
>> and timeouts are often transient and far more frequent than actual block
>> device media failures. Including them would introduce unnecessary noise
>> into the device statistics.
> 
> This is just my mistake. I messed up with git and resent my v1 instead
> of the updated version after Christoph's review here
> https://lore.kernel.org/linux-btrfs/20260327102253.GA18582@lst.de/
> 
> I intended to only log bdev error on IOERR, TARGET, MEDIUM, and
> PROTECTION as he suggested.
> 
> Sorry for wasting your time with that, Anand.

My bad - I should have checked v1. Thanks.


> Boris
> 
>>
>>
>> #define BLK_STS_NOTSUPP         ((__force blk_status_t)1)
>> #define BLK_STS_TIMEOUT         ((__force blk_status_t)2)
>> #define BLK_STS_NOSPC           ((__force blk_status_t)3)
>> #define BLK_STS_TRANSPORT       ((__force blk_status_t)4)
>> #define BLK_STS_TARGET          ((__force blk_status_t)5)
>> #define BLK_STS_RESV_CONFLICT   ((__force blk_status_t)6)
>> #define BLK_STS_MEDIUM          ((__force blk_status_t)7)
>> #define BLK_STS_PROTECTION      ((__force blk_status_t)8)
>> #define BLK_STS_RESOURCE        ((__force blk_status_t)9)
>> #define BLK_STS_IOERR           ((__force blk_status_t)10)
>> <snap>
>> #define BLK_STS_OFFLINE         ((__force blk_status_t)16)
>> #define BLK_STS_DURATION_LIMIT  ((__force blk_status_t)17)
>> #define BLK_STS_INVAL           ((__force blk_status_t)19)
>>
>>
>>
>> [1]
>>
>>
>> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
>> index 2a2a21aec817..d7af38e9ce29 100644
>> --- a/fs/btrfs/bio.c
>> +++ b/fs/btrfs/bio.c
>> @@ -352,7 +352,8 @@ static void btrfs_log_dev_io_error(const struct bio
>> *bio, struct btrfs_device *d
>>  {
>>         if (!dev || !dev->bdev)
>>                 return;
>> -       if (bio->bi_status != BLK_STS_IOERR && bio->bi_status !=
>> BLK_STS_TARGET)
>> +       if (bio->bi_status != BLK_STS_IOERR && bio->bi_status !=
>> BLK_STS_TARGET &&
>> +           bio->bi_status != BLK_STS_MEDIUM)
>>                 return;
>>
>>         if (btrfs_op(bio) == BTRFS_MAP_WRITE)
>>
>>
>>
>> Thanks, Anand
>>
>> On 4/4/26 23:57, Boris Burkov wrote:
>>> As far as I can tell, we never intentionally constrained ourselves to
>>> these status codes, and it is misleading and surprising to lack the
>>> bdev error logging when we get a different error code from the block
>>> layer. This can lead to jumping to a wrong conclusion like "this
>>> system didn't see any bio failures but aborted with EIO".
>>>
>>> For example on nvme devices, I observe many failures coming back as
>>> BLK_STS_MEDIUM. It is apparent that the nvme driver returns a variety of
>>> BLK_STS_* status values in nvme_error_status().
>>>
>>> So handle the known expected errors and make some noise on the rest
>>> which we expect won't really happen.
>>>
>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>> ---
>>> Changelog:
>>> v2:
>>> - proper bdev err logging for expected block errors
>>> - btrfs_warn_rl for all other errors
>>> ---
>>>  fs/btrfs/bio.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
>>> index 2a2a21aec817..08b1d62603d0 100644
>>> --- a/fs/btrfs/bio.c
>>> +++ b/fs/btrfs/bio.c
>>> @@ -352,8 +352,7 @@ static void btrfs_log_dev_io_error(const struct bio *bio, struct btrfs_device *d
>>>  {
>>>  	if (!dev || !dev->bdev)
>>>  		return;
>>> -	if (bio->bi_status != BLK_STS_IOERR && bio->bi_status != BLK_STS_TARGET)
>>> -		return;
>>> +	ASSERT(bio->bi_status);
>>>  
>>>  	if (btrfs_op(bio) == BTRFS_MAP_WRITE)
>>>  		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_WRITE_ERRS);
>>

     prev parent reply	other threads:[~2026-04-07  6:37 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-04 15:57 [PATCH v2] btrfs: btrfs_log_dev_io_error() on all bio errors Boris Burkov
2026-04-06 13:56 ` Anand Jain
2026-04-06 16:12   ` Boris Burkov
2026-04-07  6:36     ` Anand Jain [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d5b43371-6a42-4dac-95b6-a1bc4fa12424@gmail.com \
    --to=anajain.sg@gmail.com \
    --cc=boris@bur.io \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox