Re: How to fix "BTRFS error (device dm-3): error writing primary super block to device 1"?

public inbox for linux-btrfs@vger.kernel.org

From: Kai Stian Olstad <btrfs+list@olstad.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: How to fix "BTRFS error (device dm-3): error writing primary super block to device 1"?
Date: Sat, 12 Apr 2025 03:02:04 +0200
Message-ID: <b669450cfb7690e99cc4d9c63daa0680@olstad.com>
In-Reply-To: <3d2074dc-a36b-4fc2-8e20-52cf40584b38@gmx.com>

On 12.04.2025 02:43, Qu Wenruo wrote:
> On 2025/4/12 09:59, Kai Stian Olstad wrote:
>> On 12.04.2025 00:10, Qu Wenruo wrote:
>>> On 2025/4/12 01:18, Kai Stian Olstad wrote:
>>>> Kubuntu 24.04
>>>> Kernel 6.8.0-57-generic
>>>> 
>>>> 2 days ago I got a sector error on one of the Btrfs disks
>>>> 
>>>> $ journalctl -k -S 2025-04-09 | grep -A 20 mpt3sas_cm0
>>>> Apr 09 03:16:26 cb kernel: mpt3sas_cm0: log_info(0x31080000):
>>>> originator(PL), code(0x08), sub_code(0x0000)
>>>> Apr 09 03:16:26 cb kernel: mpt3sas_cm0: log_info(0x31080000):
>>>> originator(PL), code(0x08), sub_code(0x0000)
>>>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 FAILED Result:
>>>> hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=6s
>>>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 Sense Key :
>>>> Illegal Request [current]
>>>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 Add. Sense:
>>>> Logical block address out of range
>>>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 CDB: Write(16)
>>>> 8a 08 00 00 00 00 00 00 10 80 00 00 00 08 00 00
>>>> Apr 09 03:16:26 cb kernel: critical target error, dev sdd, sector
>>>> 4224 op 0x1:(WRITE) flags 0x23800 phys_seg 1 prio class 0
>>> 
>>> This error is completely from the lower layer (the block device).
>>> 
>>> Neither Btrfs nor the LUKS layer on top of the disk can do anything
>>> about it.
>> 
>> Thank you for the response.
>> 
>> This disk supports scterc
>> 
>> $ sudo smartctl -l scterc /dev/sdd
>> SCT Error Recovery Control:
>>             Read:     70 (7.0 seconds)
>>            Write:     70 (7.0 seconds)
>> 
>> Doesn't that mean that the disk gives up after 7 seconds, and then the
>> sector is mapped to a spare?
>> So if Btrfs does a write to the sector again it will be written to the
>> spare?
>> 
>> I've experienced numerous sector errors throughout the years with mdadm
>> and they have been fixed with a check.
>> Also a few with Btrfs I think, but they have been fixed automatically.
> 
> Whatever the feature is, it's the block device driver's behavior.
> 
> Btrfs only errors out because the disk reported the write failed.
> 
> For the detailed reason you should check these lines:
> 
>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 FAILED Result:
>> hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=6s
>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 Sense Key :
>> Illegal Request [current]
>> Apr 09 03:16:26 cb kernel: sd 4:0:1:0: [sdd] tag#5552 Add. Sense:
>> Logical block address out of range

I'll check them, but this is what I usually see when a disk has a 
sector error.
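
For reference, the CDB bytes in the log can be decoded by hand to see exactly which blocks the failed write targeted. The sketch below is a minimal Python decoder, assuming the standard WRITE(16) layout (opcode 0x8a, a flags byte, an 8-byte big-endian LBA, a 4-byte transfer length, then group and control bytes). The function name `decode_write16` is made up for illustration.

```python
import struct


def decode_write16(cdb_hex: str) -> dict:
    """Decode a SCSI WRITE(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    # Big-endian: opcode, flags, 8-byte LBA, 4-byte length, group, control
    _opcode, flags, lba, nblocks, _group, _control = struct.unpack(">BBQIBB", cdb)
    return {
        "lba": lba,                  # first logical block of the write
        "blocks": nblocks,           # number of logical blocks written
        "fua": bool(flags & 0x08),   # Force Unit Access bit
    }


if __name__ == "__main__":
    # CDB copied from the kernel log quoted in this thread
    print(decode_write16("8a 08 00 00 00 00 00 00 10 80 00 00 00 08 00 00"))
```

For the CDB in the log this gives LBA 4224 (matching "sector 4224" in the kernel message), a length of 8 blocks, and the FUA bit set. The Btrfs primary superblock sits at byte offset 64 KiB, which is 128 sectors of 512 bytes, and 4224 - 128 = 4096 sectors. If the LUKS data offset on this disk happens to be 4096 sectors (2 MiB), which is an assumption that `cryptsetup luksDump` could confirm, then the failed write would line up with the primary superblock write that Btrfs reported.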


>> So why not this time?
>> To me this looks like an ordinary faulty sector that can be "fixed" with
>> a write?
>> 
> I'm not sure what the "SCT Error recovery control" feature is, but
> if it is designed to re-map a write, it should not return -EIO for the
> initial write failure, but OK as long as the write eventually
> succeeded.
> 
> It should not require any upper layer to do any extra work.
> 
> But since the write eventually failed, there is nothing the upper layer
> can do, unless the dm or fs layer has some extra recovery mechanism.

Now I'm confused. I'm running RAID1 and only one disk has/had a single 
sector failure.
Shouldn't Btrfs manage to write this data? It should exist on one of 
the other drives because of RAID1.
And shouldn't a scrub fix it?

Since I don't get any other errors from the block layer, either the 
sector is fixed/remapped or Btrfs doesn't try to fix the data in scrub?
If it had tried and the sector were still bad, I should get sector errors 
from the disk multiple times.
But this error appears in the logs only that one time.

What is the purpose of Btrfs RAID1 if it doesn't try to fix this, by 
writing the data again from the good copy?
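
To make the repair idea in the question concrete, here is a toy Python model of "write the data again from the good copy". It is only a sketch of the general mirror-repair technique, not how Btrfs implements scrub: the function name `scrub_block`, the use of CRC32 as the checksum, and the in-memory list standing in for two mirrored devices are all assumptions for illustration.

```python
import zlib
from typing import List


def scrub_block(copies: List[bytes], expected_crc: int) -> List[int]:
    """Rewrite every mirror copy whose checksum is wrong from an intact copy.

    Returns the indices of the copies that were repaired.
    """
    # Find any copy that still matches the stored checksum
    good = next((c for c in copies if zlib.crc32(c) == expected_crc), None)
    if good is None:
        raise IOError("no intact copy left to repair from")
    repaired = []
    for i, copy in enumerate(copies):
        if zlib.crc32(copy) != expected_crc:
            copies[i] = good  # overwrite the bad copy with the good one
            repaired.append(i)
    return repaired


if __name__ == "__main__":
    data = b"block contents"
    mirrors = [b"corrupted data", data]  # mirror 0 is bad, mirror 1 is good
    print(scrub_block(mirrors, zlib.crc32(data)))
```

In this model a repair is triggered only when a copy that was read back fails its checksum. A write that the device rejected outright, as in the log above, is a different event and is not covered by the sketch.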

-- 
Kai Stian Olstad
