Linux RAID subsystem development
* Bogus entry in mdadm bad blocks list and question about particular commit
From: Sarah Newman @ 2016-02-02 20:34 UTC
  To: linux-raid

I added some more drives to a RAID 1 array last night; some of the devices already had bad block entries. There was nothing of particular interest in
/var/log/messages. Afterwards, one device reports:

$ sudo /sbin/mdadm --examine-badblocks /dev/sdl1
Bad-blocks on /dev/sdl1:
   11986270392325491 for 51 sectors

The total number of sectors on the drive is 3907029168.

The start sector in the bad blocks list is 0x2a95730cea9573 in hex; I don't know whether that value has any special significance.
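
One way to probe whether the value means anything is to re-pack the entry the way it might be stored on disk and look at the raw 64-bit word. The sketch below assumes a layout that is not stated anywhere in this thread: start sector in the high 54 bits, sector count in the low 10 bits, and no extra shift applied. The helper names `pack_entry` and `unpack_entry` are made up for illustration.

```python
# Sketch only. The packing (start sector in the high 54 bits, sector count
# in the low 10 bits, no bblog shift) is an ASSUMPTION about the v1.x
# on-disk bad block log; pack_entry/unpack_entry are hypothetical helpers.
COUNT_BITS = 10
COUNT_MASK = (1 << COUNT_BITS) - 1


def pack_entry(sector, count):
    """Pack a (start sector, sector count) pair into one 64-bit value."""
    return (sector << COUNT_BITS) | (count & COUNT_MASK)


def unpack_entry(raw):
    """Split a packed 64-bit value back into (start sector, sector count)."""
    return raw >> COUNT_BITS, raw & COUNT_MASK


start, count = 11986270392325491, 51   # values from --examine-badblocks
drive_sectors = 3907029168             # total sectors reported for the drive

raw = pack_entry(start, count)
print(f"start sector       : {start:#x}")
print(f"packed entry       : {raw:#018x}")
print(f"beyond end of drive: {start >= drive_sectors}")
```

Under that assumption the packed entry prints as 0xaa55cc33aa55cc33, a repeating byte pattern rather than a plausible sector address. That would be consistent with the entry being stray data instead of a recorded I/O error, although nothing in this thread confirms it.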

$ sudo /sbin/mdadm --examine /dev/sdl1
/dev/sdl1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x8

     Raid Level : raid1
   Raid Devices : 11

 Avail Dev Size : 20971368 (10.00 GiB 10.74 GB)
     Array Size : 10485632 (10.00 GiB 10.74 GB)
  Used Dev Size : 20971264 (10.00 GiB 10.74 GB)
   Super Offset : 20971504 sectors
   Unused Space : before=0 sectors, after=232 sectors
          State : clean

    Update Time : Tue Feb  2 10:50:33 2016
  Bad Block Log : 512 entries available at offset -8 sectors - bad blocks present.
       Checksum : 4092aabf - correct
         Events : 143560


   Device Role : Active device 9
   Array State : AAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
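
The layout figures in this output can be cross-checked with a little arithmetic. The sketch below assumes 512-byte sectors, one 8-byte word per log entry, a log offset that is relative to the start of the superblock, and data that starts at sector 0 of the partition (suggested by before=0); none of these assumptions is spelled out in the thread.

```python
# Sketch only. ASSUMPTIONS: 512-byte sectors, 8 bytes per bad block log
# entry, log offset measured from the superblock, data starting at sector 0.
SECTOR_BYTES = 512
ENTRY_BYTES = 8

super_offset = 20971504    # "Super Offset", in sectors
bblog_offset = -8          # "Bad Block Log ... at offset -8 sectors"
bblog_entries = 512        # "512 entries available"
used_dev_size = 20971264   # "Used Dev Size", in sectors
unused_after = 232         # "Unused Space : ... after=232 sectors"

bblog_start = super_offset + bblog_offset
bblog_sectors = bblog_entries * ENTRY_BYTES // SECTOR_BYTES

print("bad block log starts at sector", bblog_start)
print("bad block log length in sectors", bblog_sectors)
print("end of data plus unused space  ", used_dev_size + unused_after)
```

With these inputs the log occupies the 8 sectors immediately before the superblock, and the 232 unused sectors after the data end exactly where the log begins, so the reported geometry is at least self-consistent.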

The kernel is built from the Xen4CentOS 3.18.21-17 kernel, but there are no changes related to md in the code.

I looked for differences between 3.18.21 and stable 3.18.y, and the only interesting thing looked like e9206476ace "md/raid1: submit_bio_wait()
returns 0 on success". I don't think that's a smoking gun for the bogus bad blocks entry. But is it correct that, without that commit, mdadm with bad
blocks enabled is completely broken when there are write errors, because successful writes are reported as errors and vice versa?

Thanks, Sarah


* Re: Bogus entry in mdadm bad blocks list and question about particular commit
From: Jes Sorensen @ 2016-02-02 20:46 UTC
  To: Sarah Newman; +Cc: linux-raid

Sarah Newman <srn@prgmr.com> writes:
> I added some more drives to a raid 1 last night where some devices had existing bad block entries. There was nothing of particular interest in
> /var/log/messages. Afterwards there is:
>
> $ sudo /sbin/mdadm --examine-badblocks /dev/sdl1
> Bad-blocks on /dev/sdl1:
>    11986270392325491 for 51 sectors
>
> The total number of sectors on the drive is 3907029168.
>
> The start sector in the bad blocks list is 2a95730cea9573 in hex, I don't know if that string has any special significance or not.
>
> $ sudo /sbin/mdadm --examine /dev/sdl1
> /dev/sdl1:
>           Magic : a92b4efc
>         Version : 1.0
>     Feature Map : 0x8
>
>      Raid Level : raid1
>    Raid Devices : 11
>
>  Avail Dev Size : 20971368 (10.00 GiB 10.74 GB)
>      Array Size : 10485632 (10.00 GiB 10.74 GB)
>   Used Dev Size : 20971264 (10.00 GiB 10.74 GB)
>    Super Offset : 20971504 sectors
>    Unused Space : before=0 sectors, after=232 sectors
>           State : clean
>
>     Update Time : Tue Feb  2 10:50:33 2016
>   Bad Block Log : 512 entries available at offset -8 sectors - bad blocks present.
>        Checksum : 4092aabf - correct
>          Events : 143560
>
>
>    Device Role : Active device 9
>    Array State : AAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
>
> The kernel is built from the Xen4CentOS 3.18.21-17 kernel, but there
> are no changes related to md in the code.
>
> I looked for differences in between 3.18.21 and stable 3.18.y and the
> only interesting thing looked like e9206476ace "md/raid1:
> submit_bio_wait() returns 0 on success". I don't think that's a
> smoking gun for the bogus bad blocks entry. But without that commit,
> mdadm with bad blocks enabled is completely broken if there are write
> errors such that successful writes are reported as errors and vice
> versa, is that correct?

Sarah,

If I remember correctly, yes: bad block handling was pretty wedged
without that patch, since it broke the narrowing down of the problem. If
you have to run such an old kernel and you have raid1 and/or raid10
arrays, you really ought to backport
681ab4696062f5aa939c9e04d058732306a97176 and
203d27b0226a05202438ddb39ef0ef1acb14a759.

Jes
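
The failure mode discussed in this exchange, a "returns 0 on success" result being read the wrong way round, can be illustrated with a toy model. This is not kernel code: `submit_write`, `narrow_buggy` and `narrow_fixed` are hypothetical stand-ins that only mimic the return convention named in the commit title.

```python
# Toy model only, NOT kernel code. All names here are hypothetical; the
# point is what happens when "0 on success" is treated as "0 means failure".

def submit_write(device_ok):
    """Return 0 on success and a negative value on failure."""
    return 0 if device_ok else -5


def narrow_buggy(chunks):
    """Inverted test: flags a chunk as bad when the write returned 0."""
    return [i for i, ok in enumerate(chunks) if submit_write(ok) == 0]


def narrow_fixed(chunks):
    """Correct test: flags a chunk as bad only on a negative return."""
    return [i for i, ok in enumerate(chunks) if submit_write(ok) < 0]


# One failing chunk (index 2) among otherwise good writes.
chunks = [True, True, False, True]
print("buggy marks bad:", narrow_buggy(chunks))   # [0, 1, 3]
print("fixed marks bad:", narrow_fixed(chunks))   # [2]
```

In this model the inverted test flags every chunk that wrote successfully and misses the one that failed, which matches the "successful writes are reported as errors and vice versa" description in the question.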


* Re: Bogus entry in mdadm bad blocks list and question about particular commit
From: Sarah Newman @ 2016-02-02 21:09 UTC
  To: Jes Sorensen; +Cc: linux-raid

On 02/02/2016 12:46 PM, Jes Sorensen wrote:
> Sarah Newman <srn@prgmr.com> writes:
>> I added some more drives to a raid 1 last night where some devices had existing bad block entries. There was nothing of particular interest in
>> /var/log/messages. Afterwards there is:
>>
>> $ sudo /sbin/mdadm --examine-badblocks /dev/sdl1
>> Bad-blocks on /dev/sdl1:
>>    11986270392325491 for 51 sectors
>>
>> The total number of sectors on the drive is 3907029168.
>>
>> The start sector in the bad blocks list is 2a95730cea9573 in hex, I don't know if that string has any special significance or not.

>> I looked for differences in between 3.18.21 and stable 3.18.y and the
>> only interesting thing looked like e9206476ace "md/raid1:
>> submit_bio_wait() returns 0 on success". I don't think that's a
>> smoking gun for the bogus bad blocks entry. But without that commit,
>> mdadm with bad blocks enabled is completely broken if there are write
>> errors such that successful writes are reported as errors and vice
>> versa, is that correct?
> 
> Sarah,
> 
> If I remember correctly, yes then badblocks handling was pretty wedged
> without that patch since it broke the narrowing down of the problem. If
> you have to run such an old kernel, you really ought to backport
> 681ab4696062f5aa939c9e04d058732306a97176 and
> 203d27b0226a05202438ddb39ef0ef1acb14a759 if you have raid1 and/or raid10
> arrays.

Noted.

What about the bad blocks entry with the completely bogus sector? Is there a plausible explanation for that?

Thanks, Sarah
