Linux RAID subsystem development
* cluster-md mddev->in_sync & mddev->safemode_delay may have bug
@ 2020-07-15  3:48 heming.zhao
  2020-07-15 18:17 ` Guoqing Jiang
  2020-07-16  0:54 ` NeilBrown
  0 siblings, 2 replies; 8+ messages in thread
From: heming.zhao @ 2020-07-15  3:48 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, guoqing.jiang

Hello List,


@Neil  @Guoqing,
Would you have time to take a look at this bug?

This mail replaces my previous mail, "commit 480523feae581 may introduce a bug".
The previous mail had some unclear descriptions, so I sorted them out and am resending in this mail.

This bug was reported from a SUSE customer.

In a cluster-md env, after the steps below, "mdadm -D /dev/md0" shows "State: active" all the time.
```
# mdadm -S --scan
# mdadm --zero-superblock /dev/sd{a,b}
# mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb

# mdadm -D /dev/md0
/dev/md0:
            Version : 1.2
      Creation Time : Mon Jul  6 12:02:23 2020
         Raid Level : raid1
         Array Size : 64512 (63.00 MiB 66.06 MB)
      Used Dev Size : 64512 (63.00 MiB 66.06 MB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent

      Intent Bitmap : Internal

        Update Time : Mon Jul  6 12:02:24 2020
              State : active <==== this line
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0

Consistency Policy : bitmap

               Name : lp-clustermd1:0  (local to host lp-clustermd1)
       Cluster Name : hacluster
               UUID : 38ae5052:560c7d36:bb221e15:7437f460
             Events : 18

     Number   Major   Minor   RaidDevice State
        0       8        0        0      active sync   /dev/sda
        1       8       16        1      active sync   /dev/sdb
```

With commit 480523feae581 (author: Neil Brown), try_set_sync is never true for a clustered array, because it is derived from mddev->safemode, which stays 0 in cluster mode (safe_delay_store() refuses to set it). As a result, mddev->in_sync is always 0.
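For reference, the logic that commit introduced in md_check_recovery() looks roughly like this (paraphrased from drivers/md/md.c from memory; exact context may differ by kernel version):

```
  void md_check_recovery(struct mddev *mddev)
  {
      ... ...
      /* only attempt set_in_sync() when safemode has been triggered */
      bool try_set_sync = mddev->safemode != 0;
      ... ...
      if (try_set_sync && !mddev->external && !mddev->in_sync) {
          spin_lock(&mddev->lock);
          set_in_sync(mddev);
          spin_unlock(&mddev->lock);
      }
      ... ...
  }
```

Since a clustered array never leaves safemode == 0, the set_in_sync() call above is never reached.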

The simplest fix is to bypass the try_set_sync check when the array is clustered:
```
  void md_check_recovery(struct mddev *mddev)
  {
     ... ...
         if (mddev_is_clustered(mddev)) {
             struct md_rdev *rdev;
             /* kick the device if another node issued a
              * remove disk.
              */
             rdev_for_each(rdev, mddev) {
                 if (test_and_clear_bit(ClusterRemove, &rdev->flags) &&
                         rdev->raid_disk < 0)
                     md_kick_rdev_from_array(rdev);
             }
+           try_set_sync = 1;
         }
     ... ...
  }
```
This fix effectively disables commit 480523feae581 in a clustered env.
I want to know what the impact of the above fix would be.
Or is there another solution for this issue?


--------
And for the mddev->safemode_delay issue:

There is also another bug, triggered when an array's bitmap is changed from internal to clustered:
/sys/block/mdX/md/safe_mode_delay keeps its original value after the bitmap type is changed.
In safe_delay_store(), the code forbids setting mddev->safemode_delay when the array is clustered,
so in a cluster-md env the expected safemode_delay value is 0.
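For reference, the check in safe_delay_store() looks roughly like this (paraphrased from drivers/md/md.c from memory; exact wording may differ by kernel version):

```
  static ssize_t
  safe_delay_store(struct mddev *mddev, const char *cbuf, size_t len)
  {
      /* safemode is not supported for clustered arrays */
      if (mddev_is_clustered(mddev)) {
          pr_warn("md: Safemode is disabled for clustered mode\n");
          return -EINVAL;
      }
      ... ...
  }
```

So once the bitmap is clustered, userspace can no longer correct the stale value by writing to the sysfs file.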

Reproduction steps:
```
# mdadm --zero-superblock /dev/sd{b,c,d}
# mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc
# cat /sys/block/md0/md/safe_mode_delay
0.204
# mdadm -G /dev/md0 -b none
# mdadm --grow /dev/md0 --bitmap=clustered
# cat /sys/block/md0/md/safe_mode_delay
0.204  <== doesn't change, should be ZERO for cluster-md
```
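A possible fix would be to reset the safemode parameters when the bitmap is switched to clustered, so the array ends up in the same state as a freshly created clustered array. This is an untested sketch; where exactly it belongs (e.g. the --grow bitmap path in drivers/md/md.c) is an assumption on my side:

```
  /* Untested sketch: when an array's bitmap becomes clustered, reset
   * the safemode parameters to match a freshly created clustered array.
   * The exact hook point is an assumption.
   */
  if (mddev_is_clustered(mddev)) {
      mddev->safemode_delay = 0;
      mddev->safemode = 0;
  }
```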

thanks


Thread overview: 8+ messages
2020-07-15  3:48 cluster-md mddev->in_sync & mddev->safemode_delay may have bug heming.zhao
2020-07-15 18:17 ` Guoqing Jiang
2020-07-15 18:40   ` heming.zhao
2020-07-15 19:12     ` Guoqing Jiang
2020-07-16  0:54 ` NeilBrown
2020-07-16  5:52   ` heming.zhao
2020-07-16  6:10     ` Song Liu
2020-07-16  6:22       ` heming.zhao
