From: NeilBrown
Subject: Re: [PATCH - for v3.7] DM-RAID: Fix RAID10's check for sufficient redundancy
Date: Thu, 24 Jan 2013 12:17:10 +1100
Message-ID: <20130124121710.6220bb60@notabene.brown>
In-Reply-To: <1358912538.5871.3.camel@f16>
References: <1357783259.9103.5.camel@f16> <1358912538.5871.3.camel@f16>
To: Jonathan Brassow
Cc: linux-raid@vger.kernel.org, agk@redhat.com

On Tue, 22 Jan 2013 21:42:18 -0600 Jonathan Brassow wrote:

> Neil,
>
> Here is the RAID10 redundancy check fix backported for linux-3.7

Thanks.
So this goes into 3.7.y and bumps the dm-raid version to 1.3.2.
It should also go into 3.8-rc and bump the dm-raid version there to 1.4.1.
Then "Add support for MD's RAID10 "far" and "offset" algorithms", which is
due to go into 3.9-rc, should bump the dm-raid version to 1.4.2.

Is that right?

I've re-arranged the patches in my for-next to match this - please check
that I didn't mess anything up.  Then I'll send them off.

Thanks,
NeilBrown

>
> Thanks,
> brassow
>
>
> DM RAID: Fix RAID10's check for sufficient redundancy
>
> Before attempting to activate a RAID array, it is checked for sufficient
> redundancy.  That is, we make sure that there are not too many failed
> devices - or devices specified for rebuild - to undermine our ability to
> activate the array.  The current code performs this check twice - once to
> ensure there were not too many devices specified for rebuild by the user
> ('validate_rebuild_devices') and again after possibly experiencing a
> failure to read the superblock ('analyse_superblocks').  Neither of these
> checks is sufficient.  The first check is done properly, but with
> insufficient information about the possible failure state of the devices
> to make a good determination of whether the array can be activated.  The
> second check is simply done wrong in the case of RAID10, because it does
> not account for the independence of the stripes (i.e. mirror sets).  The
> solution is to use the properly written check ('validate_rebuild_devices'),
> but to perform it after the superblocks have been read and we know which
> devices have failed.  This gives us one check instead of two, and performs
> it in a location where it can be done right.
>
> Only RAID10 was affected, and it was affected in the following ways:
> - The code did not properly catch the condition where a user specified
>   a device for rebuild that already had a failed device in the same
>   mirror set.  (This condition would, however, be caught at a deeper
>   level in MD.)
> - The code triggered a false positive and denied activation when devices
>   in independent mirror sets had failed - counting the failures as though
>   they were all in the same set.
>
> The most likely place this error was introduced (or where this patch
> should have been included) is commit 4ec1e369 - first introduced in
> v3.7-rc1.
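
In case it helps review, the per-mirror-set counting the patch introduces
amounts to the following standalone sketch.  The names are illustrative
and a plain boolean "failed" array stands in for the kernel's !In_sync /
missing-superblock tests; this is not the kernel code itself:

    #include <stdbool.h>

    /*
     * RAID10 "near" layout: raid_disks devices, 'copies' copies of each
     * chunk.  Walking the stripe positions in groups of 'copies', the
     * array is viable only if no group consists entirely of failed
     * (or to-be-rebuilt) devices.
     */
    static bool raid10_near_viable(const bool *failed,
                                   int raid_disks, int copies)
    {
        int i, d, failed_in_group = 0;

        for (i = 0; i < raid_disks * copies; i++) {
            if (!(i % copies))          /* a new mirror set starts here */
                failed_in_group = 0;
            d = i % raid_disks;         /* device holding this copy */
            if (failed[d] && ++failed_in_group >= copies)
                return false;           /* every copy in this set is gone */
        }
        return true;
    }
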
>
> Signed-off-by: Jonathan Brassow
> Index: linux-upstream/Documentation/device-mapper/dm-raid.txt
> ===================================================================
> --- linux-upstream.orig/Documentation/device-mapper/dm-raid.txt
> +++ linux-upstream/Documentation/device-mapper/dm-raid.txt
> @@ -141,3 +141,4 @@ Version History
>  1.2.0	Handle creation of arrays that contain failed devices.
>  1.3.0	Added support for RAID 10
>  1.3.1	Allow device replacement/rebuild for RAID 10
> +1.3.2	Fix/improve redundancy checking for RAID10
> Index: linux-upstream/drivers/md/dm-raid.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/dm-raid.c
> +++ linux-upstream/drivers/md/dm-raid.c
> @@ -338,24 +338,22 @@ static int validate_region_size(struct r
>  }
>
>  /*
> - * validate_rebuild_devices
> + * validate_raid_redundancy
>   * @rs
>   *
> - * Determine if the devices specified for rebuild can result in a valid
> - * usable array that is capable of rebuilding the given devices.
> + * Determine if there are enough devices in the array that haven't
> + * failed (or are being rebuilt) to form a usable array.
>   *
>   * Returns: 0 on success, -EINVAL on failure.
>   */
> -static int validate_rebuild_devices(struct raid_set *rs)
> +static int validate_raid_redundancy(struct raid_set *rs)
>  {
>  	unsigned i, rebuild_cnt = 0;
>  	unsigned rebuilds_per_group, copies, d;
>
> -	if (!(rs->print_flags & DMPF_REBUILD))
> -		return 0;
> -
>  	for (i = 0; i < rs->md.raid_disks; i++)
> -		if (!test_bit(In_sync, &rs->dev[i].rdev.flags))
> +		if (!test_bit(In_sync, &rs->dev[i].rdev.flags) ||
> +		    !rs->dev[i].rdev.sb_page)
>  			rebuild_cnt++;
>
>  	switch (rs->raid_type->level) {
> @@ -391,27 +389,24 @@ static int validate_rebuild_devices(stru
>  	 *	    A    A    B    B    C
>  	 *	    C    D    D    E    E
>  	 */
> -		rebuilds_per_group = 0;
>  		for (i = 0; i < rs->md.raid_disks * copies; i++) {
> +			if (!(i % copies))
> +				rebuilds_per_group = 0;
>  			d = i % rs->md.raid_disks;
> -			if (!test_bit(In_sync, &rs->dev[d].rdev.flags) &&
> +			if ((!rs->dev[d].rdev.sb_page ||
> +			     !test_bit(In_sync, &rs->dev[d].rdev.flags)) &&
>  			    (++rebuilds_per_group >= copies))
>  				goto too_many;
> -			if (!((i + 1) % copies))
> -				rebuilds_per_group = 0;
>  		}
>  		break;
>  	default:
> -		DMERR("The rebuild parameter is not supported for %s",
> -		      rs->raid_type->name);
> -		rs->ti->error = "Rebuild not supported for this RAID type";
> -		return -EINVAL;
> +		if (rebuild_cnt)
> +			return -EINVAL;
>  	}
>
>  	return 0;
>
>  too_many:
> -	rs->ti->error = "Too many rebuild devices specified";
>  	return -EINVAL;
>  }
>
> @@ -662,9 +657,6 @@ static int parse_raid_params(struct raid
>  	}
>  	rs->md.dev_sectors = sectors_per_dev;
>
> -	if (validate_rebuild_devices(rs))
> -		return -EINVAL;
> -
>  	/* Assume there are no metadata devices until the drives are parsed */
>  	rs->md.persistent = 0;
>  	rs->md.external = 1;
> @@ -993,28 +985,10 @@ static int super_validate(struct mddev *
>  static int analyse_superblocks(struct dm_target *ti, struct raid_set *rs)
>  {
>  	int ret;
> -	unsigned redundancy = 0;
>  	struct raid_dev *dev;
>  	struct md_rdev *rdev, *tmp, *freshest;
>  	struct mddev *mddev = &rs->md;
>
> -	switch (rs->raid_type->level) {
> -	case 1:
> -		redundancy = rs->md.raid_disks - 1;
> -		break;
> -	case 4:
> -	case 5:
> -	case 6:
> -		redundancy = rs->raid_type->parity_devs;
> -		break;
> -	case 10:
> -		redundancy = raid10_md_layout_to_copies(mddev->layout) - 1;
> -		break;
> -	default:
> -		ti->error = "Unknown RAID type";
> -		return -EINVAL;
> -	}
> -
>  	freshest = NULL;
>  	rdev_for_each_safe(rdev, tmp, mddev) {
>  		/*
> @@ -1043,44 +1017,43 @@ static int analyse_superblocks(struct dm
>  			break;
>  		default:
>  			dev = container_of(rdev, struct raid_dev, rdev);
> -			if (redundancy--) {
> -				if (dev->meta_dev)
> -					dm_put_device(ti, dev->meta_dev);
> +			if (dev->meta_dev)
> +				dm_put_device(ti, dev->meta_dev);
>
> -				dev->meta_dev = NULL;
> -				rdev->meta_bdev = NULL;
> +			dev->meta_dev = NULL;
> +			rdev->meta_bdev = NULL;
>
> -				if (rdev->sb_page)
> -					put_page(rdev->sb_page);
> +			if (rdev->sb_page)
> +				put_page(rdev->sb_page);
>
> -				rdev->sb_page = NULL;
> +			rdev->sb_page = NULL;
>
> -				rdev->sb_loaded = 0;
> +			rdev->sb_loaded = 0;
>
> -				/*
> -				 * We might be able to salvage the data device
> -				 * even though the meta device has failed.  For
> -				 * now, we behave as though '- -' had been
> -				 * set for this device in the table.
> -				 */
> -				if (dev->data_dev)
> -					dm_put_device(ti, dev->data_dev);
> -
> -				dev->data_dev = NULL;
> -				rdev->bdev = NULL;
> +			/*
> +			 * We might be able to salvage the data device
> +			 * even though the meta device has failed.  For
> +			 * now, we behave as though '- -' had been
> +			 * set for this device in the table.
> +			 */
> +			if (dev->data_dev)
> +				dm_put_device(ti, dev->data_dev);
>
> -				list_del(&rdev->same_set);
> +			dev->data_dev = NULL;
> +			rdev->bdev = NULL;
>
> -				continue;
> -			}
> -			ti->error = "Failed to load superblock";
> -			return ret;
> +			list_del(&rdev->same_set);
>  		}
>  	}
>
>  	if (!freshest)
>  		return 0;
>
> +	if (validate_raid_redundancy(rs)) {
> +		rs->ti->error = "Insufficient redundancy to activate array";
> +		return -EINVAL;
> +	}
> +
>  	/*
>  	 * Validation of the freshest device provides the source of
>  	 * validation for the remaining devices.
> @@ -1430,7 +1403,7 @@ static void raid_resume(struct dm_target
>
>  static struct target_type raid_target = {
>  	.name = "raid",
> -	.version = {1, 3, 1},
> +	.version = {1, 3, 2},
>  	.module = THIS_MODULE,
>  	.ctr = raid_ctr,
>  	.dtr = raid_dtr,
>
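
For completeness, a hypothetical harness for the sketch earlier in this
mail, using the 5-disk, 2-copy "A A B B C / C D D E E" layout pictured in
the patch's comment.  Failures spread across independent mirror sets pass,
while two failures in one set are rejected; the old flat redundancy count
in analyse_superblocks() would have rejected both cases:

    #include <stdio.h>
    #include <stdbool.h>

    /* raid10_near_viable() as sketched earlier in this mail */

    int main(void)
    {
        /* disks 1 and 3 failed: every mirror set keeps a live copy */
        bool spread[5] = { false, true, false, true, false };
        /* disks 0 and 1 failed: the first mirror set is entirely gone */
        bool same[5]   = { true,  true, false, false, false };

        printf("independent sets: %s\n",
               raid10_near_viable(spread, 5, 2) ? "viable" : "rejected");
        printf("same mirror set:  %s\n",
               raid10_near_viable(same,   5, 2) ? "viable" : "rejected");
        return 0;
    }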