From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-by2nam03on0137.outbound.protection.outlook.com ([104.47.42.137]:8724 "EHLO NAM03-BY2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932730AbeARVHP (ORCPT ); Thu, 18 Jan 2018 16:07:15 -0500 From: Sasha Levin To: "stable@vger.kernel.org" , "stable-commits@vger.kernel.org" CC: NeilBrown , Shaohua Li , Sasha Levin Subject: [added to the 4.1 stable tree] raid5: Set R5_Expanded on parity devices as well as data. Date: Thu, 18 Jan 2018 21:01:54 +0000 Message-ID: <20180118205908.3220-281-alexander.levin@microsoft.com> References: <20180118205908.3220-1-alexander.levin@microsoft.com> In-Reply-To: <20180118205908.3220-1-alexander.levin@microsoft.com> Content-Language: en-US Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org List-ID: From: NeilBrown This patch has been added to the stable tree. If you have any objections, please let us know. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D [ Upstream commit 235b6003fb28f0dd8e7ed8fbdb088bb548291766 ] When reshaping a fully degraded raid5/raid6 to a larger nubmer of devices, the new device(s) are not in-sync and so that can make the newly grown stripe appear to be "failed". To avoid this, we set the R5_Expanded flag to say "Even though this device is not fully in-sync, this block is safe so don't treat the device as failed for this stripe". This flag is set for data devices, not not for parity devices. Consequently, if you have a RAID6 with two devices that are partly recovered and a spare, and start a reshape to include the spare, then when the reshape gets past the point where the recovery was up to, it will think the stripes are failed and will get into an infinite loop, failing to make progress. So when contructing parity on an EXPAND_READY stripe, set R5_Expanded. Reported-by: Curt Signed-off-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin --- drivers/md/raid5.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 02e6d335f178..907aa9c6e894 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1679,8 +1679,11 @@ static void ops_complete_reconstruct(void *stripe_he= ad_ref) struct r5dev *dev =3D &sh->dev[i]; =20 if (dev->written || i =3D=3D pd_idx || i =3D=3D qd_idx) { - if (!discard && !test_bit(R5_SkipCopy, &dev->flags)) + if (!discard && !test_bit(R5_SkipCopy, &dev->flags)) { set_bit(R5_UPTODATE, &dev->flags); + if (test_bit(STRIPE_EXPAND_READY, &sh->state)) + set_bit(R5_Expanded, &dev->flags); + } if (fua) set_bit(R5_WantFUA, &dev->flags); if (sync) --=20 2.11.0