From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
Song Liu <song@kernel.org>, Xiao Ni <xni@redhat.com>
Subject: [PATCH 5.4 27/34] raid5: introduce MD_BROKEN
Date: Fri, 3 Jun 2022 19:43:23 +0200 [thread overview]
Message-ID: <20220603173816.972587698@linuxfoundation.org> (raw)
In-Reply-To: <20220603173815.990072516@linuxfoundation.org>
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
commit 57668f0a4cc4083a120cc8c517ca0055c4543b59 upstream.
Raid456 module had allowed to achieve failed state. It was fixed by
fb73b357fb9 ("raid5: block failing device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
analyze_stripe().
Allow to set faulty on drive again and set MD_BROKEN if raid is failed.
As a result, this level is allowed to achieve failed state again, but
communication with userspace (via -EBUSY status) will be preserved.
This restores possibility to fail array via #mdadm --set-faulty command
and will be fixed by additional verification on mdadm side.
Reproduction steps:
mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1
mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean
mkfs.xfs /dev/md126 -f
mount /dev/md126 /mnt/root/
fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw
--bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4
--time_based --group_reporting --name=throughput-test-job
--eta-newline=1 &
echo 1 > /sys/block/nvme2n1/device/device/remove
echo 1 > /sys/block/nvme1n1/device/device/remove
[ 1475.787779] Call Trace:
[ 1475.793111] __schedule+0x2a6/0x700
[ 1475.799460] schedule+0x38/0xa0
[ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456]
[ 1475.813856] ? finish_wait+0x80/0x80
[ 1475.820332] raid5_make_request+0x180/0xb40 [raid456]
[ 1475.828281] ? finish_wait+0x80/0x80
[ 1475.834727] ? finish_wait+0x80/0x80
[ 1475.841127] ? finish_wait+0x80/0x80
[ 1475.847480] md_handle_request+0x119/0x190
[ 1475.854390] md_make_request+0x8a/0x190
[ 1475.861041] generic_make_request+0xcf/0x310
[ 1475.868145] submit_bio+0x3c/0x160
[ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60
[ 1475.882070] iomap_dio_bio_actor+0x175/0x390
[ 1475.889149] iomap_apply+0xff/0x310
[ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.909974] iomap_dio_rw+0x2f2/0x490
[ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.923680] ? atime_needs_update+0x77/0xe0
[ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs]
[ 1475.953403] aio_read+0xd5/0x180
[ 1475.959395] ? _cond_resched+0x15/0x30
[ 1475.965907] io_submit_one+0x20b/0x3c0
[ 1475.972398] __x64_sys_io_submit+0xa2/0x180
[ 1475.979335] ? do_io_getevents+0x7c/0xc0
[ 1475.986009] do_syscall_64+0x5b/0x1a0
[ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 1476.000255] RIP: 0033:0x7f11fc27978d
[ 1476.006631] Code: Bad RIP value.
[ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds.
Cc: stable@vger.kernel.org
Fixes: fb73b357fb9 ("raid5: block failing device if raid will be failed")
Reviewd-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/raid5.c | 47 ++++++++++++++++++++++-------------------------
1 file changed, 22 insertions(+), 25 deletions(-)
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -609,17 +609,17 @@ int raid5_calc_degraded(struct r5conf *c
return degraded;
}
-static int has_failed(struct r5conf *conf)
+static bool has_failed(struct r5conf *conf)
{
- int degraded;
+ int degraded = conf->mddev->degraded;
- if (conf->mddev->reshape_position == MaxSector)
- return conf->mddev->degraded > conf->max_degraded;
+ if (test_bit(MD_BROKEN, &conf->mddev->flags))
+ return true;
- degraded = raid5_calc_degraded(conf);
- if (degraded > conf->max_degraded)
- return 1;
- return 0;
+ if (conf->mddev->reshape_position != MaxSector)
+ degraded = raid5_calc_degraded(conf);
+
+ return degraded > conf->max_degraded;
}
struct stripe_head *
@@ -2679,34 +2679,31 @@ static void raid5_error(struct mddev *md
unsigned long flags;
pr_debug("raid456: error called\n");
+ pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n",
+ mdname(mddev), bdevname(rdev->bdev, b));
+
spin_lock_irqsave(&conf->device_lock, flags);
+ set_bit(Faulty, &rdev->flags);
+ clear_bit(In_sync, &rdev->flags);
+ mddev->degraded = raid5_calc_degraded(conf);
- if (test_bit(In_sync, &rdev->flags) &&
- mddev->degraded == conf->max_degraded) {
- /*
- * Don't allow to achieve failed state
- * Don't try to recover this device
- */
+ if (has_failed(conf)) {
+ set_bit(MD_BROKEN, &conf->mddev->flags);
conf->recovery_disabled = mddev->recovery_disabled;
- spin_unlock_irqrestore(&conf->device_lock, flags);
- return;
+
+ pr_crit("md/raid:%s: Cannot continue operation (%d/%d failed).\n",
+ mdname(mddev), mddev->degraded, conf->raid_disks);
+ } else {
+ pr_crit("md/raid:%s: Operation continuing on %d devices.\n",
+ mdname(mddev), conf->raid_disks - mddev->degraded);
}
- set_bit(Faulty, &rdev->flags);
- clear_bit(In_sync, &rdev->flags);
- mddev->degraded = raid5_calc_degraded(conf);
spin_unlock_irqrestore(&conf->device_lock, flags);
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
set_bit(Blocked, &rdev->flags);
set_mask_bits(&mddev->sb_flags, 0,
BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
- pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n"
- "md/raid:%s: Operation continuing on %d devices.\n",
- mdname(mddev),
- bdevname(rdev->bdev, b),
- mdname(mddev),
- conf->raid_disks - mddev->degraded);
r5c_update_on_rdev_error(mddev, rdev);
}
next prev parent reply other threads:[~2022-06-03 17:48 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-03 17:42 [PATCH 5.4 00/34] 5.4.197-rc1 review Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.4 01/34] lockdown: also lock down previous kgdb use Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.4 02/34] x86/pci/xen: Disable PCI/MSI[-X] masking for XEN_HVM guests Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.4 03/34] staging: rtl8723bs: prevent ->Ssid overflow in rtw_wx_set_scan() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 04/34] Input: goodix - fix spurious key release events Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 05/34] tcp: change source port randomizarion at connect() time Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 06/34] secure_seq: use the 64 bits of the siphash for port offset calculation Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 07/34] media: vim2m: Register video device after setting up internals Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 08/34] media: vim2m: initialize the media device earlier Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 09/34] ACPI: sysfs: Make sparse happy about address space in use Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 10/34] ACPI: sysfs: Fix BERT error region memory mapping Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 11/34] pinctrl: sunxi: fix f1c100s uart2 function Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 12/34] net: af_key: check encryption module availability consistency Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 13/34] net: ftgmac100: Disable hardware checksum on AST2600 Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 14/34] i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 15/34] drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 16/34] assoc_array: Fix BUG_ON during garbage collect Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 17/34] cfg80211: set custom regdomain after wiphy registration Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 18/34] drm/i915: Fix -Wstringop-overflow warning in call to intel_read_wm_latency() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 19/34] exec: Force single empty string when argv is empty Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 20/34] netfilter: conntrack: re-fetch conntrack after insertion Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 21/34] crypto: ecrdsa - Fix incorrect use of vli_cmp Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 22/34] zsmalloc: fix races between asynchronous zspage free and page migration Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 23/34] dm integrity: fix error code in dm_integrity_ctr() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 24/34] dm crypt: make printing of the key constant-time Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 25/34] dm stats: add cond_resched when looping over entries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 26/34] dm verity: set DM_TARGET_IMMUTABLE feature flag Greg Kroah-Hartman
2022-06-10 4:22 ` Oleksandr Tymoshenko
2022-06-10 5:15 ` Greg KH
2022-06-10 8:10 ` Oleksandr Tymoshenko
2022-06-10 15:11 ` [dm-devel] " Mike Snitzer
2022-06-10 15:11 ` Mike Snitzer
2022-06-13 9:13 ` [dm-devel] " Greg KH
2022-06-13 9:13 ` Greg KH
2022-06-15 14:36 ` [dm-devel] " Guenter Roeck
2022-06-15 14:36 ` Guenter Roeck
2022-06-15 15:29 ` [dm-devel] " Mike Snitzer
2022-06-15 15:29 ` Mike Snitzer
2022-06-15 17:50 ` [dm-devel] " Guenter Roeck
2022-06-15 17:50 ` Guenter Roeck
2022-06-15 20:02 ` [dm-devel] " Mike Snitzer
2022-06-15 20:02 ` Mike Snitzer
2022-06-15 20:40 ` [dm-devel] " Guenter Roeck
2022-06-15 20:40 ` Guenter Roeck
2022-06-15 23:59 ` [dm-devel] " Guenter Roeck
2022-06-15 23:59 ` Guenter Roeck
2022-06-16 23:22 ` [dm-devel] " Guenter Roeck
2022-06-16 23:22 ` Guenter Roeck
2022-06-20 11:44 ` [dm-devel] " Greg KH
2022-06-20 11:44 ` Greg KH
2022-06-21 16:35 ` [dm-devel] [5.4.y PATCH v2] dm: remove special-casing of bio-based immutable singleton target on NVMe Mike Snitzer
2022-06-21 16:35 ` Mike Snitzer
2022-06-23 15:48 ` [dm-devel] " Greg KH
2022-06-23 15:48 ` Greg KH
2022-06-23 16:00 ` [dm-devel] Patch "dm: remove special-casing of bio-based immutable singleton target on NVMe" has been added to the 5.4-stable tree gregkh
2022-06-23 16:00 ` gregkh
2022-06-03 17:43 ` Greg Kroah-Hartman [this message]
2022-06-03 17:43 ` [PATCH 5.4 28/34] HID: multitouch: Add support for Google Whiskers Touchpad Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 29/34] tpm: Fix buffer access in tpm2_get_tpm_pt() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 30/34] tpm: ibmvtpm: Correct the return value in tpm_ibmvtpm_probe() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 31/34] docs: submitting-patches: Fix crossref to The canonical patch format Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 32/34] NFS: Memory allocation failures are not server fatal errors Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 33/34] NFSD: Fix possible sleep during nfsd4_release_lockowner() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.4 34/34] bpf: Enlarge offset check value to INT_MAX in bpf_skb_{load,store}_bytes Greg Kroah-Hartman
2022-06-04 12:21 ` [PATCH 5.4 00/34] 5.4.197-rc1 review Sudip Mukherjee
2022-06-04 17:31 ` Naresh Kamboju
2022-06-04 18:54 ` Guenter Roeck
2022-06-06 1:08 ` Samuel Zou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220603173816.972587698@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mariusz.tkaczyk@linux.intel.com \
--cc=song@kernel.org \
--cc=stable@vger.kernel.org \
--cc=xni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.