From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
Song Liu <song@kernel.org>, Xiao Ni <xni@redhat.com>
Subject: [PATCH 5.17 53/75] raid5: introduce MD_BROKEN
Date: Fri, 3 Jun 2022 19:43:37 +0200 [thread overview]
Message-ID: <20220603173823.246262199@linuxfoundation.org> (raw)
In-Reply-To: <20220603173821.749019262@linuxfoundation.org>
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
commit 57668f0a4cc4083a120cc8c517ca0055c4543b59 upstream.
Raid456 module had allowed to achieve failed state. It was fixed by
fb73b357fb9 ("raid5: block failing device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
analyze_stripe().
Allow to set faulty on drive again and set MD_BROKEN if raid is failed.
As a result, this level is allowed to achieve failed state again, but
communication with userspace (via -EBUSY status) will be preserved.
This restores possibility to fail array via #mdadm --set-faulty command
and will be fixed by additional verification on mdadm side.
Reproduction steps:
mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1
mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean
mkfs.xfs /dev/md126 -f
mount /dev/md126 /mnt/root/
fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw
--bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4
--time_based --group_reporting --name=throughput-test-job
--eta-newline=1 &
echo 1 > /sys/block/nvme2n1/device/device/remove
echo 1 > /sys/block/nvme1n1/device/device/remove
[ 1475.787779] Call Trace:
[ 1475.793111] __schedule+0x2a6/0x700
[ 1475.799460] schedule+0x38/0xa0
[ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456]
[ 1475.813856] ? finish_wait+0x80/0x80
[ 1475.820332] raid5_make_request+0x180/0xb40 [raid456]
[ 1475.828281] ? finish_wait+0x80/0x80
[ 1475.834727] ? finish_wait+0x80/0x80
[ 1475.841127] ? finish_wait+0x80/0x80
[ 1475.847480] md_handle_request+0x119/0x190
[ 1475.854390] md_make_request+0x8a/0x190
[ 1475.861041] generic_make_request+0xcf/0x310
[ 1475.868145] submit_bio+0x3c/0x160
[ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60
[ 1475.882070] iomap_dio_bio_actor+0x175/0x390
[ 1475.889149] iomap_apply+0xff/0x310
[ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.909974] iomap_dio_rw+0x2f2/0x490
[ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.923680] ? atime_needs_update+0x77/0xe0
[ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs]
[ 1475.953403] aio_read+0xd5/0x180
[ 1475.959395] ? _cond_resched+0x15/0x30
[ 1475.965907] io_submit_one+0x20b/0x3c0
[ 1475.972398] __x64_sys_io_submit+0xa2/0x180
[ 1475.979335] ? do_io_getevents+0x7c/0xc0
[ 1475.986009] do_syscall_64+0x5b/0x1a0
[ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 1476.000255] RIP: 0033:0x7f11fc27978d
[ 1476.006631] Code: Bad RIP value.
[ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds.
Cc: stable@vger.kernel.org
Fixes: fb73b357fb9 ("raid5: block failing device if raid will be failed")
Reviewd-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/raid5.c | 47 ++++++++++++++++++++++-------------------------
1 file changed, 22 insertions(+), 25 deletions(-)
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -686,17 +686,17 @@ int raid5_calc_degraded(struct r5conf *c
return degraded;
}
-static int has_failed(struct r5conf *conf)
+static bool has_failed(struct r5conf *conf)
{
- int degraded;
+ int degraded = conf->mddev->degraded;
- if (conf->mddev->reshape_position == MaxSector)
- return conf->mddev->degraded > conf->max_degraded;
+ if (test_bit(MD_BROKEN, &conf->mddev->flags))
+ return true;
- degraded = raid5_calc_degraded(conf);
- if (degraded > conf->max_degraded)
- return 1;
- return 0;
+ if (conf->mddev->reshape_position != MaxSector)
+ degraded = raid5_calc_degraded(conf);
+
+ return degraded > conf->max_degraded;
}
struct stripe_head *
@@ -2877,34 +2877,31 @@ static void raid5_error(struct mddev *md
unsigned long flags;
pr_debug("raid456: error called\n");
+ pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n",
+ mdname(mddev), bdevname(rdev->bdev, b));
+
spin_lock_irqsave(&conf->device_lock, flags);
+ set_bit(Faulty, &rdev->flags);
+ clear_bit(In_sync, &rdev->flags);
+ mddev->degraded = raid5_calc_degraded(conf);
- if (test_bit(In_sync, &rdev->flags) &&
- mddev->degraded == conf->max_degraded) {
- /*
- * Don't allow to achieve failed state
- * Don't try to recover this device
- */
+ if (has_failed(conf)) {
+ set_bit(MD_BROKEN, &conf->mddev->flags);
conf->recovery_disabled = mddev->recovery_disabled;
- spin_unlock_irqrestore(&conf->device_lock, flags);
- return;
+
+ pr_crit("md/raid:%s: Cannot continue operation (%d/%d failed).\n",
+ mdname(mddev), mddev->degraded, conf->raid_disks);
+ } else {
+ pr_crit("md/raid:%s: Operation continuing on %d devices.\n",
+ mdname(mddev), conf->raid_disks - mddev->degraded);
}
- set_bit(Faulty, &rdev->flags);
- clear_bit(In_sync, &rdev->flags);
- mddev->degraded = raid5_calc_degraded(conf);
spin_unlock_irqrestore(&conf->device_lock, flags);
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
set_bit(Blocked, &rdev->flags);
set_mask_bits(&mddev->sb_flags, 0,
BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
- pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n"
- "md/raid:%s: Operation continuing on %d devices.\n",
- mdname(mddev),
- bdevname(rdev->bdev, b),
- mdname(mddev),
- conf->raid_disks - mddev->degraded);
r5c_update_on_rdev_error(mddev, rdev);
}
next prev parent reply other threads:[~2022-06-03 17:58 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-03 17:42 [PATCH 5.17 00/75] 5.17.13-rc1 review Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 01/75] ALSA: usb-audio: Dont get sample rate for MCT Trigger 5 USB-to-HDMI Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 02/75] ALSA: hda/realtek: Add quirk for Dell Latitude 7520 Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 03/75] ALSA: hda/realtek: Add quirk for the Framework Laptop Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 04/75] pinctrl: sunxi: fix f1c100s uart2 function Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 05/75] KVM: arm64: Dont hypercall before EL2 init Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 06/75] percpu_ref_init(): clean ->percpu_count_ref on failure Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 07/75] net: af_key: check encryption module availability consistency Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 08/75] nfc: pn533: Fix buggy cleanup order Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 09/75] net: ftgmac100: Disable hardware checksum on AST2600 Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 10/75] i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 11/75] drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 12/75] netfilter: nf_tables: disallow non-stateful expression in sets earlier Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 13/75] i2c: ismt: prevent memory corruption in ismt_access() Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 14/75] assoc_array: Fix BUG_ON during garbage collect Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.17 15/75] pipe: make poll_usage boolean and annotate its access Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 16/75] pipe: Fix missing lock in pipe_resize_ring() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 17/75] net: ipa: compute proper aggregation limit Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 18/75] drm/i915: Fix -Wstringop-overflow warning in call to intel_read_wm_latency() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 19/75] exfat: check if cluster num is valid Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 20/75] exfat: fix referencing wrong parent directory information after renaming Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 21/75] netfilter: nft_limit: Clone packet limits cost value Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 22/75] netfilter: nf_tables: sanitize nft_set_desc_concat_parse() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 23/75] netfilter: nf_tables: hold mutex on netns pre_exit path Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 24/75] netfilter: nf_tables: double hook unregistration in netns path Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 25/75] netfilter: conntrack: re-fetch conntrack after insertion Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 26/75] KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 27/75] x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 28/75] x86/kvm: Alloc dummy async #PF token outside of raw spinlock Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 29/75] x86, kvm: use correct GFP flags for preemption disabled Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 30/75] x86/uaccess: Implement macros for CMPXCHG on user addresses Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 31/75] KVM: x86: Use __try_cmpxchg_user() to update guest PTE A/D bits Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 32/75] KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 33/75] KVM: x86: fix typo in __try_cmpxchg_user causing non-atomicness Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 34/75] KVM: x86: avoid calling x86 emulator without a decoded instruction Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 35/75] KVM: x86: avoid loading a vCPU after .vm_destroy was called Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 36/75] KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 37/75] KVM: x86: Drop WARNs that assert a triple fault never "escapes" from L2 Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 38/75] KVM: x86/mmu: Dont rebuild page when the page is synced and no tlb flushing is required Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 39/75] KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 40/75] crypto: caam - fix i.MX6SX entropy delay value Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 41/75] crypto: ecrdsa - Fix incorrect use of vli_cmp Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 42/75] zsmalloc: fix races between asynchronous zspage free and page migration Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 43/75] tools/memory-model/README: Update klitmus7 compat table Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 44/75] ALSA: usb-audio: Workaround for clock setup on TEAC devices Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 45/75] ALSA: usb-audio: Add missing ep_idx in fixed EP quirks Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 46/75] ALSA: usb-audio: Configure sync endpoints before data Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 47/75] Bluetooth: hci_qca: Use del_timer_sync() before freeing Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 48/75] ARM: dts: s5pv210: Correct interrupt name for bluetooth in Aries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 49/75] dm integrity: fix error code in dm_integrity_ctr() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 50/75] dm crypt: make printing of the key constant-time Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 51/75] dm stats: add cond_resched when looping over entries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 52/75] dm verity: set DM_TARGET_IMMUTABLE feature flag Greg Kroah-Hartman
2022-06-03 17:43 ` Greg Kroah-Hartman [this message]
2022-06-03 17:43 ` [PATCH 5.17 54/75] fs/ntfs3: validate BOOT sectors_per_clusters Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 55/75] HID: multitouch: Add support for Google Whiskers Touchpad Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 56/75] HID: multitouch: add quirks to enable Lenovo X12 trackpoint Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 57/75] x86/sgx: Disconnect backing page references from dirty status Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 58/75] x86/sgx: Mark PCMD page as dirty when modifying contents Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 59/75] x86/sgx: Obtain backing storage page with enclave mutex held Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 60/75] x86/sgx: Fix race between reclaimer and page fault handler Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 61/75] x86/sgx: Ensure no data in PCMD page after truncate Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 62/75] media: i2c: imx412: Fix reset GPIO polarity Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 63/75] media: i2c: imx412: Fix power_off ordering Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 64/75] tpm: Fix buffer access in tpm2_get_tpm_pt() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 65/75] tpm: ibmvtpm: Correct the return value in tpm_ibmvtpm_probe() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 66/75] docs: submitting-patches: Fix crossref to The canonical patch format Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 67/75] NFS: Memory allocation failures are not server fatal errors Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 68/75] NFSD: Fix possible sleep during nfsd4_release_lockowner() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 69/75] bpf: Fix potential array overflow in bpf_trampoline_get_progs() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 70/75] bpf: Fix combination of jit blinding and pointers to bpf subprogs Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 71/75] bpf: Enlarge offset check value to INT_MAX in bpf_skb_{load,store}_bytes Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 72/75] bpf: Fix usage of trace RCU in local storage Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 73/75] bpf: Fix excessive memory allocation in stack_map_alloc() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 74/75] bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.17 75/75] bpf: Check PTR_TO_MEM | MEM_RDONLY " Greg Kroah-Hartman
2022-06-03 22:20 ` [PATCH 5.17 00/75] 5.17.13-rc1 review Fox Chen
2022-06-03 22:57 ` Justin Forbes
2022-06-04 12:48 ` Sudip Mukherjee
2022-06-04 16:40 ` Naresh Kamboju
2022-06-04 18:55 ` Guenter Roeck
2022-06-04 19:32 ` Ron Economos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220603173823.246262199@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mariusz.tkaczyk@linux.intel.com \
--cc=song@kernel.org \
--cc=stable@vger.kernel.org \
--cc=xni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox