All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Song Liu <song@kernel.org>, Xiao Ni <xni@redhat.com>
Subject: [PATCH 5.10 44/53] raid5: introduce MD_BROKEN
Date: Fri,  3 Jun 2022 19:43:29 +0200	[thread overview]
Message-ID: <20220603173820.001200691@linuxfoundation.org> (raw)
In-Reply-To: <20220603173818.716010877@linuxfoundation.org>

From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

commit 57668f0a4cc4083a120cc8c517ca0055c4543b59 upstream.

Raid456 module had allowed to achieve failed state. It was fixed by
fb73b357fb9 ("raid5: block failing device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
analyze_stripe().
Allow to set faulty on drive again and set MD_BROKEN if raid is failed.

As a result, this level is allowed to achieve failed state again, but
communication with userspace (via -EBUSY status) will be preserved.

This restores possibility to fail array via #mdadm --set-faulty command
and will be fixed by additional verification on mdadm side.

Reproduction steps:
 mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1
 mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean
 mkfs.xfs /dev/md126 -f
 mount /dev/md126 /mnt/root/

 fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw
--bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4
--time_based --group_reporting --name=throughput-test-job
--eta-newline=1 &

 echo 1 > /sys/block/nvme2n1/device/device/remove
 echo 1 > /sys/block/nvme1n1/device/device/remove

 [ 1475.787779] Call Trace:
 [ 1475.793111] __schedule+0x2a6/0x700
 [ 1475.799460] schedule+0x38/0xa0
 [ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456]
 [ 1475.813856] ? finish_wait+0x80/0x80
 [ 1475.820332] raid5_make_request+0x180/0xb40 [raid456]
 [ 1475.828281] ? finish_wait+0x80/0x80
 [ 1475.834727] ? finish_wait+0x80/0x80
 [ 1475.841127] ? finish_wait+0x80/0x80
 [ 1475.847480] md_handle_request+0x119/0x190
 [ 1475.854390] md_make_request+0x8a/0x190
 [ 1475.861041] generic_make_request+0xcf/0x310
 [ 1475.868145] submit_bio+0x3c/0x160
 [ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60
 [ 1475.882070] iomap_dio_bio_actor+0x175/0x390
 [ 1475.889149] iomap_apply+0xff/0x310
 [ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.909974] iomap_dio_rw+0x2f2/0x490
 [ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390
 [ 1475.923680] ? atime_needs_update+0x77/0xe0
 [ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
 [ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
 [ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs]
 [ 1475.953403] aio_read+0xd5/0x180
 [ 1475.959395] ? _cond_resched+0x15/0x30
 [ 1475.965907] io_submit_one+0x20b/0x3c0
 [ 1475.972398] __x64_sys_io_submit+0xa2/0x180
 [ 1475.979335] ? do_io_getevents+0x7c/0xc0
 [ 1475.986009] do_syscall_64+0x5b/0x1a0
 [ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca
 [ 1476.000255] RIP: 0033:0x7f11fc27978d
 [ 1476.006631] Code: Bad RIP value.
 [ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds.

Cc: stable@vger.kernel.org
Fixes: fb73b357fb9 ("raid5: block failing device if raid will be failed")
Reviewd-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/raid5.c |   47 ++++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -686,17 +686,17 @@ int raid5_calc_degraded(struct r5conf *c
 	return degraded;
 }
 
-static int has_failed(struct r5conf *conf)
+static bool has_failed(struct r5conf *conf)
 {
-	int degraded;
+	int degraded = conf->mddev->degraded;
 
-	if (conf->mddev->reshape_position == MaxSector)
-		return conf->mddev->degraded > conf->max_degraded;
+	if (test_bit(MD_BROKEN, &conf->mddev->flags))
+		return true;
 
-	degraded = raid5_calc_degraded(conf);
-	if (degraded > conf->max_degraded)
-		return 1;
-	return 0;
+	if (conf->mddev->reshape_position != MaxSector)
+		degraded = raid5_calc_degraded(conf);
+
+	return degraded > conf->max_degraded;
 }
 
 struct stripe_head *
@@ -2877,34 +2877,31 @@ static void raid5_error(struct mddev *md
 	unsigned long flags;
 	pr_debug("raid456: error called\n");
 
+	pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n",
+		mdname(mddev), bdevname(rdev->bdev, b));
+
 	spin_lock_irqsave(&conf->device_lock, flags);
+	set_bit(Faulty, &rdev->flags);
+	clear_bit(In_sync, &rdev->flags);
+	mddev->degraded = raid5_calc_degraded(conf);
 
-	if (test_bit(In_sync, &rdev->flags) &&
-	    mddev->degraded == conf->max_degraded) {
-		/*
-		 * Don't allow to achieve failed state
-		 * Don't try to recover this device
-		 */
+	if (has_failed(conf)) {
+		set_bit(MD_BROKEN, &conf->mddev->flags);
 		conf->recovery_disabled = mddev->recovery_disabled;
-		spin_unlock_irqrestore(&conf->device_lock, flags);
-		return;
+
+		pr_crit("md/raid:%s: Cannot continue operation (%d/%d failed).\n",
+			mdname(mddev), mddev->degraded, conf->raid_disks);
+	} else {
+		pr_crit("md/raid:%s: Operation continuing on %d devices.\n",
+			mdname(mddev), conf->raid_disks - mddev->degraded);
 	}
 
-	set_bit(Faulty, &rdev->flags);
-	clear_bit(In_sync, &rdev->flags);
-	mddev->degraded = raid5_calc_degraded(conf);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 
 	set_bit(Blocked, &rdev->flags);
 	set_mask_bits(&mddev->sb_flags, 0,
 		      BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
-	pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n"
-		"md/raid:%s: Operation continuing on %d devices.\n",
-		mdname(mddev),
-		bdevname(rdev->bdev, b),
-		mdname(mddev),
-		conf->raid_disks - mddev->degraded);
 	r5c_update_on_rdev_error(mddev, rdev);
 }
 



  parent reply	other threads:[~2022-06-03 17:55 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-03 17:42 [PATCH 5.10 00/53] 5.10.120-rc1 review Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 01/53] pinctrl: sunxi: fix f1c100s uart2 function Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 02/53] percpu_ref_init(): clean ->percpu_count_ref on failure Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 03/53] net: af_key: check encryption module availability consistency Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 04/53] nfc: pn533: Fix buggy cleanup order Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 05/53] net: ftgmac100: Disable hardware checksum on AST2600 Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 06/53] i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 07/53] drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 08/53] netfilter: nf_tables: disallow non-stateful expression in sets earlier Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 09/53] pipe: make poll_usage boolean and annotate its access Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 10/53] pipe: Fix missing lock in pipe_resize_ring() Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 11/53] cfg80211: set custom regdomain after wiphy registration Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 12/53] assoc_array: Fix BUG_ON during garbage collect Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 13/53] io_uring: dont re-import iovecs from callbacks Greg Kroah-Hartman
2022-06-03 17:42 ` [PATCH 5.10 14/53] io_uring: fix using under-expanded iters Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 15/53] net: ipa: compute proper aggregation limit Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 16/53] xfs: detect overflows in bmbt records Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 17/53] xfs: show the proper user quota options Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 18/53] xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 19/53] xfs: fix an ABBA deadlock in xfs_rename Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 20/53] xfs: Fix CIL throttle hang when CIL space used going backwards Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 21/53] drm/i915: Fix -Wstringop-overflow warning in call to intel_read_wm_latency() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 22/53] exfat: check if cluster num is valid Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 23/53] exfat: fix referencing wrong parent directory information after renaming Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 24/53] lib/crypto: add prompts back to crypto libraries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 25/53] crypto: drbg - prepare for more fine-grained tracking of seeding state Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 26/53] crypto: drbg - track whether DRBG was seeded with !rng_is_initialized() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 27/53] crypto: drbg - move dynamic ->reseed_threshold adjustments to __drbg_seed() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 28/53] crypto: drbg - make reseeding from get_random_bytes() synchronous Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 29/53] netfilter: nf_tables: sanitize nft_set_desc_concat_parse() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 30/53] netfilter: conntrack: re-fetch conntrack after insertion Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 31/53] KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 32/53] x86/kvm: Alloc dummy async #PF token outside of raw spinlock Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 33/53] x86, kvm: use correct GFP flags for preemption disabled Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 34/53] KVM: x86: avoid calling x86 emulator without a decoded instruction Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 35/53] crypto: caam - fix i.MX6SX entropy delay value Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 36/53] crypto: ecrdsa - Fix incorrect use of vli_cmp Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 37/53] zsmalloc: fix races between asynchronous zspage free and page migration Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 38/53] Bluetooth: hci_qca: Use del_timer_sync() before freeing Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 39/53] ARM: dts: s5pv210: Correct interrupt name for bluetooth in Aries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 40/53] dm integrity: fix error code in dm_integrity_ctr() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 41/53] dm crypt: make printing of the key constant-time Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 42/53] dm stats: add cond_resched when looping over entries Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 43/53] dm verity: set DM_TARGET_IMMUTABLE feature flag Greg Kroah-Hartman
2022-06-03 17:43 ` Greg Kroah-Hartman [this message]
2022-06-03 17:43 ` [PATCH 5.10 45/53] HID: multitouch: Add support for Google Whiskers Touchpad Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 46/53] HID: multitouch: add quirks to enable Lenovo X12 trackpoint Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 47/53] tpm: Fix buffer access in tpm2_get_tpm_pt() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 48/53] tpm: ibmvtpm: Correct the return value in tpm_ibmvtpm_probe() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 49/53] docs: submitting-patches: Fix crossref to The canonical patch format Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 50/53] NFS: Memory allocation failures are not server fatal errors Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 51/53] NFSD: Fix possible sleep during nfsd4_release_lockowner() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 52/53] bpf: Fix potential array overflow in bpf_trampoline_get_progs() Greg Kroah-Hartman
2022-06-03 17:43 ` [PATCH 5.10 53/53] bpf: Enlarge offset check value to INT_MAX in bpf_skb_{load,store}_bytes Greg Kroah-Hartman
2022-06-04 12:24 ` [PATCH 5.10 00/53] 5.10.120-rc1 review Sudip Mukherjee
2022-06-04 17:23 ` Naresh Kamboju
2022-06-04 18:55 ` Guenter Roeck
2022-06-05  3:03 ` Fox Chen
2022-06-06  1:08 ` Samuel Zou
2022-06-06  8:43 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220603173820.001200691@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mariusz.tkaczyk@linux.intel.com \
    --cc=song@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=xni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.