From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Xiao Ni <xni@redhat.com>,
David Jeffery <djeffery@redhat.com>,
Nigel Croxon <ncroxon@redhat.com>,
Song Liu <songliubraving@fb.com>, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 4.9 33/51] Dont jump to compute_result state from check_result state
Date: Wed, 15 May 2019 12:56:08 +0200 [thread overview]
Message-ID: <20190515090626.261933700@linuxfoundation.org> (raw)
In-Reply-To: <20190515090616.669619870@linuxfoundation.org>
From: Nigel Croxon <ncroxon@redhat.com>
commit 4f4fd7c5798bbdd5a03a60f6269cf1177fbd11ef upstream.
Changing state from check_state_check_result to
check_state_compute_result not only is unsafe but also doesn't
appear to serve a valid purpose. A raid6 check should only be
pushing out extra writes if doing repair and a mis-match occurs.
The stripe dev management will already try and do repair writes
for failing sectors.
This patch makes the raid6 check_state_check_result handling
work more like raid5's. If somehow too many failures for a
check, just quit the check operation for the stripe. When any
checks pass, don't try and use check_state_compute_result for
a purpose it isn't needed for and is unsafe for. Just mark the
stripe as in sync for passing its parity checks and let the
stripe dev read/write code and the bad blocks list do their
job handling I/O errors.
Repro steps from Xiao:
These are the steps to reproduce this problem:
1. redefined OPT_MEDIUM_ERR_ADDR to 12000 in scsi_debug.c
2. insmod scsi_debug.ko dev_size_mb=11000 max_luns=1 num_tgts=1
3. mdadm --create /dev/md127 --level=6 --raid-devices=5 /dev/sde1 /dev/sde2 /dev/sde3 /dev/sde5 /dev/sde6
sde is the disk created by scsi_debug
4. echo "2" >/sys/module/scsi_debug/parameters/opts
5. raid-check
It panic:
[ 4854.730899] md: data-check of RAID array md127
[ 4854.857455] sd 5:0:0:0: [sdr] tag#80 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 4854.859246] sd 5:0:0:0: [sdr] tag#80 Sense Key : Medium Error [current]
[ 4854.860694] sd 5:0:0:0: [sdr] tag#80 Add. Sense: Unrecovered read error
[ 4854.862207] sd 5:0:0:0: [sdr] tag#80 CDB: Read(10) 28 00 00 00 2d 88 00 04 00 00
[ 4854.864196] print_req_error: critical medium error, dev sdr, sector 11656 flags 0
[ 4854.867409] sd 5:0:0:0: [sdr] tag#100 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 4854.869469] sd 5:0:0:0: [sdr] tag#100 Sense Key : Medium Error [current]
[ 4854.871206] sd 5:0:0:0: [sdr] tag#100 Add. Sense: Unrecovered read error
[ 4854.872858] sd 5:0:0:0: [sdr] tag#100 CDB: Read(10) 28 00 00 00 2e e0 00 00 08 00
[ 4854.874587] print_req_error: critical medium error, dev sdr, sector 12000 flags 4000
[ 4854.876456] sd 5:0:0:0: [sdr] tag#101 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 4854.878552] sd 5:0:0:0: [sdr] tag#101 Sense Key : Medium Error [current]
[ 4854.880278] sd 5:0:0:0: [sdr] tag#101 Add. Sense: Unrecovered read error
[ 4854.881846] sd 5:0:0:0: [sdr] tag#101 CDB: Read(10) 28 00 00 00 2e e8 00 00 08 00
[ 4854.883691] print_req_error: critical medium error, dev sdr, sector 12008 flags 4000
[ 4854.893927] sd 5:0:0:0: [sdr] tag#166 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 4854.896002] sd 5:0:0:0: [sdr] tag#166 Sense Key : Medium Error [current]
[ 4854.897561] sd 5:0:0:0: [sdr] tag#166 Add. Sense: Unrecovered read error
[ 4854.899110] sd 5:0:0:0: [sdr] tag#166 CDB: Read(10) 28 00 00 00 2e e0 00 00 10 00
[ 4854.900989] print_req_error: critical medium error, dev sdr, sector 12000 flags 0
[ 4854.902757] md/raid:md127: read error NOT corrected!! (sector 9952 on sdr1).
[ 4854.904375] md/raid:md127: read error NOT corrected!! (sector 9960 on sdr1).
[ 4854.906201] ------------[ cut here ]------------
[ 4854.907341] kernel BUG at drivers/md/raid5.c:4190!
raid5.c:4190 above is this BUG_ON:
handle_parity_checks6()
...
BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */
Cc: <stable@vger.kernel.org> # v3.16+
OriginalAuthor: David Jeffery <djeffery@redhat.com>
Cc: Xiao Ni <xni@redhat.com>
Tested-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: David Jeffy <djeffery@redhat.com>
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/raid5.c | 19 ++++---------------
1 file changed, 4 insertions(+), 15 deletions(-)
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3914,26 +3914,15 @@ static void handle_parity_checks6(struct
case check_state_check_result:
sh->check_state = check_state_idle;
+ if (s->failed > 1)
+ break;
/* handle a successful check operation, if parity is correct
* we are done. Otherwise update the mismatch count and repair
* parity if !MD_RECOVERY_CHECK
*/
if (sh->ops.zero_sum_result == 0) {
- /* both parities are correct */
- if (!s->failed)
- set_bit(STRIPE_INSYNC, &sh->state);
- else {
- /* in contrast to the raid5 case we can validate
- * parity, but still have a failure to write
- * back
- */
- sh->check_state = check_state_compute_result;
- /* Returning at this point means that we may go
- * off and bring p and/or q uptodate again so
- * we make sure to check zero_sum_result again
- * to verify if p or q need writeback
- */
- }
+ /* Any parity checked was correct */
+ set_bit(STRIPE_INSYNC, &sh->state);
} else {
atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches);
if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
next prev parent reply other threads:[~2019-05-15 11:15 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-15 10:55 [PATCH 4.9 00/51] 4.9.177-stable review Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 01/51] netfilter: compat: initialize all fields in xt_init Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 02/51] bpf: fix struct htab_elem layout Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 03/51] bpf: convert htab map to hlist_nulls Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 04/51] platform/x86: sony-laptop: Fix unintentional fall-through Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 05/51] USB: serial: fix unthrottle races Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 06/51] iio: adc: xilinx: fix potential use-after-free on remove Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 07/51] libnvdimm/namespace: Fix a potential NULL pointer dereference Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 08/51] HID: input: add mapping for Expose/Overview key Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 09/51] HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 10/51] HID: input: add mapping for "Toggle Display" key Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 11/51] libnvdimm/btt: Fix a kmemdup failure check Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 12/51] s390/dasd: Fix capacity calculation for large volumes Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 13/51] mac80211: fix unaligned access in mesh table hash function Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 14/51] s390/3270: fix lockdep false positive on view->lock Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 15/51] mISDN: Check address length before reading address family Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 16/51] x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 17/51] KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 18/51] tools lib traceevent: Fix missing equality check for strcmp Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 19/51] init: initialize jump labels before command line option parsing Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 20/51] selftests: netfilter: check icmp pkttoobig errors are set as related Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 21/51] ipvs: do not schedule icmp errors from tunnels Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 22/51] MIPS: perf: ath79: Fix perfcount IRQ assignment Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 23/51] s390: ctcm: fix ctcm_new_device error return code Greg Kroah-Hartman
2019-05-15 10:55 ` [PATCH 4.9 24/51] drm/sun4i: Set device driver data at bind time for use in unbind Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 25/51] selftests/net: correct the return value for run_netsocktests Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 26/51] gpu: ipu-v3: dp: fix CSC handling Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 27/51] spi: Micrel eth switch: declare missing of table Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 28/51] spi: ST ST95HF NFC: " Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 29/51] Input: synaptics-rmi4 - fix possible double free Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 30/51] cw1200: fix missing unlock on error in cw1200_hw_scan() Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 31/51] ALSA: pcm: remove SNDRV_PCM_IOCTL1_INFO internal command Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 32/51] rtlwifi: rtl8723ae: Fix missing break in switch statement Greg Kroah-Hartman
2019-05-15 10:56 ` Greg Kroah-Hartman [this message]
2019-05-15 10:56 ` [PATCH 4.9 34/51] Revert "x86/vdso: Drop implicit common-page-size linker flag" Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 35/51] Revert "x86: vdso: Use $LD instead of $CC to link" Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 36/51] x86: vdso: Use $LD instead of $CC to link Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 37/51] x86/vdso: Drop implicit common-page-size linker flag Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 38/51] x86/vdso: Pass --eh-frame-hdr to the linker Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 39/51] powerpc/64s: Include cpu header Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 40/51] bridge: Fix error path for kobject_init_and_add() Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 41/51] fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied Greg Kroah-Hartman
2019-05-19 15:43 ` Nathan Chancellor
2019-05-19 20:27 ` Florian Westphal
2019-05-20 2:00 ` Hangbin Liu
2019-05-20 0:11 ` Sasha Levin
2019-05-20 0:29 ` David Ahern
2019-05-20 9:04 ` Greg Kroah-Hartman
2019-05-20 9:11 ` Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 42/51] net: ucc_geth - fix Oops when changing number of buffers in the ring Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 43/51] packet: Fix error path in packet_init Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 44/51] vlan: disable SIOCSHWTSTAMP in container Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 45/51] vrf: sit mtu should not be updated when vrf netdev is the link Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 46/51] ipv4: Fix raw socket lookup for local traffic Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 47/51] bonding: fix arp_validate toggling in active-backup mode Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 48/51] drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 49/51] drivers/virt/fsl_hypervisor.c: prevent integer overflow " Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 50/51] powerpc/lib: fix book3s/32 boot failure due to code patching Greg Kroah-Hartman
2019-05-15 10:56 ` [PATCH 4.9 51/51] powerpc/booke64: set RI in default MSR Greg Kroah-Hartman
2019-05-15 18:27 ` [PATCH 4.9 00/51] 4.9.177-stable review kernelci.org bot
2019-05-15 19:54 ` Naresh Kamboju
2019-05-16 3:34 ` Guenter Roeck
2019-05-16 11:00 ` Jon Hunter
2019-05-16 11:00 ` Jon Hunter
2019-05-16 14:09 ` shuah
2019-05-17 6:33 ` Kelsey Skunberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190515090626.261933700@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=axboe@kernel.dk \
--cc=djeffery@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=ncroxon@redhat.com \
--cc=songliubraving@fb.com \
--cc=stable@vger.kernel.org \
--cc=xni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.