public inbox for linux-block@vger.kernel.org
From: Michael Wang <yun.wang@profitbricks.com>
To: NeilBrown <neilb@suse.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-block@vger.kernel.org, linux-raid@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>, Shaohua Li <shli@kernel.org>,
	Jinpu Wang <jinpu.wang@profitbricks.com>
Subject: Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request
Date: Tue, 4 Apr 2017 14:48:19 +0200	[thread overview]
Message-ID: <04ef2050-cab0-27fa-8655-d56d2de0fc9b@profitbricks.com> (raw)
In-Reply-To: <d84a1dcf-6f60-d089-f81d-85df5a504c19@profitbricks.com>



On 04/04/2017 02:24 PM, Michael Wang wrote:
> On 04/04/2017 12:23 PM, Michael Wang wrote:
> [snip]
>>> add something like
>>>   if (wbio->bi_next)
>>>      printk("bi_next!= NULL i=%d read_disk=%d bi_end_io=%pf\n",
>>>           i, r1_bio->read_disk, wbio->bi_end_io);
>>>
>>> that might help narrow down what is happening.
>>
>> Just triggered it again on 4.4; the dmesg looks like:
>>
>> [  399.240230] md: super_written gets error=-5
>> [  399.240286] md: super_written gets error=-5
>> [  399.240286] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240300] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240312] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240323] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240334] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240341] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240349] md/raid1:md0: dm-0: unrecoverable I/O read error for block 204160
>> [  399.240352] bi_next!= NULL i=0 read_disk=0 bi_end_io=end_sync_write [raid1]
> 
> Is it possible that the failfast code, which changes 'bi_end_io' inside
> fix_sync_read_error(), lets the already-used bio pass the check?

Hi NeilBrown, the patch below fixed the issue in our testing. I'll post an md
RFC patch so we can continue the discussion there.

Regards,
Michael Wang

> 
> I'm not sure, but if the read bio is meant to be reused as a write for
> failfast, maybe we should reset it like this?
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 7d67235..0554110 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1986,11 +1986,13 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
>                 /* Don't try recovering from here - just fail it
>                  * ... unless it is the last working device of course */
>                 md_error(mddev, rdev);
> -               if (test_bit(Faulty, &rdev->flags))
> +               if (test_bit(Faulty, &rdev->flags)) {
>                         /* Don't try to read from here, but make sure
>                          * put_buf does it's thing
>                          */
>                         bio->bi_end_io = end_sync_write;
> +                       bio->bi_next = NULL;
> +               }
>         }
>  
>         while(sectors) {
> 
> Regards,
> Michael Wang
> 
> 
>> [  399.240363] ------------[ cut here ]------------
>> [  399.240364] kernel BUG at block/blk-core.c:2147!
>> [  399.240365] invalid opcode: 0000 [#1] SMP 
>> [  399.240378] Modules linked in: ib_srp scsi_transport_srp raid1 md_mod ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr ib_netlink iTCO_wdt iTCO_vendor_support dcdbas dell_smm_hwmon acpi_cpufreq x86_pkg_temp_thermal tpm_tis coretemp evdev tpm i2c_i801 crct10dif_pclmul serio_raw crc32_pclmul battery processor acpi_pad button kvm_intel kvm dm_round_robin irqbypass dm_multipath autofs4 sg sd_mod crc32c_intel ahci libahci psmouse libata mlx4_core scsi_mod xhci_pci xhci_hcd mlx_compat fan thermal [last unloaded: scsi_transport_srp]
>> [  399.240380] CPU: 1 PID: 2052 Comm: md0_raid1 Not tainted 4.4.50-1-pserver+ #26
>> [  399.240381] Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 1.3.6 05/26/2016
>> [  399.240381] task: ffff8804031b6200 ti: ffff8800d72b4000 task.ti: ffff8800d72b4000
>> [  399.240385] RIP: 0010:[<ffffffff813fcd9e>]  [<ffffffff813fcd9e>] generic_make_request+0x29e/0x2a0
>> [  399.240385] RSP: 0018:ffff8800d72b7d10  EFLAGS: 00010286
>> [  399.240386] RAX: ffff8804031b6200 RBX: ffff8800d2577e00 RCX: 000000003fffffff
>> [  399.240387] RDX: ffffffffc0000001 RSI: 0000000000000001 RDI: ffff8800d5e8c1e0
>> [  399.240387] RBP: ffff8800d72b7d50 R08: 0000000000000000 R09: 000000000000003f
>> [  399.240388] R10: 0000000000000004 R11: 00000000001db9ac R12: 00000000ffffffff
>> [  399.240388] R13: ffff8800d2748e00 R14: ffff88040a016400 R15: ffff8800d2748e40
>> [  399.240389] FS:  0000000000000000(0000) GS:ffff88041dc40000(0000) knlGS:0000000000000000
>> [  399.240390] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  399.240390] CR2: 00007fb49246a000 CR3: 000000040215c000 CR4: 00000000003406e0
>> [  399.240391] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [  399.240391] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [  399.240392] Stack:
>> [  399.240393]  ffff8800d72b7d18 ffff8800d72b7d30 0000000000000000 0000000000000000
>> [  399.240394]  ffffffffa079c290 ffff8800d2577e00 0000000000000000 ffff8800d2748e00
>> [  399.240395]  ffff8800d72b7e58 ffffffffa079e74c ffff88040b661c00 ffff8800d2577e00
>> [  399.240396] Call Trace:
>> [  399.240398]  [<ffffffffa079c290>] ? sync_request+0xb20/0xb20 [raid1]
>> [  399.240400]  [<ffffffffa079e74c>] raid1d+0x65c/0x1060 [raid1]
>> [  399.240403]  [<ffffffff810b6800>] ? trace_raw_output_itimer_expire+0x80/0x80
>> [  399.240407]  [<ffffffffa0772040>] md_thread+0x130/0x140 [md_mod]
>> [  399.240409]  [<ffffffff81094790>] ? wait_woken+0x80/0x80
>> [  399.240412]  [<ffffffffa0771f10>] ? find_pers+0x70/0x70 [md_mod]
>> [  399.240414]  [<ffffffff81075066>] kthread+0xd6/0xf0
>> [  399.240415]  [<ffffffff81074f90>] ? kthread_park+0x50/0x50
>> [  399.240417]  [<ffffffff8180411f>] ret_from_fork+0x3f/0x70
>> [  399.240418]  [<ffffffff81074f90>] ? kthread_park+0x50/0x50
>> [  399.240433] Code: 89 04 24 e9 2d ff ff ff 49 8d bd d8 07 00 00 f0 49 83 ad d8 07 00 00 01 74 05 e9 8b fe ff ff 41 ff 95 e8 07 00 00 e9 7f fe ff ff <0f> 0b 55 48 63 c7 48 89 e5 41 54 53 48 89 f3 48 83 ec 28 48 0b 
>> [  399.240434] RIP  [<ffffffff813fcd9e>] generic_make_request+0x29e/0x2a0
>> [  399.240435]  RSP <ffff8800d72b7d10>
>>
>>
>> Regards,
>> Michael Wang
>>
>>>
>>> NeilBrown
>>>


Thread overview: 8+ messages
2017-04-03 12:05 [RFC PATCH] blk: reset 'bi_next' when bio is done inside request Michael Wang
2017-04-03 21:25 ` NeilBrown
2017-04-04  8:13   ` Michael Wang
2017-04-04  9:37     ` NeilBrown
2017-04-04 10:23       ` Michael Wang
2017-04-04 12:24         ` Michael Wang
2017-04-04 12:48           ` Michael Wang [this message]
2017-04-04 21:52         ` NeilBrown
