linux-block.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.com>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Konstantin Khlebnikov <koct9i@gmail.com>,
	Shaohua Li <shli@kernel.org>
Cc: "linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-raid@vger.kernel.org, linux-block@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>
Subject: Re: [BUG 4.4.26] bio->bi_bdev == NULL in raid6 return_io()
Date: Mon, 21 Nov 2016 12:23:10 +1100
Message-ID: <87r365eidd.fsf@notabene.neil.brown.name>
In-Reply-To: <c7f3d4e3-0ef6-a6d8-7c8b-bbdb903af7a9@yandex-team.ru>

On Sun, Nov 20 2016, Konstantin Khlebnikov wrote:

> On 07.11.2016 23:34, Konstantin Khlebnikov wrote:
>> On Mon, Nov 7, 2016 at 10:46 PM, Shaohua Li <shli@kernel.org> wrote:
>>> On Sat, Nov 05, 2016 at 01:48:45PM +0300, Konstantin Khlebnikov wrote:
>>>> return_io() resolves request_queue even if trace point isn't active:
>>>>
>>>> static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
>>>> {
>>>>       return bdev->bd_disk->queue;    /* this is never NULL */
>>>> }
>>>>
>>>> static void return_io(struct bio_list *return_bi)
>>>> {
>>>>       struct bio *bi;
>>>>       while ((bi = bio_list_pop(return_bi)) != NULL) {
>>>>               bi->bi_iter.bi_size = 0;
>>>>               trace_block_bio_complete(bdev_get_queue(bi->bi_bdev),
>>>>                                        bi, 0);
>>>>               bio_endio(bi);
>>>>       }
>>>> }
>>>
>>> I can't see how this could happen. What kind of tests/environment are these running?
>>
>> That was a random piece of production somewhere.
>> According to the timing, all crashes happened soon after reboot.
>> There are several raids; probably some of them were still under resync.
>>
>> For now we have only a few machines with this kernel. But I'm sure that
>> I'll get much more soon =)
>
> I've added this debug patch to catch overflow of the active stripes counter in a bio:
>
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -164,6 +164,7 @@ static inline void raid5_inc_bi_active_stripes(struct bio *bio)
>   {
>          atomic_t *segments = (atomic_t *)&bio->bi_phys_segments;
>          atomic_inc(segments);
> +       BUG_ON(!(atomic_read(segments) & 0xffff));
>   }
>
> And it triggered. The counter in %edx = 0x00010000.
>
> So it looks like one bio (discard?) can cover more than 65535 stripes.

65535 stripes is about 256M.  I guess that is possible.  Christoph has
suggested that now would be a good time to stop using bi_phys_segments
like this.
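
To make the overflow concrete: as far as I can tell from the 4.4 raid5.c
helpers, bi_phys_segments is treated as two 16-bit counters, active
stripes in the low half and processed stripes in the high half.  The
sketch below is a user-space model of that, not kernel code.  The names
inc_active(), active_stripes(), processed_stripes(), overflow_demo() and
bytes_covered() are made up for illustration, and the 4096-byte stripe
size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for atomic_inc() on bio->bi_phys_segments. */
static inline uint32_t inc_active(uint32_t *segments)
{
	*segments += 1;
	return *segments;
}

/* Low 16 bits: stripes still holding a reference to the bio. */
static inline unsigned int active_stripes(uint32_t segments)
{
	return segments & 0xffff;
}

/* High 16 bits: stripes already processed. */
static inline unsigned int processed_stripes(uint32_t segments)
{
	return (segments >> 16) & 0xffff;
}

/* Attach one bio to 65536 stripes and return the raw counter value. */
static inline uint32_t overflow_demo(void)
{
	uint32_t segments = 0;
	uint32_t i;

	for (i = 0; i < 65536; i++)
		inc_active(&segments);

	/* Same condition the debug BUG_ON() in the quoted patch tests for. */
	assert((segments & 0xffff) == 0);
	return segments;
}

/* Bytes covered by a number of stripes of an assumed size. */
static inline uint64_t bytes_covered(uint64_t stripes, uint64_t stripe_bytes)
{
	return stripes * stripe_bytes;
}
```

overflow_demo() returns 0x00010000, matching the value reported in %edx:
the active count reads 0 and the carry shows up as one "processed"
stripe.  bytes_covered(65536, 4096) is 256 MiB, which is where the 256M
figure comes from under that stripe-size assumption.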

I have some patches which should fix this.  I'll post them shortly.  I'd
appreciate it if you would test and confirm that they work (and don't
break anything else).

Thanks,
NeilBrown

Thread overview: 6+ messages
2016-11-05 10:48 [BUG 4.4.26] bio->bi_bdev == NULL in raid6 return_io() Konstantin Khlebnikov
2016-11-07 19:46 ` Shaohua Li
2016-11-07 20:34   ` Konstantin Khlebnikov
2016-11-20 10:55     ` Konstantin Khlebnikov
2016-11-21  1:23       ` NeilBrown [this message]
2016-11-21 15:32         ` Konstantin Khlebnikov
