Re: [dm-devel] [PATCH] Fix over-zealous flush_disk when changing device size.

From: Jeff Moyer <jmoyer@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: James Bottomley <James.Bottomley@suse.de>,
	device-mapper development <dm-devel@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [dm-devel] [PATCH] Fix over-zealous flush_disk when changing device size.
Date: Thu, 17 Mar 2011 13:33:38 -0400
Message-ID: <x49ipvhslcd.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <20110317122833.30077397@notabene.brown> (NeilBrown's message of "Thu, 17 Mar 2011 12:28:33 +1100")

NeilBrown <neilb@suse.de> writes:

> On Wed, 16 Mar 2011 16:30:22 -0400 Jeff Moyer <jmoyer@redhat.com> wrote:
>
>> NeilBrown <neilb@suse.de> writes:
>> 
>> >> Synchronous notification of errors.  If we don't try to write everything
>> >> back immediately after the size change, we don't see dirty pages in
>> >> zapped regions until the writeout/page cache management takes it into
>> >> its head to try to clean the pages.
>> >> 
>> >
>> > So if you just want synchronous errors, I think you want:
>> >     fsync_bdev()
>> >
>> > which calls sync_filesystem() if it can find a filesystem, else
>> > sync_blockdev();  (sync_filesystem itself calls sync_blockdev too).
>> 
>> ... which deadlocks md.  ;-)  writeback_inodes_sb_nr is waiting for the
>> flusher thread to write back the dirty data.  The flusher thread is
>> stuck in md_write_start, here:
>> 
>>         wait_event(mddev->sb_wait,
>>                    !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>> 
>> This is after reverting your change, and replacing the flush_disk call
>> in check_disk_size_change with a call to fsync_bdev.  I'm not familiar
>> enough with md to really suggest a way forward.  Neil?
>
> That would be quite easy to avoid.
> Just call
>    md_write_start()
> before revalidate_disk, and
>    md_write_end()
> afterwards.

That does not avoid the problem (if I understood your suggestion
correctly).  Instead, the md thread itself ends up blocked in
md_write_start:

INFO: task md127_raid5:2282 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md127_raid5     D ffff88011c72d0a0  5688  2282      2 0x00000080
 ffff880118997c20 0000000000000046 ffff880100000000 0000000000000246
 0000000000014d00 ffff88011c72cb10 ffff88011c72d0a0 ffff880118997fd8
 ffff88011c72d0a8 0000000000014d00 ffff880118996010 0000000000014d00
Call Trace:
 [<ffffffff8138bbbd>] md_write_start+0xad/0x1d0
 [<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0311558>] raid5_finish_reshape+0x98/0x1e0 [raid456]
 [<ffffffff8138a933>] reap_sync_thread+0x63/0x130
 [<ffffffff8138c8b6>] md_check_recovery+0x1f6/0x6f0
 [<ffffffffa03150ab>] raid5d+0x3b/0x610 [raid456]
 [<ffffffff810804c9>] ? prepare_to_wait+0x59/0x90
 [<ffffffff81387ee9>] md_thread+0x119/0x150
 [<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81387dd0>] ? md_thread+0x0/0x150
 [<ffffffff8107fb56>] kthread+0x96/0xa0
 [<ffffffff8100cc04>] kernel_thread_helper+0x4/0x10
 [<ffffffff8107fac0>] ? kthread+0x0/0xa0
 [<ffffffff8100cc00>] ? kernel_thread_helper+0x0/0x10
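
One reading of the trace, sketched as a small user-space model rather
than kernel code: md127_raid5 is sleeping in md_write_start, reached via
raid5_finish_reshape from md_check_recovery.  The sketch ASSUMES, without
checking it against the md source here, that the md thread is also the
task responsible for clearing MD_CHANGE_PENDING, in which case it is
waiting on itself.  Every name below (model_mddev, model_task,
model_waits_on_self) is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel code: all names are illustrative. */
enum model_task { MODEL_FLUSHER, MODEL_MD_THREAD };

struct model_mddev {
	bool change_pending;		/* stands in for MD_CHANGE_PENDING */
	enum model_task sb_updater;	/* ASSUMPTION: only this task clears it */
};

/*
 * Models the wait_event() in md_write_start(): the caller sleeps until
 * change_pending is cleared.  Returns true when the caller is the very
 * task expected to clear the flag, i.e. nobody is left to wake it.
 */
static bool model_waits_on_self(const struct model_mddev *m,
				enum model_task caller)
{
	assert(m != NULL);
	if (!m->change_pending)
		return false;		/* flag already clear, no wait */
	return caller == m->sb_updater;
}
```

With change_pending set and sb_updater set to MODEL_MD_THREAD, the
function returns true for MODEL_MD_THREAD, which is the situation the
trace appears to show.  It only detects a task waiting on itself; a
false result does not mean the caller cannot hang for other reasons.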

I'll leave this to you to work out when you have time.

Cheers,
Jeff
