From: Jeff Moyer <jmoyer@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: James Bottomley <James.Bottomley@suse.de>,
	device-mapper development <dm-devel@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [dm-devel] [PATCH] Fix over-zealous flush_disk when changing device size.
Date: Thu, 17 Mar 2011 13:33:38 -0400
Message-ID: <x49ipvhslcd.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <20110317122833.30077397@notabene.brown> (NeilBrown's message of "Thu, 17 Mar 2011 12:28:33 +1100")

NeilBrown <neilb@suse.de> writes:

> On Wed, 16 Mar 2011 16:30:22 -0400 Jeff Moyer <jmoyer@redhat.com> wrote:
>
>> NeilBrown <neilb@suse.de> writes:
>> 
>> >> Synchronous notification of errors.  If we don't try to write everything
>> >> back immediately after the size change, we don't see dirty pages in
>> >> zapped regions until the writeout/page cache management takes it into
>> >> its head to try to clean the pages.
>> >> 
>> >
>> > So if you just want synchronous errors, I think you want:
>> >     fsync_bdev()
>> >
>> > which calls sync_filesystem() if it can find a filesystem, else
>> > sync_blockdev();  (sync_filesystem itself calls sync_blockdev too).
>> 
>> ... which deadlocks md.  ;-)  writeback_inodes_sb_nr is waiting for the
>> flusher thread to write back the dirty data.  The flusher thread is
>> stuck in md_write_start, here:
>> 
>>         wait_event(mddev->sb_wait,
>>                    !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>> 
>> This is after reverting your change, and replacing the flush_disk call
>> in check_disk_size_change with a call to fsync_bdev.  I'm not familiar
>> enough with md to really suggest a way forward.  Neil?
>
> That would be quite easy to avoid.
> Just call
>    md_write_start()
> before revalidate_disk, and
>    md_write_end()
> afterwards.

That does not avoid the problem (if I understood your suggestion
correctly).  Instead, the md thread itself ends up blocked in
md_write_start:

INFO: task md127_raid5:2282 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md127_raid5     D ffff88011c72d0a0  5688  2282      2 0x00000080
 ffff880118997c20 0000000000000046 ffff880100000000 0000000000000246
 0000000000014d00 ffff88011c72cb10 ffff88011c72d0a0 ffff880118997fd8
 ffff88011c72d0a8 0000000000014d00 ffff880118996010 0000000000014d00
Call Trace:
 [<ffffffff8138bbbd>] md_write_start+0xad/0x1d0
 [<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0311558>] raid5_finish_reshape+0x98/0x1e0 [raid456]
 [<ffffffff8138a933>] reap_sync_thread+0x63/0x130
 [<ffffffff8138c8b6>] md_check_recovery+0x1f6/0x6f0
 [<ffffffffa03150ab>] raid5d+0x3b/0x610 [raid456]
 [<ffffffff810804c9>] ? prepare_to_wait+0x59/0x90
 [<ffffffff81387ee9>] md_thread+0x119/0x150
 [<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81387dd0>] ? md_thread+0x0/0x150
 [<ffffffff8107fb56>] kthread+0x96/0xa0
 [<ffffffff8100cc04>] kernel_thread_helper+0x4/0x10
 [<ffffffff8107fac0>] ? kthread+0x0/0xa0
 [<ffffffff8100cc00>] ? kernel_thread_helper+0x0/0x10
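
If I am reading the trace correctly, the task stuck in md_write_start
is the md thread itself (raid5d -> md_check_recovery ->
reap_sync_thread -> raid5_finish_reshape), and it is presumably
sleeping in the same wait on MD_CHANGE_PENDING that I quoted earlier.
Assuming that flag is only ever cleared by the md thread when it writes
out the superblock, nothing is left to wake it.  Below is a toy
userspace model of that reasoning.  It is not kernel code: every type
and function name in it is made up for illustration, and both the
"only the md thread clears the flag" rule and the "flag is set on
entry" rule are my assumptions, not something I have verified in md.

```c
#include <stdbool.h>

/* Toy userspace model of the hang in the trace above.  Not kernel
 * code: every type and function here is hypothetical. */

enum model_caller { MODEL_OTHER_TASK, MODEL_MD_THREAD };

struct model_mddev {
	bool change_pending;	/* stands in for MD_CHANGE_PENDING */
};

/* Models the md thread writing the superblock and clearing the flag.
 * ASSUMPTION: nothing else ever clears it. */
void model_md_thread_update_sb(struct model_mddev *m)
{
	m->change_pending = false;
}

/* Returns true if the modelled wait can finish, false if it would
 * hang.  ASSUMPTION: the flag is set when the call is made. */
bool model_write_start_completes(struct model_mddev *m,
				 enum model_caller who)
{
	m->change_pending = true;

	/* The md thread can only service the request if it is not the
	 * task that is sleeping in the wait. */
	if (who != MODEL_MD_THREAD)
		model_md_thread_update_sb(m);

	return !m->change_pending;
}
```

In this model a call made as MODEL_OTHER_TASK completes, while a call
made as MODEL_MD_THREAD never sees the flag cleared, which would be
consistent with the hung-task report above.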

I'll leave this to you to work out when you have time.

Cheers,
Jeff
