From: Jeff Moyer <jmoyer@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: James Bottomley <James.Bottomley@suse.de>,
device-mapper development <dm-devel@redhat.com>,
Jens Axboe <axboe@kernel.dk>,
linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: [dm-devel] [PATCH] Fix over-zealous flush_disk when changing device size.
Date: Thu, 17 Mar 2011 13:33:38 -0400
Message-ID: <x49ipvhslcd.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <20110317122833.30077397@notabene.brown> (NeilBrown's message of "Thu, 17 Mar 2011 12:28:33 +1100")

NeilBrown <neilb@suse.de> writes:

> On Wed, 16 Mar 2011 16:30:22 -0400 Jeff Moyer <jmoyer@redhat.com> wrote:
>
>> NeilBrown <neilb@suse.de> writes:
>>
>> >> Synchronous notification of errors. If we don't try to write everything
>> >> back immediately after the size change, we don't see dirty pages in
>> >> zapped regions until the writeout/page cache management takes it into
>> >> its head to try to clean the pages.
>> >>
>> >
>> > So if you just want synchronous errors, I think you want:
>> > fsync_bdev()
>> >
>> > which calls sync_filesystem() if it can find a filesystem, else
>> > sync_blockdev(); (sync_filesystem itself calls sync_blockdev too).
>>
>> ... which deadlocks md. ;-) writeback_inodes_sb_nr is waiting for the
>> flusher thread to write back the dirty data. The flusher thread is
>> stuck in md_write_start, here:
>>
>> 	wait_event(mddev->sb_wait,
>> 		   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>>
>> This is after reverting your change, and replacing the flush_disk call
>> in check_disk_size_change with a call to fsync_bdev. I'm not familiar
>> enough with md to really suggest a way forward. Neil?
>
> That would be quite easy to avoid.
> Just call
> md_write_start()
> before revalidate_disk, and
> md_write_end()
> afterwards.

That does not avoid the problem (if I understood your suggestion).  You
instead end up with the following:

INFO: task md127_raid5:2282 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
md127_raid5 D ffff88011c72d0a0 5688 2282 2 0x00000080
ffff880118997c20 0000000000000046 ffff880100000000 0000000000000246
0000000000014d00 ffff88011c72cb10 ffff88011c72d0a0 ffff880118997fd8
ffff88011c72d0a8 0000000000014d00 ffff880118996010 0000000000014d00
Call Trace:
[<ffffffff8138bbbd>] md_write_start+0xad/0x1d0
[<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa0311558>] raid5_finish_reshape+0x98/0x1e0 [raid456]
[<ffffffff8138a933>] reap_sync_thread+0x63/0x130
[<ffffffff8138c8b6>] md_check_recovery+0x1f6/0x6f0
[<ffffffffa03150ab>] raid5d+0x3b/0x610 [raid456]
[<ffffffff810804c9>] ? prepare_to_wait+0x59/0x90
[<ffffffff81387ee9>] md_thread+0x119/0x150
[<ffffffff810801d0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81387dd0>] ? md_thread+0x0/0x150
[<ffffffff8107fb56>] kthread+0x96/0xa0
[<ffffffff8100cc04>] kernel_thread_helper+0x4/0x10
[<ffffffff8107fac0>] ? kthread+0x0/0xa0
[<ffffffff8100cc00>] ? kernel_thread_helper+0x0/0x10
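
One possible reading of the trace above, offered as a sketch and not a confirmed diagnosis: md127_raid5 is the thread that would normally write the superblock and clear MD_CHANGE_PENDING, so when it calls md_write_start() itself (via raid5_finish_reshape) it ends up waiting for work that only it can do. The toy C model below illustrates that shape. Every name in it (struct toy_mddev, enum toy_task, toy_write_start_can_finish()) is hypothetical and is not md code; it only models who waits on whom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types do not exist in the kernel. */
enum toy_task { TASK_FLUSHER, TASK_MD_THREAD };

struct toy_mddev {
	bool change_pending;     /* stands in for MD_CHANGE_PENDING */
	enum toy_task sb_writer; /* assumed: the only task that clears the flag */
};

/*
 * Models the wait in md_write_start(): the caller raises the flag and then
 * sleeps until the superblock writer clears it.  Returns true if the caller
 * can ever be woken, false if it would sleep forever.
 */
static bool toy_write_start_can_finish(struct toy_mddev *m, enum toy_task caller)
{
	assert(m != NULL);

	m->change_pending = true;	/* request a superblock update */

	if (caller == m->sb_writer)
		return false;		/* the writer is asleep waiting on itself */

	/* Some other task is free to run: it clears the flag and wakes us. */
	m->change_pending = false;
	return true;
}
```

With sb_writer set to TASK_MD_THREAD, a call from TASK_FLUSHER returns true and leaves the flag clear, while a call from TASK_MD_THREAD returns false and leaves the flag set, which is the state the hung-task report would correspond to under this reading.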

I'll leave this to you to work out when you have time.

Cheers,
Jeff