linux-fsdevel.vger.kernel.org archive mirror
From: Dmitry Monakhov <dmonakhov@openvz.org>
To: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@redhat.com>,
	Camille Moncelier <pix@devlife.org>,
	"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [ext3] Changes to block device after an ext3 mount point has been remounted readonly
Date: Wed, 24 Feb 2010 20:26:13 +0300
Message-ID: <87eikao896.fsf@openvz.org>
In-Reply-To: <20100224170506.GN3687@quack.suse.cz> (Jan Kara's message of "Wed, 24 Feb 2010 18:05:06 +0100")

Jan Kara <jack@suse.cz> writes:

> On Wed 24-02-10 10:57:59, Eric Sandeen wrote:
>> Dmitry Monakhov wrote:
>> > Jan Kara <jack@suse.cz> writes:
>> >>> The fact is that I've been able to reproduce the problem on LVM block
>> >>> devices, and sd* block devices so it's definitely not a loop device
>> >>> specific problem.
>> >>>
>> >>> By the way, I tried several things other than
>> >>> "echo s > /proc/sysrq_trigger": I tried multiple syncs followed by a
>> >>> one-minute "sleep".
>> >>>
>> >>> "echo 3 >/proc/sys/vm/drop_caches" seems to lower the chances of "hash
>> >>> changes" but doesn't stop them.
>> >>   Strange. When I use sync(1) in your script and use /dev/sda5 instead of a
>> >> /dev/loop0, I cannot reproduce the problem (was running the script for
>> >> something like an hour).
>> > Theoretically some dirty pages may still exist after the rw=>ro remount
>> > because of a generic race between write and sync, and they will be
>> > written out by writepage if the page already has buffers. This does not
>> > happen in ext4 because each time it tries to perform writepages it tries
>> > to start_journal, and this results in EROFS.
>> > The race bug will be closed some day, but a new one may appear again.
>> >
>> > Let's be honest and change ext3 writepage as follows:
>> > - check the ROFS flag inside writepage
>> > - dump writepage's errors.
>> > 
>> > 
>> 
>> sounds like the wrong approach to me, we really need to fix the root
>> cause and make remount,ro finish the job, I think.
Of course, but still. This is just a sanity check. A similar check
in ext4 helped me find the generic issue. Of course, it has to
be guarded by an unlikely() statement.
>> 
>> Throwing away writes which an application already thinks are completed
>> just because remount,ro didn't keep up sounds like a bad idea.  I think
>> I would much rather have the write complete shortly after the readonly
>> transition, if I had to choose...
>   Well, my opinion is that VFS should take care about the rw->ro transition
> so that it isn't racy...
No, my patch just tries to nail the RO semantics into writepage.
Since other places are already guarded by start_journal, writepage is
the only one which may have a weakness.
About the ENOSPC/EDQUOT spam: it may not be bad to print an error message
for a crazy person who uses mmap on a sparse file.
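
The sanity check discussed above can be illustrated with a small userspace
sketch. This is not the actual patch and not kernel code: mock_sb, mock_page
and mock_writepage are hypothetical stand-ins for struct super_block,
struct page and the ext3 writepage path, and unlikely() is redefined here on
top of the GCC/Clang builtin. It only shows the shape of the idea: refuse
with EROFS when the filesystem is read-only, and report allocation failures
instead of dropping them silently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Same idea as the kernel's unlikely(): hint that the branch is rare. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for struct super_block. */
struct mock_sb {
	bool read_only;   /* models the read-only mount flag */
	int space_left;   /* blocks still available for allocation */
};

/* Hypothetical stand-in for struct page. */
struct mock_page {
	bool dirty;
	bool has_buffers; /* blocks already mapped, no allocation needed */
};

/* Returns 0 on success or a negative errno; the page stays dirty on error. */
static int mock_writepage(struct mock_sb *sb, struct mock_page *page)
{
	/* Sanity check: no page should reach writeback on a read-only fs. */
	if (unlikely(sb->read_only)) {
		fprintf(stderr, "writepage: dirty page on read-only fs\n");
		return -EROFS;
	}

	/* A page without buffers (e.g. mmap write into a hole) needs a block. */
	if (!page->has_buffers) {
		if (sb->space_left == 0) {
			fprintf(stderr, "writepage: block allocation failed\n");
			return -ENOSPC;
		}
		sb->space_left--;
		page->has_buffers = true;
	}

	page->dirty = false;
	return 0;
}
```

In this sketch the page is left dirty when the check fires, so the data is
reported rather than discarded; what the real code should do with such a page
is exactly the open question in this thread.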
>
>> I haven't looked at these paths at all but just hand-wavily,
>> remount,ro should follow pretty much the same path as freeze,
>> I think.  And if freeze isn't getting everything on-disk we have
>> an even bigger problem.
>   With freeze you can still keep dirty data in cache until the filesystem
> unfreezes so it's a different situation from rw->ro transition.
In fact, freeze is also not absolutely I/O-proof :)
When I worked on a COW device, I used freeze-fs for consistent
image creation, and sometimes after the filesystem was frozen
I still got bios. We did not investigate this too deeply
and just queued those bios into a pending queue.
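
A rough userspace sketch of that workaround, under loud assumptions: the
names mock_bio, mock_cow_dev, cow_submit and cow_thaw are hypothetical, the
queue is a fixed-size array, and "writing" is modelled as bumping a counter.
The real COW driver is not shown in this thread, so this only illustrates
the idea of deferring late bios until the freeze is lifted.

```c
#include <stdbool.h>
#include <stddef.h>

#define PENDING_MAX 16

/* Hypothetical stand-in for struct bio. */
struct mock_bio {
	unsigned long sector;
};

/* Hypothetical COW device state. */
struct mock_cow_dev {
	bool frozen;                          /* snapshot creation in progress */
	struct mock_bio pending[PENDING_MAX]; /* bios that arrived while frozen */
	size_t npending;
	size_t nwritten;                      /* bios passed through to the disk */
};

/* Returns 0 if written immediately, 1 if deferred, -1 if the queue is full. */
static int cow_submit(struct mock_cow_dev *dev, struct mock_bio bio)
{
	if (dev->frozen) {
		if (dev->npending == PENDING_MAX)
			return -1;
		dev->pending[dev->npending++] = bio;
		return 1;
	}
	dev->nwritten++;
	return 0;
}

/* Lifts the freeze and replays everything deferred; returns the count. */
static size_t cow_thaw(struct mock_cow_dev *dev)
{
	size_t flushed = dev->npending;

	dev->frozen = false;
	dev->nwritten += flushed;
	dev->npending = 0;
	return flushed;
}
```

Deferring keeps the image consistent without having to find out where the
late bios come from, which matches the "did not investigate" approach above.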

>
> 								Honza

