From: Dmitry Monakhov <dmonakhov@openvz.org>
To: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@redhat.com>,
	Camille Moncelier <pix@devlife.org>,
	"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [ext3] Changes to block device after an ext3 mount point has been remounted readonly
Date: Wed, 24 Feb 2010 20:26:13 +0300
Message-ID: <87eikao896.fsf@openvz.org>
In-Reply-To: <20100224170506.GN3687@quack.suse.cz> (Jan Kara's message of "Wed, 24 Feb 2010 18:05:06 +0100")

Jan Kara <jack@suse.cz> writes:

> On Wed 24-02-10 10:57:59, Eric Sandeen wrote:
>> Dmitry Monakhov wrote:
>> > Jan Kara <jack@suse.cz> writes:
>> >>> The fact is that I've been able to reproduce the problem on LVM block
>> >>> devices, and sd* block devices so it's definitely not a loop device
>> >>> specific problem.
>> >>>
>> >>> By the way, I tried several things other than
>> >>> "echo s > /proc/sysrq_trigger": I tried multiple syncs followed by a
>> >>> one-minute "sleep".
>> >>>
>> >>> "echo 3 >/proc/sys/vm/drop_caches" seems to lower the chances of "hash
>> >>> changes" but doesn't stop them.
>> >>   Strange. When I use sync(1) in your script and use /dev/sda5 instead of a
>> >> /dev/loop0, I cannot reproduce the problem (was running the script for
>> >> something like an hour).
>> > Theoretically, some dirty pages may remain after the rw=>ro remount
>> > because of a generic race between write and sync, and they will be
>> > written out by writepage if the page already has buffers. This does not
>> > happen in ext4 because each time it tries to perform writepages it tries
>> > to start_journal, and this results in EROFS.
>> > The race bug will be closed some day, but a new one may appear again.
>> > 
>> > Let's be honest and change ext3 writepage as follows:
>> > - check the RO flag inside writepage
>> > - dump writepage's errors.
>> > 
>> > 
>> 
>> sounds like the wrong approach to me, we really need to fix the root
>> cause and make remount,ro finish the job, I think.
Of course, but still: this is just a sanity check. A similar check
in ext4 helped me find the generic issue. Of course, it has to
be guarded by an unlikely() statement.
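
For illustration only, a minimal user-space sketch of such a sanity check.
This is not the proposed patch and not kernel code: model_sb, model_page,
MODEL_RDONLY and model_writepage are hypothetical stand-ins, and unlikely()
is defined locally through the GCC/Clang __builtin_expect builtin. The sketch
assumes the page simply stays dirty when the write is refused.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Branch-prediction hint in the style of the kernel's unlikely() (GCC/Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define MODEL_RDONLY 0x1u /* hypothetical stand-in for the read-only mount flag */

struct model_sb {
	unsigned int flags;
};

struct model_page {
	struct model_sb *sb;
	int dirty;   /* 1 while the page still holds unwritten data */
	int written; /* how many times the page reached the "device" */
};

/* Returns 0 on success or a negative errno, as a writepage would. */
static int model_writepage(struct model_page *page)
{
	assert(page != NULL && page->sb != NULL);

	if (unlikely(page->sb->flags & MODEL_RDONLY)) {
		/* Report the stray write instead of silently touching the device. */
		fprintf(stderr, "writepage called on read-only fs, refusing\n");
		return -EROFS; /* assumption: the page is left dirty */
	}

	page->written++;
	page->dirty = 0;
	return 0;
}
```

The check costs one flag test on the hot path, which is why the hint marks
the read-only branch as the rare one.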
>> 
>> Throwing away writes which an application already thinks are completed
>> just because remount,ro didn't keep up sounds like a bad idea.  I think
>> I would much rather have the write complete shortly after the readonly
>> transition, if I had to choose...
>   Well, my opinion is that VFS should take care about the rw->ro transition
> so that it isn't racy...
No, my patch just tries to nail the RO semantics into writepage.
Since other places are already guarded by start_journal, writepage is
the only one which may have a weakness.
About the ENOSPC/EDQUOT spam: it may not be bad to print an error message
for a crazy person who uses mmap on a sparse file.
>
>> I haven't looked at these paths at all but just hand-wavily,
>> remount,ro should follow pretty much the same path as freeze,
>> I think.  And if freeze isn't getting everything on-disk we have
>> an even bigger problem.
>   With freeze you can still keep dirty data in cache until the filesystem
> unfreezes so it's a different situation from rw->ro transition.
In fact, freeze is also not absolutely I/O-proof :)
When I worked on a COW device, I used freeze-fs for consistent
image creation, and sometimes after the filesystem was frozen
I still got bios. We did not investigate this too deeply
and just queued the bios into a pending queue.
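
A minimal sketch of that "queue into a pending queue" workaround, as a
user-space model. Nothing here comes from the actual COW driver: model_bio,
model_cow_dev, model_submit, model_thaw and the fixed queue size are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 16 /* arbitrary bound chosen for the sketch */

struct model_bio {
	unsigned long sector;
};

struct model_cow_dev {
	bool frozen;
	struct model_bio pending[MODEL_MAX_PENDING];
	size_t npending;  /* bios held back while frozen */
	size_t submitted; /* bios passed down to the "device" */
};

/* Returns 0 if submitted, 1 if queued, -1 if the pending queue is full. */
static int model_submit(struct model_cow_dev *dev, struct model_bio bio)
{
	assert(dev != NULL);

	if (dev->frozen) {
		if (dev->npending == MODEL_MAX_PENDING)
			return -1;
		dev->pending[dev->npending++] = bio; /* hold until thaw */
		return 1;
	}
	dev->submitted++;
	return 0;
}

/* Thaw the device and replay held bios in arrival order. */
static size_t model_thaw(struct model_cow_dev *dev)
{
	size_t replayed = dev->npending;

	dev->frozen = false;
	for (size_t i = 0; i < replayed; i++)
		dev->submitted++; /* a real driver would issue dev->pending[i] here */
	dev->npending = 0;
	return replayed;
}
```

Holding the late bios keeps the image consistent without explaining where
they came from, which matches the "did not investigate too deeply" caveat.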

>
> 								Honza
