From: Jan Kara <jack@suse.cz>
To: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
Camille Moncelier <pix@devlife.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
ext4 development <linux-ext4@vger.kernel.org>,
viro@zeniv.linux.org.uk
Subject: Re: [ext3] Changes to block device after an ext3 mount point has been remounted readonly
Date: Tue, 2 Mar 2010 14:26:43 +0100
Message-ID: <20100302132643.GB3829@quack.suse.cz>
In-Reply-To: <874okz6nzj.fsf@openvz.org>
On Tue 02-03-10 13:01:52, Dmitry Monakhov wrote:
> Christoph Hellwig <hch@lst.de> writes:
> >> Al, Christoph, am I missing something, or is there really nothing that
> >> prevents a process from opening a file after the fs_may_remount_ro() check
> >> in do_remount_sb()?
> >
> > No, there is nothing. We really do need a multi-stage remount read-only
> > process:
> >
> > 1) stop any writes from userland, that is opening new files writeable
> This is not quite a good idea, because the sync may take a really long time:
> #fsstress -p32 -d /mnt/TEST -l9999999 -n99999999 -z -f creat=100 -f write=100
> #sleep 60;
> #killall -9 fsstress
> #time mount mnt -oremount,ro
> It takes several minutes to complete.
> And at the end it may fail for some other reason.
Two points here:
1) The current writeback code has a bug: while we are umounting/remounting,
sync_filesystem() degrades to doing all writeback in sync mode
(because any non-sync writeback fails to get the s_umount sem for reading
and thus skips all the inodes of the superblock). This has a considerable
impact on the speed of sync during umount / remount.
2) IMHO it's not bad to block all opens for writing while remounting RO
(and thus also during the sync). It's not a performance issue (remounting
RO does not happen often), and it won't confuse any application even if
we later decide we cannot finish the remount. Surely we'd have to
come up with a better waiting scheme than just cpu_relax() in
mnt_want_write(), but that shouldn't be hard. The only thing I'm slightly
worried about is whether we'd hit some locking issues (i.e., a caller
of mnt_want_write() holding some lock needed to finish the remount...).
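To make point 2 a bit more concrete, here is a minimal userspace sketch of what I mean by a better waiting scheme: writers sleep on a condition variable while a remount is in flight instead of spinning. This is only a toy model in Python, not kernel code; MountGate, want_write(), drop_write() and remount_ro() are hypothetical names that loosely mirror mnt_want_write() and friends, and the timeout is purely a model-side escape hatch for the locking worry mentioned above.

```python
import threading


class MountGate:
    """Toy model of a per-mount write gate (hypothetical, not kernel code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._writers = 0          # writers between want_write() and drop_write()
        self._remounting = False   # a remount-RO attempt is in flight
        self.read_only = False

    def want_write(self, timeout=None):
        """Sleep while a remount is in flight.

        Returns True if write access was granted, False if the filesystem
        is (or just became) read-only, or if the wait timed out.
        """
        with self._cond:
            done = self._cond.wait_for(lambda: not self._remounting, timeout)
            if not done or self.read_only:
                return False
            self._writers += 1
            return True

    def drop_write(self):
        """Release write access and wake a remount waiting for writers to drain."""
        with self._cond:
            self._writers -= 1
            self._cond.notify_all()

    def remount_ro(self, sync, timeout=None):
        """Block new writers, drain existing ones, sync, then go read-only.

        `sync` is a callable returning True on success. If draining times
        out or the sync fails, the remount is abandoned and blocked
        writers are released as if nothing had happened.
        """
        with self._cond:
            if self._remounting or self.read_only:
                return False
            self._remounting = True
            drained = self._cond.wait_for(lambda: self._writers == 0, timeout)
        # The sync runs without the gate lock held; new writers sleep meanwhile.
        ok = drained and bool(sync())
        with self._cond:
            self.read_only = ok
            self._remounting = False
            self._cond.notify_all()
        return ok
```

A writer that calls want_write() during the sync simply sleeps until the remount either completes (and then gets a refusal) or is abandoned (and then proceeds), so the application never sees a transient error from a remount that did not happen.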
> > 2) stop any periodic writeback from the VM or filesystem-internal
> > 3) write out all filesystem data and metadata
> > 4) mark the filesystem fully read-only
>
> I've tried to solve the issue in a slightly different way.
> Please take a look at this:
> http://marc.info/?l=linux-fsdevel&m=126723036525624&w=2
> 1) Mark the fs as GOING_TO_REMOUNT.
> 2) Any new writer will clear this flag.
> This allows us not to block.
> 3) Check the flag before and after fssync, and return EBUSY if it was cleared.
> 4) At this point we may block writers (this is absent in my patch).
> It is acceptable to block writers at this point because the later stages
> don't take too long.
> 5) Perform the fs-specific remount method.
> 6) Mark the filesystem as MS_RDONLY.
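For reference, the quoted scheme can be sketched roughly as follows. This is a single-threaded toy model in Python based only on the six steps above, not on the actual patch; FlagRemount, writer_touch() and the sync callback are hypothetical names, and steps 4 and 5 are left out.

```python
import errno


class FlagRemount:
    """Toy model of the GOING_TO_REMOUNT flag scheme (hypothetical names)."""

    def __init__(self):
        self.going_to_remount = False
        self.read_only = False

    def writer_touch(self):
        """A new writer is never blocked; it just clears the flag (step 2)."""
        if self.read_only:
            return -errno.EROFS
        self.going_to_remount = False
        return 0

    def remount_ro(self, sync):
        """Return 0 on success or -EBUSY if a writer interfered."""
        self.going_to_remount = True              # step 1
        # Step 3, first check: with real concurrency a writer may already
        # have cleared the flag here; in this model it always passes.
        if not self.going_to_remount:
            return -errno.EBUSY
        sync()                                    # write everything out
        # Step 3, second check: a writer showed up during the sync.
        if not self.going_to_remount:
            return -errno.EBUSY
        # Steps 4 and 5 (blocking writers, fs-specific remount) are omitted.
        self.going_to_remount = False
        self.read_only = True                     # step 6
        return 0
```

In this model a single writer_touch() call made while the sync callback runs is enough to make remount_ro() return -EBUSY, which is the window discussed in the reply that follows.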
I like my solution more, since with it the admin does not have to go
hunting for an application which keeps touching the filesystem while he is
trying to remount it read-only (currently, using lsof is usually enough, but
after your changes, running something like "while true; do touch /mnt/;
done" has a much larger window in which to stop the remount to RO).
But in principle your solution is acceptable to me as well.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR