From: Jan Kara <jack@suse.cz>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Alasdair G Kergon <agk@redhat.com>, Jan Kara <jack@suse.cz>,
	esandeen@redhat.com, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, linux-fsdevel@vger.kernel.org,
	Christopher Chaltain <christopher.chaltain@canonical.com>,
	Valerie Aurora <val@vaaconsulting.com>
Subject: Re: [dm-devel] [PATCH] deadlock with suspend and quotas
Date: Thu, 1 Dec 2011 01:34:07 +0100
Message-ID: <20111201003407.GH4541@quack.suse.cz>
In-Reply-To: <Pine.LNX.4.64.1111301123320.5759@hs20-bc2-1.build.redhat.com>

On Wed 30-11-11 11:34:23, Mikulas Patocka wrote:
> On Wed, 30 Nov 2011, Alasdair G Kergon wrote:
> > On Tue, Nov 29, 2011 at 11:19:01AM +0100, Jan Kara wrote:
> > > On Mon 28-11-11 18:32:18, Mikulas Patocka wrote:
> > > > - skipping sync on frozen filesystem violates sync semantics. 
> > > > Applications, such as databases, assume that when sync finishes, data were 
> > > > written to stable storage. If we skip sync when the filesystem is frozen, 
> > > > we can cause data corruption in these applications (if the system crashes 
> > > > after we skipped a sync).
> > 
> > >   Here I don't agree. Filesystem must guarantee there are no dirty data on
> > > a frozen filesystem. Ext4 and XFS do this, ext3 would need proper
> > > page_mkwrite() implementation for this but that's the problem of ext3, not
> > > freezing code in general. If there are no dirty data, sync code (and also
> > > flusher thread) is free to return without doing anything.
> >  
> > Consider, during a 'create a snapshot' operation:
> >    I/O flow:  application -> filesystem -> LV -> disk
> > 
> > dm lockfs is issued by LVM.
> >   When this returns, the filesystem should be locked i.e. not issue any
> >   further I/O to the LV.  (But if it did happen to issue I/O, it
> >   wouldn't be a problem, as it would just get queued by dm and have no
> >   impact on the snapshot creation operation.)
> > 
> > The application is still running and might still be issuing writes to
> > the filesystem and might itself issue 'sync'.  But a 'sync' would only
> > be meaningful for already-completed writes and the lockfs process should
> > have already seen that they have hit disk.  So a sync issued while a
> > device is locked can always be skipped.  Have I missed something in this
> > reasoning, Mikulas?
> > 
> > Alasdair
> 
> You can't skip sync.
> 
> The problem is this (assume that you have non-journaled filesystem):
> 
> - A process issues a write() call. The write call goes to 
> __generic_file_aio_write, suppose that the process gets immediately past 
> "vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);" and then is rescheduled.
> 
> - You suspend the filesystem
> 
> - The process that issued write() is scheduled to run; note that it already 
> passed "vfs_check_frozen", so it goes on even on a suspended filesystem. 
> This process creates a dirty page in the page cache.
> 
> - The process eventually returns to userspace from the write() syscall, 
> the filesystem is still suspended. write() doesn't guarantee that the data 
> hit disk, so there is no problem so far.
> 
> - The application calls sync. Now, if you skip sync on the suspended 
> filesystem, you violate sync semantics: when a process calls write() and 
> sync(), it can assume that data are written to stable storage.
> 
> So, in order to keep sync working, you must wait for the filesystem to 
> thaw and then write the dirty data.
  So now that we have established a difference between freezing a device and
freezing a filesystem, I think the answer is that sync can skip frozen
filesystems but cannot skip unfrozen filesystems on frozen devices. Agreed?
We'd need to change the freezing code to distinguish these two cases but
that's not hard...
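
  To make the proposed rule concrete, here is a minimal sketch of the
decision in plain userspace C. It is only a model of the rule stated above,
not kernel code: sync_decide(), enum sync_action and the two flags are
hypothetical names invented for this illustration, and the real freezing
code does not currently expose the two states separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the proposed rule; none of these names exist in the kernel. */
enum sync_action {
	SYNC_SKIP,		/* fs is frozen, so it holds no dirty data */
	SYNC_WRITE,		/* normal case: write dirty data now */
	SYNC_WAIT_FOR_THAW,	/* device suspended: wait, then write */
};

/*
 * fs_frozen:  the filesystem itself was frozen and has flushed everything.
 * dev_frozen: only the underlying device is suspended; the filesystem may
 *             still pick up dirty pages from writers that raced with it.
 */
static enum sync_action sync_decide(bool fs_frozen, bool dev_frozen)
{
	if (fs_frozen)
		return SYNC_SKIP;
	if (dev_frozen)
		return SYNC_WAIT_FOR_THAW;
	return SYNC_WRITE;
}

/* The three cases discussed in this thread, written as checks. */
static void sync_decide_examples(void)
{
	assert(sync_decide(true, false) == SYNC_SKIP);
	assert(sync_decide(false, true) == SYNC_WAIT_FOR_THAW);
	assert(sync_decide(false, false) == SYNC_WRITE);
}
```

  The second case is the one from Mikulas' write() race: the filesystem is
not frozen, so dirty pages can exist and sync has to wait for the device.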

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
