All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Kamal Mostafa <kamal@canonical.com>, Jan Kara <jack@suse.cz>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Matthew Wilcox <matthew@wil.cx>,
	Randy Dunlap <rdunlap@xenotime.net>, Theodore Tso <tytso@mit.edu>,
	linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Surbhi Palande <csurbhi@gmail.com>,
	Valerie Aurora <val@vaaconsulting.com>,
	Christopher Chaltain <christopher.chaltain@canonical.com>,
	"Peter M. Petrakis" <peter.petrakis@canonical.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Surbhi Palande <surbhi.palande@canonical.com>
Subject: Re: [PATCH v2 1/7] Adding support to freeze and unfreeze a journal
Date: Tue, 10 Jan 2012 22:31:04 +0100	[thread overview]
Message-ID: <20120110213104.GI4516@quack.suse.cz> (raw)
In-Reply-To: <4F0C9D87.8010006@sandeen.net>

On Tue 10-01-12 14:20:23, Eric Sandeen wrote:
> On 12/8/11 12:04 PM, Kamal Mostafa wrote:
> > From: Surbhi Palande <surbhi.palande@canonical.com>
> > 
> > The journal should be frozen when a filesystem freezes. What this means is
> > that until the filesystem is thawed again, no new transactions should be
> > accepted by the journal. When the filesystem thaws, inturn it should thaw the
> > journal and this should allow the journal to resume accepting new
> > transactions. While the the filesystem has frozen the journal, the clients of
> > the journal on calling jbd2_journal_start() will sleep on a wait queue.
> > Thawing the journal will wake up the sleeping clients and journalling can
> > progress normally.
> > 
> > An example of the race condition that can happen without this patch is as
> > follows:
> > 
> > Say the filesystem is thawed when we begin. Let tx be the time at unit x
> > 
> > P1: Process doing an aio write
> > t1) ext4_file_write()
> >   t2) generic_file_aio_write()
> >     t3) __generic_file_aio_write()
> >       // filesystem is not frozen, so we do not block in the next check.
> >       t4) vfs_check_frozen()
> >       t5) generic_write_checks()
> > ----------------- Prempted------------------
> > 
> > P2: Process that does filesystem freeze
> > 
> > t6) freeze_super()
> >   t7) sync_filesystem()
> >   t8) sync_blockdev()
> >   t9) sb->s_op->freeze_fs() (= ext4_freeze)
> >     t10) jbd2_journal_lock_updates()
> >     t11) jbd2_journal_flush()
> >     // Need to unlock the journal before returning to user space.
> >     t12) jbd2_journal_unlock_updates()
> >     // Journal is unlocked and so we can start accepting new transactions now.
> > 
> > // freezing process completes execution. Page cache is now clean and should
> > // remain clean till the filesystem is frozen.
> > --------------------------------------------
> > 
> > P1: writing process gets the control back
> > t13) generic_file_buffered_write()
> >   t14) generic_perform_write()
> >     t15) a_ops->write_begin() (= ext4_write_begin)
> >       t16) ext4_journal_start()
> > 	// New handle is started. We do not block here! Write continues
> > 	// dirtying the page cache while the filesystem is frozen!
> 
> Hrm let me think through this a little more; we actually do:
> 
> t16) ext4_journal_start()
>   t17) ext4_journal_start_sb()
>     t18) handle = ext4_journal_current_handle();
>     t19) if (!handle) vfs_check_frozen()
>     t20) ... jbd2_journal_start()
  Ah, right. I forgot.

> So actually we *do* block new handles, but let *existing* ones
> continue (see commits 6b0310fbf087ad6e9e3b8392adca97cd77184084
> and be4f27d324e8ddd57cc0d4d604fe85ee0425cba9)
> 
> So your assertion that a new handle is started is incorrect
> in general, isn't it?  So then does the fix seem necessary?
> Or, at least, in the fashion below - maybe we need to just make
> sure all started handles complete before the unlock_updates?
> Or am I missing something...?
  Well, the problem with running operations and freezing is more
fundamental I believe. See my email
http://marc.info/?l=linux-fsdevel&m=132585911925796&w=2

So I believe we'll need some better exclusion mechanism already in VFS.

								Honza

> > BugLink: https://bugs.launchpad.net/bugs/897421
> > Signed-off-by: Surbhi Palande <surbhi.palande@canonical.com>
> > Acked-by: Jan Kara <jack@suse.cz>
> > Reviewed-by: Andreas Dilger <adilger.kernel@dilger.ca>
> > Cc: Kamal Mostafa <kamal@canonical.com>
> > Tested-by: Peter M. Petrakis <peter.petrakis@canonical.com>
> > Signed-off-by: Kamal Mostafa <kamal@canonical.com>
> > ---
> >  fs/jbd2/journal.c     |    1 +
> >  fs/jbd2/transaction.c |   42 ++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/jbd2.h  |    7 +++++++
> >  3 files changed, 50 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > index 0fa0123..f0170cc 100644
> > --- a/fs/jbd2/journal.c
> > +++ b/fs/jbd2/journal.c
> > @@ -894,6 +894,7 @@ static journal_t * journal_init_common (void)
> >  	init_waitqueue_head(&journal->j_wait_checkpoint);
> >  	init_waitqueue_head(&journal->j_wait_commit);
> >  	init_waitqueue_head(&journal->j_wait_updates);
> > +	init_waitqueue_head(&journal->j_wait_frozen);
> >  	mutex_init(&journal->j_barrier);
> >  	mutex_init(&journal->j_checkpoint_mutex);
> >  	spin_lock_init(&journal->j_revoke_lock);
> > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> > index a0e41a4..340ee35 100644
> > --- a/fs/jbd2/transaction.c
> > +++ b/fs/jbd2/transaction.c
> > @@ -173,6 +173,17 @@ repeat:
> >  				journal->j_barrier_count == 0);
> >  		goto repeat;
> >  	}
> > +	/* Don't let a new handle start when a journal is frozen.
> > +	 * jbd2_journal_freeze calls jbd2_journal_unlock_updates() only after
> > +	 * the j_flags indicate that the journal is frozen. So if the
> > +	 * j_barrier_count is 0, then check if this was made 0 by the freezing
> > +	 * process
> > +	 */
> > +	if (journal->j_flags & JBD2_FROZEN) {
> > +		read_unlock(&journal->j_state_lock);
> > +		wait_event(journal->j_wait_frozen, (journal->j_flags & JBD2_FROZEN));
> > +		goto repeat;
> > +	}
> >  
> >  	if (!journal->j_running_transaction) {
> >  		read_unlock(&journal->j_state_lock);
> > @@ -492,6 +503,37 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
> >  }
> >  EXPORT_SYMBOL(jbd2_journal_restart);
> >  
> > +int jbd2_journal_freeze(journal_t *journal)
> > +{
> > +	int error = 0;
> > +	/* Now we set up the journal barrier. */
> > +	jbd2_journal_lock_updates(journal);
> > +
> > +	/*
> > +	 * Don't clear the needs_recovery flag if we failed to flush
> > +	 * the journal.
> > +	 */
> > +	error = jbd2_journal_flush(journal);
> > +	if (error >= 0) {
> > +		write_lock(&journal->j_state_lock);
> > +		journal->j_flags |= JBD2_FROZEN;
> > +		write_unlock(&journal->j_state_lock);
> > +	}
> > +	jbd2_journal_unlock_updates(journal);
> > +	return error;
> > +}
> > +EXPORT_SYMBOL(jbd2_journal_freeze);
> > +
> > +void jbd2_journal_thaw(journal_t * journal)
> > +{
> > +	write_lock(&journal->j_state_lock);
> > +	journal->j_flags &= ~JBD2_FROZEN;
> > +	write_unlock(&journal->j_state_lock);
> > +	wake_up(&journal->j_wait_frozen);
> > +}
> > +EXPORT_SYMBOL(jbd2_journal_thaw);
> > +
> > +
> >  /**
> >   * void jbd2_journal_lock_updates () - establish a transaction barrier.
> >   * @journal:  Journal to establish a barrier on.
> > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> > index 2092ea2..bfa0752 100644
> > --- a/include/linux/jbd2.h
> > +++ b/include/linux/jbd2.h
> > @@ -658,6 +658,7 @@ jbd2_time_diff(unsigned long start, unsigned long end)
> >   * @j_wait_checkpoint:  Wait queue to trigger checkpointing
> >   * @j_wait_commit: Wait queue to trigger commit
> >   * @j_wait_updates: Wait queue to wait for updates to complete
> > + * @j_wait_frozen: Wait queue to wait for journal to thaw
> >   * @j_checkpoint_mutex: Mutex for locking against concurrent checkpoints
> >   * @j_head: Journal head - identifies the first unused block in the journal
> >   * @j_tail: Journal tail - identifies the oldest still-used block in the
> > @@ -775,6 +776,9 @@ struct journal_s
> >  	/* Wait queue to wait for updates to complete */
> >  	wait_queue_head_t	j_wait_updates;
> >  
> > +	/* Wait queue to wait for journal to thaw*/
> > +	wait_queue_head_t	j_wait_frozen;
> > +
> >  	/* Semaphore for locking against concurrent checkpoints */
> >  	struct mutex		j_checkpoint_mutex;
> >  
> > @@ -953,6 +957,7 @@ struct journal_s
> >  #define JBD2_ABORT_ON_SYNCDATA_ERR	0x040	/* Abort the journal on file
> >  						 * data write error in ordered
> >  						 * mode */
> > +#define JBD2_FROZEN	0x080 /* Journal thread frozen along with filesystem */
> >  
> >  /*
> >   * Function declarations for the journaling transaction and buffer
> > @@ -1060,6 +1065,8 @@ extern void	 jbd2_journal_invalidatepage(journal_t *,
> >  				struct page *, unsigned long);
> >  extern int	 jbd2_journal_try_to_free_buffers(journal_t *, struct page *, gfp_t);
> >  extern int	 jbd2_journal_stop(handle_t *);
> > +extern int	 jbd2_journal_freeze(journal_t *);
> > +extern void	 jbd2_journal_thaw(journal_t *);
> >  extern int	 jbd2_journal_flush (journal_t *);
> >  extern void	 jbd2_journal_lock_updates (journal_t *);
> >  extern void	 jbd2_journal_unlock_updates (journal_t *);
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2012-01-10 21:31 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-08 18:04 [PATCH v2 0/7] fix s_umount thaw/write and journal deadlock Kamal Mostafa
2011-12-08 18:04 ` [PATCH v2 1/7] Adding support to freeze and unfreeze a journal Kamal Mostafa
2012-01-10 20:20   ` Eric Sandeen
2012-01-10 21:31     ` Jan Kara [this message]
2012-01-10 21:55       ` Surbhi Palande
     [not found]       ` <CAMBkX3eVeKSmEzmYTe6Oe_D6kAMQTL5LYoi1-Axj7CcrM85Pow@mail.gmail.com>
2012-01-11  0:04         ` Jan Kara
2012-01-11  0:13           ` Surbhi Palande
2012-01-11  0:51             ` Jan Kara
2012-01-11  0:51               ` Jan Kara
2012-01-11  5:38             ` Surbhi Palande
2012-01-11  6:06               ` Surbhi Palande
2012-01-11 12:10               ` Jan Kara
2012-01-11 16:45                 ` Surbhi Palande
2012-01-11 18:13                   ` Jan Kara
2012-01-11  3:08       ` Eric Sandeen
2012-01-10 22:00     ` Surbhi Palande
2012-01-10 22:00       ` Surbhi Palande
2011-12-08 18:04 ` [PATCH v2 2/7] Freeze and thaw the journal on ext4 freeze Kamal Mostafa
2012-01-06  0:32   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 3/7] VFS: Fix s_umount thaw/write deadlock Kamal Mostafa
2012-01-06  1:50   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 4/7] VFS: Rename and refactor writeback_inodes_sb_if_idle Kamal Mostafa
2011-12-13  3:34   ` Miao Xie
2011-12-15  7:10     ` Miao Xie
2011-12-16 20:48     ` Kamal Mostafa
2012-01-06  0:33   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 5/7] VFS: Avoid read-write deadlock in try_to_writeback_inodes_sb Kamal Mostafa
2012-01-06  0:35   ` Jan Kara
2012-01-11 20:29     ` Kamal Mostafa
2012-01-12 15:53       ` Mikulas Patocka
2011-12-08 18:04 ` [PATCH v2 6/7] VFS: Document s_frozen state through freeze_super Kamal Mostafa
2012-01-06  0:36   ` Jan Kara
2011-12-08 18:04 ` [PATCH v2 7/7] Documentation: Correct s_umount state for freeze_fs/unfreeze_fs Kamal Mostafa
2012-01-06  0:36   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120110213104.GI4516@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=christopher.chaltain@canonical.com \
    --cc=csurbhi@gmail.com \
    --cc=kamal@canonical.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew@wil.cx \
    --cc=mpatocka@redhat.com \
    --cc=peter.petrakis@canonical.com \
    --cc=rdunlap@xenotime.net \
    --cc=sandeen@sandeen.net \
    --cc=surbhi.palande@canonical.com \
    --cc=tytso@mit.edu \
    --cc=val@vaaconsulting.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.