All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@sandeen.net>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, Dave Chinner <dchinner@redhat.com>,
	Surbhi Palande <csurbhi@gmail.com>,
	Kamal Mostafa <kamal@canonical.com>,
	Christoph Hellwig <hch@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 1/8] fs: Improve filesystem freezing handling
Date: Fri, 03 Feb 2012 21:03:20 -0600	[thread overview]
Message-ID: <4F2C9FF8.2010207@sandeen.net> (raw)
In-Reply-To: <1327091686-23177-2-git-send-email-jack@suse.cz>

On 1/20/12 2:34 PM, Jan Kara wrote:
> vfs_check_frozen() tests are racy since the filesystem can be frozen just after
> the test is performed. Thus in write paths we can end up marking some pages or
> inodes dirty even though filesystem is already frozen. This creates problems
> with flusher thread hanging on frozen filesystem.
> 
> Another problem is that exclusion between ->page_mkwrite() and filesystem
> freezing has been handled by setting page dirty and then verifying s_frozen.
> This guaranteed that either the freezing code sees the faulted page, writes it,
> and writeprotects it again or we see s_frozen set and bail out of page fault.
> This works to protect from page being marked writeable while filesystem
> freezing is running but has an unpleasant artefact of leaving dirty (although
> unmodified and writeprotected) pages on frozen filesystem resulting in similar
> problems with flusher thread as the first problem.
> 
> This patch aims at providing exclusion between write paths and filesystem
> freezing. We implement a writer-freeze read-write semaphores in the superblock
> for each freezing level (currently there are two - SB_FREEZE_WRITE for data and
> SB_FREEZE_TRANS for metadata). Write paths which should block freezing on given
> level (e.g. ->block_page_mkwrite(), ->aio_write() for SB_FREEZE_WRITE level;
> transaction lifetime for SB_FREEZE_TRANS level) hold reader side of the
> semaphore. Code freezing the filesystem to a given level takes the writer side.
> 
> Only that we don't really want to bounce cachelines of the semaphore between
> CPUs for each write happening. So we implement the reader side of the semaphore
> as a per-cpu counter and the writer side is implemented using s_frozen
> superblock field.
> 
> Acked-by: "Theodore Ts'o" <tytso@mit.edu>
> Signed-off-by: Jan Kara <jack@suse.cz>

...

> @@ -135,6 +157,11 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  #else
>  		INIT_LIST_HEAD(&s->s_files);
>  #endif
> +		if (init_sb_writers(s, SB_FREEZE_WRITE, "sb_writers_write"))
> +			goto err_out;
> +		if (init_sb_writers(s, SB_FREEZE_TRANS, "sb_writers_trans"))
> +			goto err_out;
> +
>  		s->s_bdi = &default_backing_dev_info;
>  		INIT_LIST_HEAD(&s->s_instances);
>  		INIT_HLIST_BL_HEAD(&s->s_anon);
> @@ -186,6 +213,17 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  	}
>  out:
>  	return s;
> +err_out:
> +	security_sb_free(s);
> +#ifdef CONFIG_SMP
> +	if (s->s_files)
> +		free_percpu(s->s_files);
> +#endif
> +	destroy_sb_writers(s, SB_FREEZE_WRITE);
> +	destroy_sb_writers(s, SB_FREEZE_TRANS);

You probably ran into this already but the writer percpu vars need
to be torn down in destroy_super() as well.

-Eric



WARNING: multiple messages have this Message-ID (diff)
From: Eric Sandeen <sandeen@sandeen.net>
To: Jan Kara <jack@suse.cz>
Cc: Surbhi Palande <csurbhi@gmail.com>,
	Kamal Mostafa <kamal@canonical.com>,
	LKML <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <dchinner@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 1/8] fs: Improve filesystem freezing handling
Date: Fri, 03 Feb 2012 21:03:20 -0600	[thread overview]
Message-ID: <4F2C9FF8.2010207@sandeen.net> (raw)
In-Reply-To: <1327091686-23177-2-git-send-email-jack@suse.cz>

On 1/20/12 2:34 PM, Jan Kara wrote:
> vfs_check_frozen() tests are racy since the filesystem can be frozen just after
> the test is performed. Thus in write paths we can end up marking some pages or
> inodes dirty even though filesystem is already frozen. This creates problems
> with flusher thread hanging on frozen filesystem.
> 
> Another problem is that exclusion between ->page_mkwrite() and filesystem
> freezing has been handled by setting page dirty and then verifying s_frozen.
> This guaranteed that either the freezing code sees the faulted page, writes it,
> and writeprotects it again or we see s_frozen set and bail out of page fault.
> This works to protect from page being marked writeable while filesystem
> freezing is running but has an unpleasant artefact of leaving dirty (although
> unmodified and writeprotected) pages on frozen filesystem resulting in similar
> problems with flusher thread as the first problem.
> 
> This patch aims at providing exclusion between write paths and filesystem
> freezing. We implement a writer-freeze read-write semaphores in the superblock
> for each freezing level (currently there are two - SB_FREEZE_WRITE for data and
> SB_FREEZE_TRANS for metadata). Write paths which should block freezing on given
> level (e.g. ->block_page_mkwrite(), ->aio_write() for SB_FREEZE_WRITE level;
> transaction lifetime for SB_FREEZE_TRANS level) hold reader side of the
> semaphore. Code freezing the filesystem to a given level takes the writer side.
> 
> Only that we don't really want to bounce cachelines of the semaphore between
> CPUs for each write happening. So we implement the reader side of the semaphore
> as a per-cpu counter and the writer side is implemented using s_frozen
> superblock field.
> 
> Acked-by: "Theodore Ts'o" <tytso@mit.edu>
> Signed-off-by: Jan Kara <jack@suse.cz>

...

> @@ -135,6 +157,11 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  #else
>  		INIT_LIST_HEAD(&s->s_files);
>  #endif
> +		if (init_sb_writers(s, SB_FREEZE_WRITE, "sb_writers_write"))
> +			goto err_out;
> +		if (init_sb_writers(s, SB_FREEZE_TRANS, "sb_writers_trans"))
> +			goto err_out;
> +
>  		s->s_bdi = &default_backing_dev_info;
>  		INIT_LIST_HEAD(&s->s_instances);
>  		INIT_HLIST_BL_HEAD(&s->s_anon);
> @@ -186,6 +213,17 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  	}
>  out:
>  	return s;
> +err_out:
> +	security_sb_free(s);
> +#ifdef CONFIG_SMP
> +	if (s->s_files)
> +		free_percpu(s->s_files);
> +#endif
> +	destroy_sb_writers(s, SB_FREEZE_WRITE);
> +	destroy_sb_writers(s, SB_FREEZE_TRANS);

You probably ran into this already but the writer percpu vars need
to be torn down in destroy_super() as well.

-Eric


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2012-02-04  3:03 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-20 20:34 [PATCH 0/8] Fix filesystem freezing Jan Kara
2012-01-20 20:34 ` Jan Kara
2012-01-20 20:34 ` [PATCH 1/8] fs: Improve filesystem freezing handling Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-02-04  3:03   ` Eric Sandeen [this message]
2012-02-04  3:03     ` Eric Sandeen
2012-02-06 15:17     ` Jan Kara
2012-02-06 15:17       ` Jan Kara
2012-01-20 20:34 ` [PATCH 2/8] vfs: Protect write paths by sb_start_write - sb_end_write Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-24  8:21   ` Dave Chinner
2012-01-24  8:21     ` Dave Chinner
2012-01-24 11:44     ` Jan Kara
2012-01-24 11:44       ` Jan Kara
2012-02-05  6:13   ` Eric Sandeen
2012-02-05  6:13     ` Eric Sandeen
2012-02-06 15:33     ` Jan Kara
2012-02-06 15:33       ` Jan Kara
2012-01-20 20:34 ` [PATCH 3/8] ext4: Protect ext4_page_mkwrite & ext4_setattr with " Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-20 20:34 ` [PATCH 4/8] xfs: Move ilock before transaction start in xfs_setattr_size() Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-24  6:59   ` Dave Chinner
2012-01-24  6:59     ` Dave Chinner
2012-01-24 11:52     ` Jan Kara
2012-01-24 11:52       ` Jan Kara
2012-01-20 20:34 ` [PATCH 5/8] xfs: Protect xfs_file_aio_write() & xfs_setattr_size() with sb_start_write - sb_end_write Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-24  7:19   ` Dave Chinner
2012-01-24  7:19     ` Dave Chinner
2012-01-24 19:35     ` Jan Kara
2012-01-24 19:35       ` Jan Kara
2012-02-04  4:30   ` Eric Sandeen
2012-02-04  4:30     ` Eric Sandeen
2012-02-04  4:50     ` Eric Sandeen
2012-02-04  4:50       ` Eric Sandeen
2012-02-05 23:11     ` Dave Chinner
2012-02-05 23:11       ` Dave Chinner
2012-01-20 20:34 ` [PATCH 6/8] xfs: Use generic writers counter instead of m_active_trans counter Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-24  8:05   ` Dave Chinner
2012-01-24  8:05     ` Dave Chinner
2012-02-04  2:13     ` Eric Sandeen
2012-02-04  2:13       ` Eric Sandeen
2012-02-04  2:42   ` Eric Sandeen
2012-02-04  2:42     ` Eric Sandeen
2012-02-04  4:34   ` Eric Sandeen
2012-02-04  4:34     ` Eric Sandeen
2012-01-20 20:34 ` [PATCH 7/8] Documentation: Correct s_umount state for freeze_fs/unfreeze_fs Jan Kara
2012-01-20 20:34   ` Jan Kara
2012-01-20 20:34 ` [PATCH 8/8] vfs: Document s_frozen state through freeze_super Jan Kara
2012-01-20 20:34   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F2C9FF8.2010207@sandeen.net \
    --to=sandeen@sandeen.net \
    --cc=csurbhi@gmail.com \
    --cc=dchinner@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=kamal@canonical.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.