From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shuge <shugelinux@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-ext4@vger.kernel.org, Kevin <kevin@allwinnertech.com>,
	Theodore Ts'o <tytso@mit.edu>, Jens Axboe <axboe@kernel.dk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v3] mm: Make snapshotting pages for stable writes a per-bio operation
Date: Tue, 19 Mar 2013 09:54:48 +0100
Message-ID: <20130319085448.GA5222@quack.suse.cz>
In-Reply-To: <20130318230259.GP5313@blackbox.djwong.org>

On Mon 18-03-13 16:02:59, Darrick J. Wong wrote:
> Walking a bio's page mappings has proved problematic, so create a new bio flag
> to indicate that a bio's data needs to be snapshotted in order to guarantee
> stable pages during writeback.  Next, for the one user (ext3/jbd) of
> snapshotting, hook all the places where writes can be initiated without
> PG_writeback set, and set BIO_SNAP_STABLE there.  We must also flag journal
> "metadata" bios for stable writeout, since file data can be written through the
> journal.  Finally, the MS_SNAP_STABLE mount flag (only used by ext3) is now
> superfluous, so get rid of it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> [darrick.wong@oracle.com: Fold in a couple of small cleanups from akpm]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  OK, now I'm happy with the patch :) You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/buffer.c                 |    9 ++++++++-
>  fs/ext3/super.c             |    1 -
>  fs/jbd/commit.c             |   25 ++++++++++++++++++++++---
>  include/linux/blk_types.h   |    3 ++-
>  include/linux/buffer_head.h |    1 +
>  include/uapi/linux/fs.h     |    1 -
>  mm/bounce.c                 |   21 +--------------------
>  mm/page-writeback.c         |    4 ----
>  8 files changed, 34 insertions(+), 31 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index b4dcb34..71578d6 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2949,7 +2949,7 @@ static void guard_bh_eod(int rw, struct bio *bio, struct buffer_head *bh)
>  	}
>  }
>  
> -int submit_bh(int rw, struct buffer_head * bh)
> +int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags)
>  {
>  	struct bio *bio;
>  	int ret = 0;
> @@ -2984,6 +2984,7 @@ int submit_bh(int rw, struct buffer_head * bh)
>  
>  	bio->bi_end_io = end_bio_bh_io_sync;
>  	bio->bi_private = bh;
> +	bio->bi_flags |= bio_flags;
>  
>  	/* Take care of bh's that straddle the end of the device */
>  	guard_bh_eod(rw, bio, bh);
> @@ -2997,6 +2998,12 @@ int submit_bh(int rw, struct buffer_head * bh)
>  	bio_put(bio);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(_submit_bh);
> +
> +int submit_bh(int rw, struct buffer_head *bh)
> +{
> +	return _submit_bh(rw, bh, 0);
> +}
>  EXPORT_SYMBOL(submit_bh);
>  
>  /**
> diff --git a/fs/ext3/super.c b/fs/ext3/super.c
> index fb5120a..3dc48cc 100644
> --- a/fs/ext3/super.c
> +++ b/fs/ext3/super.c
> @@ -2067,7 +2067,6 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
>  		test_opt(sb,DATA_FLAGS) == EXT3_MOUNT_JOURNAL_DATA ? "journal":
>  		test_opt(sb,DATA_FLAGS) == EXT3_MOUNT_ORDERED_DATA ? "ordered":
>  		"writeback");
> -	sb->s_flags |= MS_SNAP_STABLE;
>  
>  	return 0;
>  
> diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
> index 86b39b1..11bb11f 100644
> --- a/fs/jbd/commit.c
> +++ b/fs/jbd/commit.c
> @@ -162,8 +162,17 @@ static void journal_do_submit_data(struct buffer_head **wbuf, int bufs,
>  
>  	for (i = 0; i < bufs; i++) {
>  		wbuf[i]->b_end_io = end_buffer_write_sync;
> -		/* We use-up our safety reference in submit_bh() */
> -		submit_bh(write_op, wbuf[i]);
> +		/*
> +		 * Here we write back pagecache data that may be mmapped. Since
> +		 * we cannot afford to clean the page and set PageWriteback
> +		 * here due to lock ordering (page lock ranks above transaction
> +		 * start), the data can change while IO is in flight. Tell the
> +		 * block layer it should bounce the bio pages if stable data
> +		 * during write is required.
> +		 *
> +		 * We use up our safety reference in submit_bh().
> +		 */
> +		_submit_bh(write_op, wbuf[i], 1 << BIO_SNAP_STABLE);
>  	}
>  }
>  
> @@ -667,7 +676,17 @@ start_journal_io:
>  				clear_buffer_dirty(bh);
>  				set_buffer_uptodate(bh);
>  				bh->b_end_io = journal_end_buffer_io_sync;
> -				submit_bh(write_op, bh);
> +				/*
> +				 * In data=journal mode, here we can end up
> +				 * writing pagecache data that might be
> +				 * mmapped. Since we can't afford to clean the
> +				 * page and set PageWriteback (see the comment
> +				 * near the other use of _submit_bh()), the
> +				 * data can change while the write is in
> +				 * flight.  Tell the block layer to bounce the
> +				 * bio pages if stable pages are required.
> +				 */
> +				_submit_bh(write_op, bh, 1 << BIO_SNAP_STABLE);
>  			}
>  			cond_resched();
>  
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index cdf1119..22990cf 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -111,12 +111,13 @@ struct bio {
>  #define BIO_FS_INTEGRITY 9	/* fs owns integrity data, not block layer */
>  #define BIO_QUIET	10	/* Make BIO Quiet */
>  #define BIO_MAPPED_INTEGRITY 11/* integrity metadata has been remapped */
> +#define BIO_SNAP_STABLE	12	/* bio data must be snapshotted during write */
>  
>  /*
>   * Flags starting here get preserved by bio_reset() - this includes
>   * BIO_POOL_IDX()
>   */
> -#define BIO_RESET_BITS	12
> +#define BIO_RESET_BITS	13
>  
>  #define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
>  
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index 5afc4f9..4c16c4a 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -181,6 +181,7 @@ void ll_rw_block(int, int, struct buffer_head * bh[]);
>  int sync_dirty_buffer(struct buffer_head *bh);
>  int __sync_dirty_buffer(struct buffer_head *bh, int rw);
>  void write_dirty_buffer(struct buffer_head *bh, int rw);
> +int _submit_bh(int rw, struct buffer_head *bh, unsigned long bio_flags);
>  int submit_bh(int, struct buffer_head *);
>  void write_boundary_block(struct block_device *bdev,
>  			sector_t bblock, unsigned blocksize);
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index c7fc1e6..a4ed56c 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -88,7 +88,6 @@ struct inodes_stat_t {
>  #define MS_STRICTATIME	(1<<24) /* Always perform atime updates */
>  
>  /* These sb flags are internal to the kernel */
> -#define MS_SNAP_STABLE	(1<<27) /* Snapshot pages during writeback, if needed */
>  #define MS_NOSEC	(1<<28)
>  #define MS_BORN		(1<<29)
>  #define MS_ACTIVE	(1<<30)
> diff --git a/mm/bounce.c b/mm/bounce.c
> index 5f89017..a5c2ec3 100644
> --- a/mm/bounce.c
> +++ b/mm/bounce.c
> @@ -181,32 +181,13 @@ static void bounce_end_io_read_isa(struct bio *bio, int err)
>  #ifdef CONFIG_NEED_BOUNCE_POOL
>  static int must_snapshot_stable_pages(struct request_queue *q, struct bio *bio)
>  {
> -	struct page *page;
> -	struct backing_dev_info *bdi;
> -	struct address_space *mapping;
> -	struct bio_vec *from;
> -	int i;
> -
>  	if (bio_data_dir(bio) != WRITE)
>  		return 0;
>  
>  	if (!bdi_cap_stable_pages_required(&q->backing_dev_info))
>  		return 0;
>  
> -	/*
> -	 * Based on the first page that has a valid mapping, decide whether or
> -	 * not we have to employ bounce buffering to guarantee stable pages.
> -	 */
> -	bio_for_each_segment(from, bio, i) {
> -		page = from->bv_page;
> -		mapping = page_mapping(page);
> -		if (!mapping)
> -			continue;
> -		bdi = mapping->backing_dev_info;
> -		return mapping->host->i_sb->s_flags & MS_SNAP_STABLE;
> -	}
> -
> -	return 0;
> +	return test_bit(BIO_SNAP_STABLE, &bio->bi_flags);
>  }
>  #else
>  static int must_snapshot_stable_pages(struct request_queue *q, struct bio *bio)
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index efe6814..4514ad7 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2311,10 +2311,6 @@ void wait_for_stable_page(struct page *page)
>  
>  	if (!bdi_cap_stable_pages_required(bdi))
>  		return;
> -#ifdef CONFIG_NEED_BOUNCE_POOL
> -	if (mapping->host->i_sb->s_flags & MS_SNAP_STABLE)
> -		return;
> -#endif /* CONFIG_NEED_BOUNCE_POOL */
>  
>  	wait_on_page_writeback(page);
>  }
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
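
The mechanism described in the quoted changelog can be sketched in userspace
C. This is only a toy model under stated assumptions, not the kernel code:
struct toy_bio, toy_submit(), toy_submit_flags(), toy_must_snapshot() and
PAGE_BYTES are hypothetical names invented for illustration. Only
BIO_SNAP_STABLE and its bit number 12 come from the patch. The sketch shows
the caller passing a per-bio flag at submission time, and the "block layer"
copying the data into a private bounce buffer only when that flag is set and
the device requires stable pages.

```c
#include <assert.h>
#include <string.h>

/* Bit number taken from the quoted blk_types.h hunk. */
#define BIO_SNAP_STABLE 12

/* Toy page size; an assumption made only for this sketch. */
#define PAGE_BYTES 16

/* Hypothetical stand-in for struct bio: flags plus the data to be written. */
struct toy_bio {
	unsigned long flags;
	char *data;              /* what the "device" will read */
	char bounce[PAGE_BYTES]; /* private snapshot of the page */
};

/*
 * Modelled on must_snapshot_stable_pages() after the patch: only writes
 * to devices that need stable pages are candidates, and the decision is
 * then made from the bio's own flag, not from the page mappings.
 */
int toy_must_snapshot(int is_write, int dev_needs_stable,
		      const struct toy_bio *bio)
{
	if (!is_write)
		return 0;
	if (!dev_needs_stable)
		return 0;
	return (int)((bio->flags >> BIO_SNAP_STABLE) & 1UL);
}

/*
 * Modelled on _submit_bh(): OR the caller's flags into the bio, then
 * snapshot (bounce) the page if the check above says so.
 */
void toy_submit_flags(struct toy_bio *bio, char *page,
		      unsigned long bio_flags, int dev_needs_stable)
{
	bio->flags = 0;
	bio->flags |= bio_flags;
	bio->data = page;
	if (toy_must_snapshot(1, dev_needs_stable, bio)) {
		memcpy(bio->bounce, page, PAGE_BYTES);
		bio->data = bio->bounce;
	}
}

/* Modelled on submit_bh(): the existing entry point passes no extra flags. */
void toy_submit(struct toy_bio *bio, char *page, int dev_needs_stable)
{
	toy_submit_flags(bio, page, 0, dev_needs_stable);
}
```

In this model, a caller that cannot set PageWriteback (as jbd cannot, per
the quoted comments) would use toy_submit_flags(&bio, page,
1UL << BIO_SNAP_STABLE, 1). A later store to the page then no longer affects
the bytes the device sees, whereas a bio submitted through toy_submit() keeps
pointing at the live page.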

