From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Hans Holmberg <hans.holmberg@wdc.com>
Cc: damien.lemoal@wdc.com, linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [RFC PATCH] f2fs: preserve direct write semantics when buffering is forced
Date: Thu, 23 Mar 2023 15:37:20 -0700
Message-ID: <ZBzUoJ9sydeS4TpI@google.com>
In-Reply-To: <20230220122004.26555-1-hans.holmberg@wdc.com>

On 02/20, Hans Holmberg wrote:
> In some cases, e.g. for zoned block devices, direct writes are
> forced into buffered writes that will populate the page cache
> and be written out just like buffered io.
> 
> Direct reads, on the other hand, are supported for the zoned
> block device case. This has the effect that applications
> built for direct io will fill up the page cache with data
> that will never be read, and that is a waste of resources.
> 
> If we agree that this is a problem, how do we fix it?
> 
> A) Supporting proper direct writes for zoned block devices would
> be the best, but it is currently not supported (probably for
> a good but non-obvious reason). Would it be feasible to
> implement proper direct IO?
> 
> B) Avoid the cost of keeping unwanted data by syncing and throwing
> out the cached pages for buffered O_DIRECT writes before completion.
> 
> This patch implements B) by reusing the code for how partial
> block writes are flushed out on the "normal" direct write path.
> 
> Note that this changes the performance characteristics of f2fs
> quite a bit.
> 
> Direct IO performance for zoned block devices is lower for
> small writes after this patch, but this should be expected
> with direct IO and in line with how f2fs behaves on top of
> conventional block devices.
> 
> Another open question is if the flushing should be done for
> all cases where buffered writes are forced.
> 
> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
> ---
>  fs/f2fs/file.c | 38 ++++++++++++++++++++++++++++++--------
>  1 file changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index ecbc8c135b49..4e57c37bce35 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -4513,6 +4513,19 @@ static const struct iomap_dio_ops f2fs_iomap_dio_write_ops = {
>  	.end_io = f2fs_dio_write_end_io,
>  };
>  
> +static void f2fs_flush_buffered_write(struct address_space *mapping,
> +				      loff_t start_pos, loff_t end_pos)
> +{
> +	int ret;
> +
> +	ret = filemap_write_and_wait_range(mapping, start_pos, end_pos);
> +	if (ret < 0)
> +		return;
> +	invalidate_mapping_pages(mapping,
> +				 start_pos >> PAGE_SHIFT,
> +				 end_pos >> PAGE_SHIFT);
> +}
> +
>  static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
>  				   bool *may_need_sync)
>  {
> @@ -4612,14 +4625,9 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
>  
>  			ret += ret2;
>  
> -			ret2 = filemap_write_and_wait_range(file->f_mapping,
> -							    bufio_start_pos,
> -							    bufio_end_pos);
> -			if (ret2 < 0)
> -				goto out;
> -			invalidate_mapping_pages(file->f_mapping,
> -						 bufio_start_pos >> PAGE_SHIFT,
> -						 bufio_end_pos >> PAGE_SHIFT);
> +			f2fs_flush_buffered_write(file->f_mapping,
> +						  bufio_start_pos,
> +						  bufio_end_pos);
>  		}
>  	} else {
>  		/* iomap_dio_rw() already handled the generic_write_sync(). */
> @@ -4717,8 +4725,22 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	inode_unlock(inode);
>  out:
>  	trace_f2fs_file_write_iter(inode, orig_pos, orig_count, ret);
> +
>  	if (ret > 0 && may_need_sync)
>  		ret = generic_write_sync(iocb, ret);
> +
> +	/* If buffered IO was forced, flush and drop the data from
> +	 * the page cache to preserve O_DIRECT semantics
> +	 */
> +	if (ret > 0 && !dio && (iocb->ki_flags & IOCB_DIRECT)) {
> +		struct file *file = iocb->ki_filp;
> +		loff_t end_pos = orig_pos + ret - 1;
> +
> +		f2fs_flush_buffered_write(file->f_mapping,
> +					  orig_pos,
> +					  end_pos);

I applied a minor change:

        /* If buffered IO was forced, flush and drop the data from
         * the page cache to preserve O_DIRECT semantics
         */
-       if (ret > 0 && !dio && (iocb->ki_flags & IOCB_DIRECT)) {
-               struct file *file = iocb->ki_filp;
-               loff_t end_pos = orig_pos + ret - 1;
-
-               f2fs_flush_buffered_write(file->f_mapping,
+       if (ret > 0 && !dio && (iocb->ki_flags & IOCB_DIRECT))
+               f2fs_flush_buffered_write(iocb->ki_filp->f_mapping,
                                          orig_pos,
-                                         end_pos);
-       }
+                                         orig_pos + ret - 1);

        return ret;
 }
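
For readers following the range arithmetic: both filemap_write_and_wait_range()
and invalidate_mapping_pages() take an inclusive end, which is why the last
byte is computed as orig_pos + ret - 1 before being shifted down to a page
index. The following is only a userspace sketch of that arithmetic, not kernel
code. DEMO_PAGE_SHIFT (4 KiB pages), struct page_range and
written_page_range() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. The kernel uses PAGE_SHIFT. */
#define DEMO_PAGE_SHIFT 12

/* Hypothetical helper type: inclusive range of page indices. */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Map a completed write of 'written' bytes at byte offset 'pos' to the
 * inclusive page index range it touched. Mirrors the
 * "orig_pos + ret - 1" computation in the patch.
 */
static struct page_range written_page_range(int64_t pos, int64_t written)
{
	struct page_range r;
	int64_t end_pos;

	assert(pos >= 0 && written > 0);

	end_pos = pos + written - 1;	/* last byte written, inclusive */
	r.first = (uint64_t)pos >> DEMO_PAGE_SHIFT;
	r.last = (uint64_t)end_pos >> DEMO_PAGE_SHIFT;
	return r;
}
```

With these assumptions, an 8192-byte write at offset 4096 covers page indices
1 through 2. Without the "- 1", the range would spill into page 3, which the
write never touched.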


> +	}
> +
>  	return ret;
>  }
>  
> -- 
> 2.25.1
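
As a rough illustration of option B) from an application's point of view, the
same "write back, then drop" sequence can be approximated in userspace. This
is a sketch under stated assumptions, not what the patch does internally:
fdatasync() flushes the whole file rather than a byte range, and
posix_fadvise(POSIX_FADV_DONTNEED) is only a hint, much like
invalidate_mapping_pages() only drops pages that are clean and unused.
flush_and_drop_range() and demo_on_temp_file() are hypothetical names, and the
/tmp path is an assumption about the test environment.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write back dirty data for 'fd', then advise the
 * kernel to drop cached pages covering [offset, offset + len).
 * Returns 0 on success or a positive errno value.
 */
static int flush_and_drop_range(int fd, off_t offset, off_t len)
{
	/* len == 0 would mean "to end of file" for posix_fadvise(). */
	assert(offset >= 0 && len > 0);

	if (fdatasync(fd) != 0)
		return errno;

	/* posix_fadvise() returns the error number directly, not via errno. */
	return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}

/* Exercise the helper on a scratch file; returns 0 on success. */
static int demo_on_temp_file(void)
{
	char path[] = "/tmp/flush_drop_XXXXXX";
	char buf[8192];
	int fd, ret;

	fd = mkstemp(path);
	if (fd < 0)
		return errno;

	memset(buf, 'x', sizeof(buf));
	if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
		ret = EIO;
	else
		ret = flush_and_drop_range(fd, 0, (off_t)sizeof(buf));

	close(fd);
	unlink(path);
	return ret;
}
```

An application that cannot rely on the filesystem to do this could call such a
helper after each buffered write, at the cost of the same small-write
performance penalty the commit message describes.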


