* [RFC PATCH] filemap: Convert generic_perform_write() to support large folios
From: Matthew Wilcox (Oracle) @ 2023-08-22 20:09 UTC
To: linux-fsdevel; +Cc: Matthew Wilcox (Oracle)
Modelled after the loop in iomap_write_iter(), copy larger chunks from
userspace if the filesystem has created large folios.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
This patch depends on patches currently in the iomap tree. Sending it
out now for feedback, but I'll resend it after rc1.
mm/filemap.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index bf6219d9aaac..fd28767c760a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3908,6 +3908,7 @@ EXPORT_SYMBOL(generic_file_direct_write);
ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
{
struct file *file = iocb->ki_filp;
+ size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
loff_t pos = iocb->ki_pos;
struct address_space *mapping = file->f_mapping;
const struct address_space_operations *a_ops = mapping->a_ops;
@@ -3916,16 +3917,16 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
do {
struct page *page;
- unsigned long offset; /* Offset into pagecache page */
- unsigned long bytes; /* Bytes to write to page */
+ struct folio *folio;
+ size_t offset; /* Offset into folio */
+ size_t bytes; /* Bytes to write to folio */
size_t copied; /* Bytes copied from user */
void *fsdata = NULL;
- offset = (pos & (PAGE_SIZE - 1));
- bytes = min_t(unsigned long, PAGE_SIZE - offset,
- iov_iter_count(i));
+ offset = pos & (chunk - 1);
+ bytes = min(chunk - offset, iov_iter_count(i));
+ balance_dirty_pages_ratelimited(mapping);
-again:
/*
* Bring in the user page that we will copy from _first_.
* Otherwise there's a nasty deadlock on copying from the
@@ -3947,11 +3948,16 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
if (unlikely(status < 0))
break;
+ folio = page_folio(page);
+ offset = offset_in_folio(folio, pos);
+ if (bytes > folio_size(folio) - offset)
+ bytes = folio_size(folio) - offset;
+
if (mapping_writably_mapped(mapping))
- flush_dcache_page(page);
+ flush_dcache_folio(folio);
- copied = copy_page_from_iter_atomic(page, offset, bytes, i);
- flush_dcache_page(page);
+ copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
+ flush_dcache_folio(folio);
status = a_ops->write_end(file, mapping, pos, bytes, copied,
page, fsdata);
@@ -3971,12 +3977,12 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
*/
if (copied)
bytes = copied;
- goto again;
+ if (chunk > PAGE_SIZE)
+ chunk /= 2;
+ } else {
+ pos += status;
+ written += status;
}
- pos += status;
- written += status;
-
- balance_dirty_pages_ratelimited(mapping);
} while (iov_iter_count(i));
if (!written)
--
2.40.1
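The control flow the diff introduces, pulled out of its kernel context:
start at the largest chunk the page cache could hand back, clamp to the
folio actually returned, and halve the chunk whenever a copy makes no
progress. Below is a minimal userspace analogue of that retry strategy;
PAGE_SIZE, MAX_PAGECACHE_ORDER and copy_chunk() are stand-ins for the
sketch, not the kernel definitions.

/*
 * Userspace analogue of the retry strategy in the patch above: start
 * with the largest chunk, halve it whenever a copy makes no progress,
 * bottoming out at a single page.
 */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE		4096UL
#define MAX_PAGECACHE_ORDER	8	/* assumed for the sketch */

/* Stand-in for copy_folio_from_iter_atomic(): pretend the first
 * large copy faults and makes no progress, forcing a retry. */
static size_t copy_chunk(char *dst, const char *src, size_t bytes)
{
	static int faulted;

	if (!faulted && bytes > PAGE_SIZE) {
		faulted = 1;
		return 0;
	}
	memcpy(dst, src, bytes);
	return bytes;
}

int main(void)
{
	static char src[5 * PAGE_SIZE], dst[5 * PAGE_SIZE];
	size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
	size_t pos = 0, remaining = sizeof(src);

	while (remaining) {
		size_t offset = pos & (chunk - 1);
		size_t bytes = remaining < chunk - offset ?
				remaining : chunk - offset;
		size_t copied = copy_chunk(dst + pos, src + pos, bytes);

		if (!copied) {
			/* Mirrors the patch: shrink and go around again. */
			if (chunk > PAGE_SIZE)
				chunk /= 2;
			continue;
		}
		pos += copied;
		remaining -= copied;
	}
	printf("copied %zu bytes, final chunk %zu\n", pos, chunk);
	return 0;
}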
* Re: [RFC PATCH] filemap: Convert generic_perform_write() to support large folios
From: Darrick J. Wong @ 2023-08-22 21:17 UTC
To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel
On Tue, Aug 22, 2023 at 09:09:37PM +0100, Matthew Wilcox (Oracle) wrote:
> Modelled after the loop in iomap_write_iter(), copy larger chunks from
> userspace if the filesystem has created large folios.
Hum. Which filesystems are those? Is this for the in-memory ones like
tmpfs?
--D
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
> This patch depends on patches currently in the iomap tree. Sending it
> out now for feedback, but I'll resend it after rc1.
>
> mm/filemap.c | 34 ++++++++++++++++++++--------------
> 1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index bf6219d9aaac..fd28767c760a 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3908,6 +3908,7 @@ EXPORT_SYMBOL(generic_file_direct_write);
> ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
> {
> struct file *file = iocb->ki_filp;
> + size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
> loff_t pos = iocb->ki_pos;
> struct address_space *mapping = file->f_mapping;
> const struct address_space_operations *a_ops = mapping->a_ops;
> @@ -3916,16 +3917,16 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
>
> do {
> struct page *page;
> - unsigned long offset; /* Offset into pagecache page */
> - unsigned long bytes; /* Bytes to write to page */
> + struct folio *folio;
> + size_t offset; /* Offset into folio */
> + size_t bytes; /* Bytes to write to folio */
> size_t copied; /* Bytes copied from user */
> void *fsdata = NULL;
>
> - offset = (pos & (PAGE_SIZE - 1));
> - bytes = min_t(unsigned long, PAGE_SIZE - offset,
> - iov_iter_count(i));
> + offset = pos & (chunk - 1);
> + bytes = min(chunk - offset, iov_iter_count(i));
> + balance_dirty_pages_ratelimited(mapping);
>
> -again:
> /*
> * Bring in the user page that we will copy from _first_.
> * Otherwise there's a nasty deadlock on copying from the
> @@ -3947,11 +3948,16 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
> if (unlikely(status < 0))
> break;
>
> + folio = page_folio(page);
> + offset = offset_in_folio(folio, pos);
> + if (bytes > folio_size(folio) - offset)
> + bytes = folio_size(folio) - offset;
> +
> if (mapping_writably_mapped(mapping))
> - flush_dcache_page(page);
> + flush_dcache_folio(folio);
>
> - copied = copy_page_from_iter_atomic(page, offset, bytes, i);
> - flush_dcache_page(page);
> + copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
> + flush_dcache_folio(folio);
>
> status = a_ops->write_end(file, mapping, pos, bytes, copied,
> page, fsdata);
> @@ -3971,12 +3977,12 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
> */
> if (copied)
> bytes = copied;
> - goto again;
> + if (chunk > PAGE_SIZE)
> + chunk /= 2;
> + } else {
> + pos += status;
> + written += status;
> }
> - pos += status;
> - written += status;
> -
> - balance_dirty_pages_ratelimited(mapping);
> } while (iov_iter_count(i));
>
> if (!written)
> --
> 2.40.1
>
* Re: [RFC PATCH] filemap: Convert generic_perform_write() to support large folios
From: Matthew Wilcox @ 2023-08-22 21:56 UTC
To: Darrick J. Wong; +Cc: linux-fsdevel
On Tue, Aug 22, 2023 at 02:17:20PM -0700, Darrick J. Wong wrote:
> On Tue, Aug 22, 2023 at 09:09:37PM +0100, Matthew Wilcox (Oracle) wrote:
> > Modelled after the loop in iomap_write_iter(), copy larger chunks from
> > userspace if the filesystem has created large folios.
>
> Hum. Which filesystems are those? Is this for the in-memory ones like
> tmpfs?
Alas, tmpfs uses its own shmem_file_write_iter() and doesn't call back
into generic_perform_write(). But I was looking at the ramfs aops and
thinking those looked ripe for large folio support, so I thought I'd take
care of this part first since it potentially affects every filesystem
that uses generic_file_write_iter() / __generic_file_write_iter() /
generic_perform_write().
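The opt-in itself is tiny; the audit is about whether the aops then
cope with folios bigger than a page. As a hypothetical, untested
sketch (mapping_set_large_folios() is the real pagemap API; hanging it
off ramfs_get_inode() is an assumption):

#include <linux/fs.h>
#include <linux/pagemap.h>

/*
 * Hypothetical sketch, not a patch: called from a filesystem's inode
 * initialisation (for ramfs, that would be ramfs_get_inode()).  The
 * flag tells the page cache it may allocate multi-page folios for
 * this mapping, which is what lets generic_perform_write() above see
 * folio_size() > PAGE_SIZE.
 */
static void example_enable_large_folios(struct inode *inode)
{
	mapping_set_large_folios(inode->i_mapping);
}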
This is also a great opportunity for someone to tell me "Actually I have
plans in this area and ..."