linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
To: david@fromorbit.com, willy@infradead.org,
	chandan.babu@oracle.com, djwong@kernel.org, brauner@kernel.org,
	akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, yang@os.amperecomputing.com,
	linux-mm@kvack.org, john.g.garry@oracle.com,
	linux-fsdevel@vger.kernel.org, hare@suse.de,
	p.raghav@samsung.com, mcgrof@kernel.org, gost.dev@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org,
	kernel@pankajraghav.com, hch@lst.de, Zi Yan <zi.yan@sent.com>
Subject: [PATCH v8 02/10] filemap: allocate mapping_min_order folios in the page cache
Date: Tue, 25 Jun 2024 11:44:12 +0000	[thread overview]
Message-ID: <20240625114420.719014-3-kernel@pankajraghav.com> (raw)
In-Reply-To: <20240625114420.719014-1-kernel@pankajraghav.com>

From: Pankaj Raghav <p.raghav@samsung.com>

filemap_create_folio() and do_read_cache_folio() were always allocating
folio of order 0. __filemap_get_folio was trying to allocate higher
order folios when fgp_flags had higher order hint set but it will default
to order 0 folio if higher order memory allocation fails.

Supporting mapping_min_order implies that we guarantee each folio in the
page cache has at least an order of mapping_min_order. When adding new
folios to the page cache we must also ensure the index used is aligned to
the mapping_min_order as the page cache requires the index to be aligned
to the order of the folio.

Co-developed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 include/linux/pagemap.h | 23 ++++++++++++++++++++++-
 mm/filemap.c            | 24 ++++++++++++++++--------
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 0c51154cdb57..7f1355abd8a2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -439,6 +439,27 @@ unsigned int mapping_min_folio_order(const struct address_space *mapping)
 	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }
 
+static inline
+unsigned long mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1UL << mapping_min_folio_order(mapping);
+}
+
+/**
+ * mapping_align_index() - Align index based on the min
+ * folio order of the page cache.
+ * @mapping: The address_space.
+ *
+ * The index of a folio must be naturally aligned.  If you are adding a
+ * new folio to the page cache and need to know what index to give it,
+ * call this function.
+ */
+static inline pgoff_t mapping_align_index(struct address_space *mapping,
+						pgoff_t index)
+{
+	return round_down(index, mapping_min_folio_nrpages(mapping));
+}
+
 /*
  * Large folio support currently depends on THP.  These dependencies are
  * being worked on but are not yet fixed.
@@ -1165,7 +1186,7 @@ static inline vm_fault_t folio_lock_or_retry(struct folio *folio,
 void folio_wait_bit(struct folio *folio, int bit_nr);
 int folio_wait_bit_killable(struct folio *folio, int bit_nr);
 
-/* 
+/*
  * Wait for a folio to be unlocked.
  *
  * This must be called with the caller "holding" the folio,
diff --git a/mm/filemap.c b/mm/filemap.c
index d617c9afca51..8eafbd4a4d0c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -859,6 +859,8 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
+	VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping),
+			folio);
 	mapping_set_update(&xas, mapping);
 
 	VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
@@ -1919,8 +1921,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		folio_wait_stable(folio);
 no_page:
 	if (!folio && (fgp_flags & FGP_CREAT)) {
-		unsigned order = FGF_GET_ORDER(fgp_flags);
+		unsigned int min_order = mapping_min_folio_order(mapping);
+		unsigned int order = max(min_order, FGF_GET_ORDER(fgp_flags));
 		int err;
+		index = mapping_align_index(mapping, index);
 
 		if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping))
 			gfp |= __GFP_WRITE;
@@ -1943,7 +1947,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order > 0)
+			if (order > min_order)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
 			folio = filemap_alloc_folio(alloc_gfp, order);
 			if (!folio)
@@ -1958,7 +1962,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 				break;
 			folio_put(folio);
 			folio = NULL;
-		} while (order-- > 0);
+		} while (order-- > min_order);
 
 		if (err == -EEXIST)
 			goto repeat;
@@ -2451,13 +2455,15 @@ static int filemap_update_page(struct kiocb *iocb,
 }
 
 static int filemap_create_folio(struct file *file,
-		struct address_space *mapping, pgoff_t index,
+		struct address_space *mapping, loff_t pos,
 		struct folio_batch *fbatch)
 {
 	struct folio *folio;
 	int error;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	pgoff_t index;
 
-	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
+	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), min_order);
 	if (!folio)
 		return -ENOMEM;
 
@@ -2475,6 +2481,7 @@ static int filemap_create_folio(struct file *file,
 	 * well to keep locking rules simple.
 	 */
 	filemap_invalidate_lock_shared(mapping);
+	index = (pos >> (PAGE_SHIFT + min_order)) << min_order;
 	error = filemap_add_folio(mapping, folio, index,
 			mapping_gfp_constraint(mapping, GFP_KERNEL));
 	if (error == -EEXIST)
@@ -2535,8 +2542,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 	if (!folio_batch_count(fbatch)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_folio(filp, mapping,
-				iocb->ki_pos >> PAGE_SHIFT, fbatch);
+		err = filemap_create_folio(filp, mapping, iocb->ki_pos, fbatch);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
 		return err;
@@ -3752,9 +3758,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 repeat:
 	folio = filemap_get_folio(mapping, index);
 	if (IS_ERR(folio)) {
-		folio = filemap_alloc_folio(gfp, 0);
+		folio = filemap_alloc_folio(gfp,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			return ERR_PTR(-ENOMEM);
+		index = mapping_align_index(mapping, index);
 		err = filemap_add_folio(mapping, folio, index, gfp);
 		if (unlikely(err)) {
 			folio_put(folio);
-- 
2.44.1


  parent reply	other threads:[~2024-06-25 11:44 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-25 11:44 [PATCH v8 00/10] enable bs > ps in XFS Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes Pankaj Raghav (Samsung)
2024-07-04 12:23   ` Ryan Roberts
2024-07-04 15:20     ` Matthew Wilcox
2024-07-04 15:52       ` Ryan Roberts
2024-07-04 21:28       ` Pankaj Raghav (Samsung)
2024-07-04 22:06       ` Dave Chinner
2024-07-04 23:56         ` Matthew Wilcox
2024-07-05  4:32           ` Dave Chinner
2024-07-05  9:03             ` Ryan Roberts
2024-07-05 12:45               ` Pankaj Raghav (Samsung)
2024-07-05 13:24             ` Pankaj Raghav (Samsung)
2024-07-05 13:31               ` Ryan Roberts
2024-07-05 14:14                 ` Pankaj Raghav (Samsung)
2024-07-08 23:01                 ` Dave Chinner
2024-07-09  8:11                   ` Ryan Roberts
2024-07-09 13:08                   ` Pankaj Raghav (Samsung)
2024-07-05 15:14             ` Matthew Wilcox
2024-07-04 21:34     ` Pankaj Raghav (Samsung)
2024-07-09 16:29   ` Pankaj Raghav (Samsung)
2024-07-09 16:38     ` Matthew Wilcox
2024-07-09 17:33       ` Pankaj Raghav (Samsung)
2024-07-09 16:50     ` Darrick J. Wong
2024-07-09 21:08       ` Pankaj Raghav (Samsung)
2024-07-09 21:59         ` Darrick J. Wong
2024-06-25 11:44 ` Pankaj Raghav (Samsung) [this message]
2024-06-25 15:52   ` [PATCH v8 02/10] filemap: allocate mapping_min_order folios in the page cache Matthew Wilcox
2024-06-25 18:06     ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 03/10] readahead: allocate folios with mapping_min_order in readahead Pankaj Raghav (Samsung)
2024-07-02 19:38   ` Darrick J. Wong
2024-07-03 14:10     ` Pankaj Raghav (Samsung)
2024-07-04 14:24   ` Ryan Roberts
2024-07-04 14:29     ` Matthew Wilcox
2024-06-25 11:44 ` [PATCH v8 04/10] mm: split a folio in minimum folio order chunks Pankaj Raghav (Samsung)
2024-06-25 14:45   ` Zi Yan
2024-06-25 17:20     ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 05/10] filemap: cap PTE range to be created to allowed zero fill in folio_map_range() Pankaj Raghav (Samsung)
2024-07-01 23:39   ` Darrick J. Wong
2024-06-25 11:44 ` [PATCH v8 06/10] iomap: fix iomap_dio_zero() for fs bs > system page size Pankaj Raghav (Samsung)
2024-07-01  2:37   ` Dave Chinner
2024-07-01 11:22     ` Pankaj Raghav (Samsung)
2024-07-01 23:40   ` Darrick J. Wong
2024-07-02  7:42   ` Christoph Hellwig
2024-07-02 10:15     ` Pankaj Raghav (Samsung)
2024-07-02 12:02       ` Christoph Hellwig
2024-07-02 14:01         ` Pankaj Raghav (Samsung)
2024-07-02 15:42           ` Christoph Hellwig
2024-07-02 16:13             ` Pankaj Raghav (Samsung)
2024-07-02 16:51               ` Matthew Wilcox
2024-07-02 17:10                 ` Pankaj Raghav (Samsung)
2024-07-03  5:16                   ` Christoph Hellwig
2024-07-02 16:50         ` Matthew Wilcox
2024-07-02 13:49       ` Luis Chamberlain
2024-06-25 11:44 ` [PATCH v8 07/10] xfs: use kvmalloc for xattr buffers Pankaj Raghav (Samsung)
2024-06-25 18:07   ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 08/10] xfs: expose block size in stat Pankaj Raghav (Samsung)
2024-07-01  2:33   ` Dave Chinner
2024-06-25 11:44 ` [PATCH v8 09/10] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Pankaj Raghav (Samsung)
2024-07-01  2:34   ` Dave Chinner
2024-06-25 11:44 ` [PATCH v8 10/10] xfs: enable block size larger than page size support Pankaj Raghav (Samsung)
2024-07-01  2:34   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240625114420.719014-3-kernel@pankajraghav.com \
    --to=kernel@pankajraghav.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=cl@os.amperecomputing.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=john.g.garry@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=p.raghav@samsung.com \
    --cc=willy@infradead.org \
    --cc=yang@os.amperecomputing.com \
    --cc=zi.yan@sent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).