Re: [PATCH 02/10] ext2: Convert ext2_check_page to ext2_check_folio

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Matthew Wilcox <willy@infradead.org>
To: Jan Kara <jack@suse.cz>
Cc: Jan Kara <jack@suse.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Fabio M . De Francesco" <fmdefrancesco@gmail.com>
Subject: Re: [PATCH 02/10] ext2: Convert ext2_check_page to ext2_check_folio
Date: Thu, 5 Oct 2023 19:27:15 +0100	[thread overview]
Message-ID: <ZR8AAy6scCAjXhiS@casper.infradead.org> (raw)
In-Reply-To: <20231003105224.3j47fjxpiudwvupn@quack3>

On Tue, Oct 03, 2023 at 12:52:24PM +0200, Jan Kara wrote:
> On Tue 03-10-23 12:40:17, Jan Kara wrote:
> > On Thu 21-09-23 21:07:39, Matthew Wilcox (Oracle) wrote:
> > > Support in this function for large folios is limited to supporting
> > > filesystems with block size > PAGE_SIZE.  This new functionality will only
> > > be supported on machines without HIGHMEM, so the problem of kmap_local
> > > only being able to map a single page in the folio can be ignored.
> > > We will not use large folios for ext2 directories on HIGHMEM machines.
> > 
> > OK, but can we perhaps enforce this with some checks & error messages
> > instead of a silent failure? Like:
> > 
> > #ifdef CONFIG_HIGHMEM
> > 	if (sb->s_blocksize > PAGE_SIZE)
> > 		bail with error
> > #endif
> > 
> > somewhere in ext2_fill_super()? Or maybe force allocation of lowmem pages
> > when blocksize > PAGE_SIZE?
> > 
> > > @@ -195,9 +195,9 @@ static void *ext2_get_page(struct inode *dir, unsigned long n,
> > >  
> > >  	if (IS_ERR(folio))
> > >  		return ERR_CAST(folio);
> > > -	page_addr = kmap_local_folio(folio, n & (folio_nr_pages(folio) - 1));
> > > +	page_addr = kmap_local_folio(folio, 0);
> 
> Oh, and I think this change breaks the code whenever we get back higher
> order folio because the page_addr we get back is not for the page index
> 'n'.

> A look like a fool replying to the same patch multiple times ;) but how is
> this supposed to work even without HIGHMEM? Suppose we have a page size 4k
> and block size 16k. Directory entries in a block can then straddle 4k
> boundary so if kmap_local() is mapping only a single page, then we are
> going to have hard time parsing this all?
> 
> Oh, I guess you are pointing to the fact kmap_local_folio() gives you back
> address usable for the whole folio access if !HIGHMEM. But then all the
> iterations (e.g. in ext2_readdir()) has to be folio-size based and not
> page-size based as they currently are? You didn't change these iterations
> in your patches which has confused me...

I'm going to reply to all three emails as one ;-)

The goal for this patchset is simply to do the folio conversion.
Earlier work was focused on "make sure everything behaves as it did
before", but now I've started to give some thought to "How will this
work" without actually doing the higher level work.

Today, ext2 does not permit large folios to be used for directory
inodes.  Whoever adds the call to mapping_set_large_folios() gets the
joy of finding all the places that contain assumptions and fixing them.
I don't particularly want to do that myself; I'm just trying to sort
out the underpinnings so that whoever does it has a somewhat sane
API to work against.

So I'm really just trying to do two things here:

 - Convert uses of struct page -> struct folio
 - Not leave any landmines for whoever comes after me

I had a quick look at converting ext2_readdir() to work for
large folios.  It's a bit intrusive, and I haven't done any
serious testing (let alone with the bs>PS patches).  But it could
look something like this:


diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index 6807df637112..5eeb57ce6a18 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -257,45 +257,45 @@ static inline void ext2_set_de_type(ext2_dirent *de, struct inode *inode)
 static int
 ext2_readdir(struct file *file, struct dir_context *ctx)
 {
-	loff_t pos = ctx->pos;
 	struct inode *inode = file_inode(file);
+	loff_t pos = ctx->pos;
+	loff_t isize = i_size_read(inode);
 	struct super_block *sb = inode->i_sb;
-	unsigned int offset = pos & ~PAGE_MASK;
-	unsigned long n = pos >> PAGE_SHIFT;
-	unsigned long npages = dir_pages(inode);
 	unsigned chunk_mask = ~(ext2_chunk_size(inode)-1);
 	bool need_revalidate = !inode_eq_iversion(inode, file->f_version);
 	bool has_filetype;
 
-	if (pos > inode->i_size - EXT2_DIR_REC_LEN(1))
+	if (pos > isize - EXT2_DIR_REC_LEN(1))
 		return 0;
 
 	has_filetype =
 		EXT2_HAS_INCOMPAT_FEATURE(sb, EXT2_FEATURE_INCOMPAT_FILETYPE);
 
-	for ( ; n < npages; n++, offset = 0) {
+	while (pos < isize) {
 		ext2_dirent *de;
 		struct folio *folio;
-		char *kaddr = ext2_get_folio(inode, n, 0, &folio);
+		char *kaddr = ext2_get_folio(inode, pos / PAGE_SIZE, 0, &folio);
+		size_t offset = offset_in_folio(folio, pos);
 		char *limit;
 
 		if (IS_ERR(kaddr)) {
 			ext2_error(sb, __func__,
 				   "bad page in #%lu",
 				   inode->i_ino);
-			ctx->pos += PAGE_SIZE - offset;
+			ctx->pos += folio_size(folio) - offset;
 			return PTR_ERR(kaddr);
 		}
 		if (unlikely(need_revalidate)) {
 			if (offset) {
 				offset = ext2_validate_entry(kaddr, offset, chunk_mask);
-				ctx->pos = (n<<PAGE_SHIFT) + offset;
+				ctx->pos = folio_pos(folio) + offset;
 			}
 			file->f_version = inode_query_iversion(inode);
 			need_revalidate = false;
 		}
 		de = (ext2_dirent *)(kaddr+offset);
-		limit = kaddr + ext2_last_byte(inode, n) - EXT2_DIR_REC_LEN(1);
+		limit = kaddr + ext2_last_byte(inode, folio->index) -
+				EXT2_DIR_REC_LEN(1);
 		for ( ;(char*)de <= limit; de = ext2_next_entry(de)) {
 			if (de->rec_len == 0) {
 				ext2_error(sb, __func__,
@@ -319,6 +319,7 @@ ext2_readdir(struct file *file, struct dir_context *ctx)
 			ctx->pos += ext2_rec_len_from_disk(de->rec_len);
 		}
 		folio_release_kmap(folio, kaddr);
+		pos = (loff_t)folio_next_index(folio) * PAGE_SIZE;
 	}
 	return 0;
 }

ext2_last_byte() is clearly now incorrect.  It should probably take
isize and folio as inputs and return either the last byte in the folio
or the last byte in the file.  But then we need to convert all the
callers, and I didn't want to do that for this demonstration patch.

next prev parent reply	other threads:[~2023-10-05 18:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-21 20:07 [PATCH 01/10] highmem: Add folio_release_kmap() Matthew Wilcox (Oracle)
2023-09-21 20:07 ` [PATCH 02/10] ext2: Convert ext2_check_page to ext2_check_folio Matthew Wilcox (Oracle)
2023-10-03 10:40   ` Jan Kara
2023-10-03 10:52     ` Jan Kara
2023-10-05 18:27       ` Matthew Wilcox [this message]
2023-10-03 11:02   ` Jan Kara
2023-09-21 20:07 ` [PATCH 03/10] ext2: Add ext2_get_folio() Matthew Wilcox (Oracle)
2023-09-21 20:07 ` [PATCH 04/10] ext2: Convert ext2_readdir to use a folio Matthew Wilcox (Oracle)
2023-09-21 20:07 ` [PATCH 05/10] ext2: Convert ext2_add_link() " Matthew Wilcox (Oracle)
2023-09-21 20:07 ` [PATCH 06/10] ext2: Convert ext2_empty_dir() " Matthew Wilcox (Oracle)
2023-09-21 20:07 ` [PATCH 07/10] ext2: Handle large block size directories in ext2_delete_entry() Matthew Wilcox (Oracle)
2023-10-25 18:17   ` Jan Kara
2023-09-21 20:07 ` [PATCH 08/10] ext2: Convert ext2_unlink() and ext2_rename() to use folios Matthew Wilcox (Oracle)
2023-10-25 18:20   ` Jan Kara
2023-09-21 20:07 ` [PATCH 09/10] ext2: Convert ext2_make_empty() to use a folio Matthew Wilcox (Oracle)
2023-10-03 10:41 ` [PATCH 01/10] highmem: Add folio_release_kmap() Jan Kara
2023-10-03 11:55   ` Matthew Wilcox

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:6807df63711 dfblob:5eeb57ce6a1 )
 OR (
bs:"Re: [PATCH 02/10] ext2: Convert ext2_check_page to ext2_check_folio" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZR8AAy6scCAjXhiS@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=fmdefrancesco@gmail.com \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).