* [PATCH v2 0/2] erofs: support large folios for fscache mode
From: Jingbo Xu @ 2022-11-28  2:50 UTC (permalink / raw)
To: xiang, chao, linux-erofs; +Cc: linux-kernel

v2:
- patch 2: keep the enabling for iomap and fscache mode in separate
  patches; don't enable the feature for the metadata routine for now
  (Gao Xiang)

v1: https://lore.kernel.org/all/20221126005756.7662-1-jefflexu@linux.alibaba.com/

Patch 1 is the main part of supporting large folios for fscache mode.
It relies on a pending patch [1] adding the .prepare_ondemand_read()
interface in Cachefiles.

Patch 2 just turns the switch on and enables the feature for fscache
mode. It relies on a previous patch [2] which enables this feature for
iomap mode.

[1] https://lore.kernel.org/all/20221124034212.81892-1-jefflexu@linux.alibaba.com/
[2] https://lore.kernel.org/all/20221110074023.8059-1-jefflexu@linux.alibaba.com/

Jingbo Xu (2):
  erofs: support large folios for fscache mode
  erofs: enable large folios for fscache mode

 fs/erofs/fscache.c | 116 +++++++++++++++++++--------------------------
 fs/erofs/inode.c   |   3 +-
 2 files changed, 49 insertions(+), 70 deletions(-)

--
2.19.1.6.gb485710b
* [PATCH v2 1/2] erofs: support large folios for fscache mode
From: Jingbo Xu @ 2022-11-28  2:50 UTC (permalink / raw)
To: xiang, chao, linux-erofs; +Cc: linux-kernel

When large folios are supported, one folio can be split into several
slices, each of which may be mapped to META/UNMAPPED/MAPPED, and the
folio can be unlocked as a whole only when all slices have completed.

Thus always allocate an erofs_fscache_request for each .read_folio() or
.readahead(). In this case, the request is marked as completed, and the
folio or folio range is unlocked, only when all slices of the folio or
folio range have completed.

Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/fscache.c | 116 +++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 68 deletions(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 3cfe1af7a46e..0643b205c7eb 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -167,32 +167,18 @@ static int erofs_fscache_meta_read_folio(struct file *data, struct folio *folio)
 	return ret;
 }
 
-/*
- * Read into page cache in the range described by (@pos, @len).
- *
- * On return, if the output @unlock is true, the caller is responsible for page
- * unlocking; otherwise the callee will take this responsibility through request
- * completion.
- *
- * The return value is the number of bytes successfully handled, or negative
- * error code on failure. The only exception is that, the length of the range
- * instead of the error code is returned on failure after request is allocated,
- * so that .readahead() could advance rac accordingly.
- */
-static int erofs_fscache_data_read(struct address_space *mapping,
-				   loff_t pos, size_t len, bool *unlock)
+static int erofs_fscache_data_read_slice(struct erofs_fscache_request *req)
 {
+	struct address_space *mapping = req->mapping;
 	struct inode *inode = mapping->host;
 	struct super_block *sb = inode->i_sb;
-	struct erofs_fscache_request *req;
+	loff_t pos = req->start + req->submitted;
 	struct erofs_map_blocks map;
 	struct erofs_map_dev mdev;
 	struct iov_iter iter;
 	size_t count;
 	int ret;
 
-	*unlock = true;
-
 	map.m_la = pos;
 	ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
 	if (ret)
@@ -201,36 +187,37 @@ static int erofs_fscache_data_read(struct address_space *mapping,
 	if (map.m_flags & EROFS_MAP_META) {
 		struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
 		erofs_blk_t blknr;
-		size_t offset, size;
+		size_t offset;
 		void *src;
 
 		/* For tail packing layout, the offset may be non-zero. */
 		offset = erofs_blkoff(map.m_pa);
 		blknr = erofs_blknr(map.m_pa);
-		size = map.m_llen;
+		count = map.m_llen;
 
 		src = erofs_read_metabuf(&buf, sb, blknr, EROFS_KMAP);
 		if (IS_ERR(src))
 			return PTR_ERR(src);
 
-		iov_iter_xarray(&iter, READ, &mapping->i_pages, pos, PAGE_SIZE);
-		if (copy_to_iter(src + offset, size, &iter) != size) {
+		iov_iter_xarray(&iter, READ, &mapping->i_pages, pos, count);
+		if (copy_to_iter(src + offset, count, &iter) != count) {
 			erofs_put_metabuf(&buf);
 			return -EFAULT;
 		}
-		iov_iter_zero(PAGE_SIZE - size, &iter);
 		erofs_put_metabuf(&buf);
-		return PAGE_SIZE;
+		req->submitted += count;
+		return 0;
 	}
 
+	count = req->len - req->submitted;
 	if (!(map.m_flags & EROFS_MAP_MAPPED)) {
-		count = len;
 		iov_iter_xarray(&iter, READ, &mapping->i_pages, pos, count);
 		iov_iter_zero(count, &iter);
-		return count;
+		req->submitted += count;
+		return 0;
 	}
 
-	count = min_t(size_t, map.m_llen - (pos - map.m_la), len);
+	count = min_t(size_t, map.m_llen - (pos - map.m_la), count);
 	DBG_BUGON(!count || count % PAGE_SIZE);
 
 	mdev = (struct erofs_map_dev) {
@@ -241,68 +228,61 @@ static int erofs_fscache_data_read(struct address_space *mapping,
 	if (ret)
 		return ret;
 
-	req = erofs_fscache_req_alloc(mapping, pos, count);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	*unlock = false;
-	ret = erofs_fscache_read_folios_async(mdev.m_fscache->cookie,
+	return erofs_fscache_read_folios_async(mdev.m_fscache->cookie,
 			req, mdev.m_pa + (pos - map.m_la), count);
-	if (ret)
-		req->error = ret;
+}
 
-	erofs_fscache_req_put(req);
-	return count;
+/*
+ * Read into page cache in the range described by (req->start, req->len).
+ */
+static int erofs_fscache_data_read(struct erofs_fscache_request *req)
+{
+	int ret;
+
+	do {
+		ret = erofs_fscache_data_read_slice(req);
+		if (ret)
+			req->error = ret;
+	} while (!ret && req->submitted < req->len);
+
+	return ret;
+}
 
 static int erofs_fscache_read_folio(struct file *file, struct folio *folio)
 {
-	bool unlock;
+	struct erofs_fscache_request *req;
 	int ret;
 
-	DBG_BUGON(folio_size(folio) != EROFS_BLKSIZ);
-
-	ret = erofs_fscache_data_read(folio_mapping(folio), folio_pos(folio),
-				      folio_size(folio), &unlock);
-	if (unlock) {
-		if (ret > 0)
-			folio_mark_uptodate(folio);
+	req = erofs_fscache_req_alloc(folio_mapping(folio),
+			folio_pos(folio), folio_size(folio));
+	if (IS_ERR(req)) {
 		folio_unlock(folio);
+		return PTR_ERR(req);
 	}
-	return ret < 0 ? ret : 0;
+
+	ret = erofs_fscache_data_read(req);
+	erofs_fscache_req_put(req);
+	return ret;
 }
 
 static void erofs_fscache_readahead(struct readahead_control *rac)
 {
-	struct folio *folio;
-	size_t len, done = 0;
-	loff_t start, pos;
-	bool unlock;
-	int ret, size;
+	struct erofs_fscache_request *req;
 
 	if (!readahead_count(rac))
 		return;
 
-	start = readahead_pos(rac);
-	len = readahead_length(rac);
+	req = erofs_fscache_req_alloc(rac->mapping,
+			readahead_pos(rac), readahead_length(rac));
+	if (IS_ERR(req))
+		return;
 
-	do {
-		pos = start + done;
-		ret = erofs_fscache_data_read(rac->mapping, pos,
-				len - done, &unlock);
-		if (ret <= 0)
-			return;
+	/* The request completion will drop refs on the folios. */
+	while (readahead_folio(rac))
+		;
 
-		size = ret;
-		while (size) {
-			folio = readahead_folio(rac);
-			size -= folio_size(folio);
-			if (unlock) {
-				folio_mark_uptodate(folio);
-				folio_unlock(folio);
-			}
-		}
-	} while ((done += ret) < len);
+	erofs_fscache_data_read(req);
+	erofs_fscache_req_put(req);
 }
 
 static const struct address_space_operations erofs_fscache_meta_aops = {
--
2.19.1.6.gb485710b
* Re: [Phishing Risk] [External] [PATCH v2 1/2] erofs: support large folios for fscache mode
From: Jia Zhu @ 2022-11-28 13:54 UTC (permalink / raw)
To: Jingbo Xu, xiang, chao, linux-erofs; +Cc: linux-kernel

On 2022/11/28 10:50, Jingbo Xu wrote:
> When large folios are supported, one folio can be split into several
> slices, each of which may be mapped to META/UNMAPPED/MAPPED, and the
> folio can be unlocked as a whole only when all slices have completed.
>
> Thus always allocate an erofs_fscache_request for each .read_folio() or
> .readahead(). In this case, the request is marked as completed, and the
> folio or folio range is unlocked, only when all slices of the folio or
> folio range have completed.
>
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>

Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>

Thanks.
* [PATCH v2 2/2] erofs: enable large folios for fscache mode
From: Jingbo Xu @ 2022-11-28  2:50 UTC (permalink / raw)
To: xiang, chao, linux-erofs; +Cc: linux-kernel

Enable large folios for fscache mode. Enable this feature for the
non-compressed format for now, until the compression part supports large
folios later.

One thing worth noting is that the feature is not enabled for the
metadata routine, since meta inodes don't need large folios for now,
nor do they support readahead yet.

Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/inode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index e457b8a59ee7..85932086d23f 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -295,8 +295,7 @@ static int erofs_fill_inode(struct inode *inode)
 		goto out_unlock;
 	}
 	inode->i_mapping->a_ops = &erofs_raw_access_aops;
-	if (!erofs_is_fscache_mode(inode->i_sb))
-		mapping_set_large_folios(inode->i_mapping);
+	mapping_set_large_folios(inode->i_mapping);
 #ifdef CONFIG_EROFS_FS_ONDEMAND
 	if (erofs_is_fscache_mode(inode->i_sb))
 		inode->i_mapping->a_ops = &erofs_fscache_access_aops;
--
2.19.1.6.gb485710b
* Re: [Phishing Risk] [External] [PATCH v2 2/2] erofs: enable large folios for fscache mode
From: Jia Zhu @ 2022-11-28 13:56 UTC (permalink / raw)
To: Jingbo Xu, xiang, chao, linux-erofs; +Cc: linux-kernel

On 2022/11/28 10:50, Jingbo Xu wrote:
> Enable large folios for fscache mode. Enable this feature for the
> non-compressed format for now, until the compression part supports
> large folios later.
>
> One thing worth noting is that the feature is not enabled for the
> metadata routine, since meta inodes don't need large folios for now,
> nor do they support readahead yet.
>
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>

Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>

Thanks.