* [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts
@ 2025-05-13 11:34 Sheng Yong
  2025-05-13 11:34 ` [PATCH 2/2] erofs: avoid using multiple devices with different type Sheng Yong
  2025-05-13 13:56 ` [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts Hongbo Li
  0 siblings, 2 replies; 8+ messages in thread
From: Sheng Yong @ 2025-05-13 11:34 UTC (permalink / raw)
  To: xiang, chao, zbestahu, jefflexu, lihongbo22, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong, Wang Shuai

From: Sheng Yong <shengyong1@xiaomi.com>

When attempting to use an archive file, such as an APEX on Android,
as a file-backed mount source, the mount fails because the EROFS image
within the archive file does not start at offset 0. As a result, a loop
device is still needed to attach the image file at the appropriate
offset first. Similarly, if an EROFS image within a block device
does not start at offset 0, it cannot be mounted directly either.

To address this issue, add a new mount option `fsoffset=x' that
accepts a start offset for both file-backed and bdev-based mounts.
The offset must be aligned to the block size. EROFS adds this offset
to the physical address before issuing read requests.

Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
---
 Documentation/filesystems/erofs.rst |  1 +
 fs/erofs/data.c                     |  8 ++++++--
 fs/erofs/fileio.c                   |  3 ++-
 fs/erofs/internal.h                 |  2 ++
 fs/erofs/super.c                    | 12 +++++++++++-
 fs/erofs/zdata.c                    |  3 ++-
 6 files changed, 24 insertions(+), 5 deletions(-)
---
v5: * fix fsoffset on multiple devices by adding off when creating the
      I/O request; erofs_map_device selects the target device, and only
      the primary device has an off
    * remove unnecessary checks of the fsoffset value
    * tried to combine off and dax_part_off, but that is not easy to
      do because dax_part_off is not needed when reading metadata

v4: * change mount option `offset=x' to `fsoffset=x'
    https://lore.kernel.org/linux-erofs/c5110e03-90ea-40be-b05f-bc920332a1e1@linux.alibaba.com

v3: * rename `offs' to `off'
    * parse offset using fsparam_u64 and validate it in fill_super
    * update bi_sector inline
    https://lore.kernel.org/linux-erofs/98585dd8-d0b6-4000-b46d-a08c64eae44d@linux.alibaba.com

v2: * add a new mount option `offset=X' for start offset, and offset
       should be aligned to PAGE_SIZE
    * add start offset for both file-backed and bdev-based mounts
    https://lore.kernel.org/linux-erofs/0725c2ec-528c-42a8-9557-4713e7e35153@linux.alibaba.com

v1: https://lore.kernel.org/all/20250324022849.2715578-1-shengyong1@xiaomi.com/
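
For illustration only (not part of the patch): a minimal userspace
sketch of the offset arithmetic described in the commit message. The
helper names fsoffset_is_aligned() and to_sector() are hypothetical;
the patch itself performs the alignment check with sbi->blkszbits in
erofs_fc_fill_super() and adds the offset inline when setting
bi_sector.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if off is a multiple of the filesystem
 * block size, where the block size is (1 << blkszbits) bytes.
 */
static bool fsoffset_is_aligned(uint64_t off, unsigned int blkszbits)
{
	return (off & ((UINT64_C(1) << blkszbits) - 1)) == 0;
}

/*
 * Hypothetical helper: 512-byte sector number for physical address pa
 * inside an image that starts off bytes into the backing file or bdev.
 */
static uint64_t to_sector(uint64_t off, uint64_t pa)
{
	return (off + pa) >> 9;
}
```

With 4 KiB blocks (blkszbits = 12), an offset of 4096 passes the check
while 4608 does not, and a read at pa = 8192 in an image placed at
offset 4096 starts at sector 24.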

diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
index c293f8e37468..0fa4c7826203 100644
--- a/Documentation/filesystems/erofs.rst
+++ b/Documentation/filesystems/erofs.rst
@@ -128,6 +128,7 @@ device=%s              Specify a path to an extra device to be used together.
 fsid=%s                Specify a filesystem image ID for Fscache back-end.
 domain_id=%s           Specify a domain ID in fscache mode so that different images
                        with the same blobs under a given domain ID can share storage.
+fsoffset=%s            Specify image offset for file-backed or bdev-based mounts.
 ===================    =========================================================
 
 Sysfs Entries
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 2409d2ab0c28..599a44d5d782 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -27,9 +27,12 @@ void erofs_put_metabuf(struct erofs_buf *buf)
 
 void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool need_kmap)
 {
-	pgoff_t index = offset >> PAGE_SHIFT;
+	pgoff_t index;
 	struct folio *folio = NULL;
 
+	offset += buf->off;
+	index = offset >> PAGE_SHIFT;
+
 	if (buf->page) {
 		folio = page_folio(buf->page);
 		if (folio_file_page(folio, index) != buf->page)
@@ -54,6 +57,7 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 
 	buf->file = NULL;
+	buf->off = sbi->dif0.off;
 	if (erofs_is_fileio_mode(sbi)) {
 		buf->file = sbi->dif0.file;	/* some fs like FUSE needs it */
 		buf->mapping = buf->file->f_mapping;
@@ -299,7 +303,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 		iomap->private = buf.base;
 	} else {
 		iomap->type = IOMAP_MAPPED;
-		iomap->addr = mdev.m_pa;
+		iomap->addr = mdev.m_dif->off + mdev.m_pa;
 		if (flags & IOMAP_DAX)
 			iomap->addr += mdev.m_dif->dax_part_off;
 	}
diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index 60c7cc4c105c..a2c7001ff789 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -147,7 +147,8 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
 				if (err)
 					break;
 				io->rq = erofs_fileio_rq_alloc(&io->dev);
-				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
+				io->rq->bio.bi_iter.bi_sector =
+					(io->dev.m_dif->off + io->dev.m_pa) >> 9;
 				attached = 0;
 			}
 			if (!bio_add_folio(&io->rq->bio, folio, len, cur))
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4ac188d5d894..10656bd986bd 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -43,6 +43,7 @@ struct erofs_device_info {
 	char *path;
 	struct erofs_fscache *fscache;
 	struct file *file;
+	loff_t off;
 	struct dax_device *dax_dev;
 	u64 dax_part_off;
 
@@ -199,6 +200,7 @@ enum {
 struct erofs_buf {
 	struct address_space *mapping;
 	struct file *file;
+	loff_t off;
 	struct page *page;
 	void *base;
 };
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index da6ee7c39290..512877d7d855 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -356,7 +356,7 @@ static void erofs_default_options(struct erofs_sb_info *sbi)
 
 enum {
 	Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,
-	Opt_device, Opt_fsid, Opt_domain_id, Opt_directio,
+	Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_fsoffset,
 };
 
 static const struct constant_table erofs_param_cache_strategy[] = {
@@ -383,6 +383,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = {
 	fsparam_string("fsid",		Opt_fsid),
 	fsparam_string("domain_id",	Opt_domain_id),
 	fsparam_flag_no("directio",	Opt_directio),
+	fsparam_u64("fsoffset",		Opt_fsoffset),
 	{}
 };
 
@@ -506,6 +507,9 @@ static int erofs_fc_parse_param(struct fs_context *fc,
 		errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
 #endif
 		break;
+	case Opt_fsoffset:
+		sbi->dif0.off = result.uint_64;
+		break;
 	}
 	return 0;
 }
@@ -599,6 +603,10 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 				&sbi->dif0.dax_part_off, NULL, NULL);
 	}
 
+	if (sbi->dif0.off & ((1 << sbi->blkszbits) - 1))
+		return invalfc(fc, "fsoffset %lld not aligned to block size",
+			       sbi->dif0.off);
+
 	err = erofs_read_superblock(sb);
 	if (err)
 		return err;
@@ -947,6 +955,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
 	if (sbi->domain_id)
 		seq_printf(seq, ",domain_id=%s", sbi->domain_id);
 #endif
+	if (sbi->dif0.off)
+		seq_printf(seq, ",fsoffset=%lld", sbi->dif0.off);
 	return 0;
 }
 
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index b8e6b76c23d5..4f910d7ffb2f 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1707,7 +1707,8 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
 					bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
 							REQ_OP_READ, GFP_NOIO);
 				bio->bi_end_io = z_erofs_endio;
-				bio->bi_iter.bi_sector = cur >> 9;
+				bio->bi_iter.bi_sector =
+						(mdev.m_dif->off + cur) >> 9;
 				bio->bi_private = q[JQ_SUBMIT];
 				if (readahead)
 					bio->bi_opf |= REQ_RAHEAD;
-- 
2.43.0




* [PATCH 2/2] erofs: avoid using multiple devices with different type
  2025-05-13 11:34 [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts Sheng Yong
@ 2025-05-13 11:34 ` Sheng Yong
  2025-05-13 15:10   ` Hongbo Li
  2025-05-14 11:51   ` Gao Xiang
  2025-05-13 13:56 ` [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts Hongbo Li
  1 sibling, 2 replies; 8+ messages in thread
From: Sheng Yong @ 2025-05-13 11:34 UTC (permalink / raw)
  To: xiang, chao, zbestahu, jefflexu, lihongbo22, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong

From: Sheng Yong <shengyong1@xiaomi.com>

For multiple devices, the primary and extra devices should all be of
the same type. `erofs_init_device` already guarantees that if the
primary is a file-backed device, extra devices are also regular
files.

However, if the primary is a block device while the extra device
is a file-backed device, `erofs_init_device` will get -ENOTBLK,
which is not treated as an error in `erofs_fc_get_tree`, and that
leads to a UAF:

  erofs_fc_get_tree
    get_tree_bdev_flags(erofs_fc_fill_super)
      erofs_read_superblock
        erofs_init_device  // sbi->dif0 is not inited yet,
                           // return -ENOTBLK
      deactivate_locked_super
        free(sbi)
    if (err is -ENOTBLK)
      sbi->dif0.file = filp_open()  // sbi UAF

So if -ENOTBLK is hit in `erofs_init_device`, it means the
primary device must be a block device, and the extra device
is not a block device. The error can be converted to -EINVAL.

Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
---
 fs/erofs/super.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
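
For illustration only (not part of the patch): a minimal userspace
sketch of the error conversion described above. The helper name
extra_dev_open_error() is hypothetical; the patch does the equivalent
on the ERR_PTR value inside erofs_init_device().

```c
#include <errno.h>

/*
 * Hypothetical helper: map the error from opening an extra device to
 * the value propagated to the caller. -ENOTBLK is rewritten because
 * the caller interprets it as "the primary device is not a block
 * device" and retries as a file-backed mount.
 */
static int extra_dev_open_error(int err)
{
	if (err == -ENOTBLK)
		return -EINVAL;
	return err;
}
```

All other error codes, and success, pass through unchanged.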

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 512877d7d855..16b5b1f66584 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -165,8 +165,11 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 				filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
 				bdev_file_open_by_path(dif->path,
 						BLK_OPEN_READ, sb->s_type, NULL);
-		if (IS_ERR(file))
+		if (IS_ERR(file)) {
+			if (PTR_ERR(file) == -ENOTBLK)
+				file = ERR_PTR(-EINVAL);
 			return PTR_ERR(file);
+		}
 
 		if (!erofs_is_fileio_mode(sbi)) {
 			dif->dax_dev = fs_dax_get_by_bdev(file_bdev(file),
-- 
2.43.0




* Re: [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts
  2025-05-13 11:34 [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts Sheng Yong
  2025-05-13 11:34 ` [PATCH 2/2] erofs: avoid using multiple devices with different type Sheng Yong
@ 2025-05-13 13:56 ` Hongbo Li
  2025-05-13 13:59   ` Hongbo Li
  1 sibling, 1 reply; 8+ messages in thread
From: Hongbo Li @ 2025-05-13 13:56 UTC (permalink / raw)
  To: Sheng Yong, xiang, chao, zbestahu, jefflexu, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong, Wang Shuai



On 2025/5/13 19:34, Sheng Yong wrote:
> From: Sheng Yong <shengyong1@xiaomi.com>
> 
> When attempting to use an archive file, such as an APEX on Android,
> as a file-backed mount source, the mount fails because the EROFS image
> within the archive file does not start at offset 0. As a result, a loop
> device is still needed to attach the image file at the appropriate
> offset first. Similarly, if an EROFS image within a block device
> does not start at offset 0, it cannot be mounted directly either.
> 
> To address this issue, add a new mount option `fsoffset=x' that
> accepts a start offset for both file-backed and bdev-based mounts.
> The offset must be aligned to the block size. EROFS adds this offset
> to the physical address before issuing read requests.
> 
> Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
> Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
> ---
>   Documentation/filesystems/erofs.rst |  1 +
>   fs/erofs/data.c                     |  8 ++++++--
>   fs/erofs/fileio.c                   |  3 ++-
>   fs/erofs/internal.h                 |  2 ++
>   fs/erofs/super.c                    | 12 +++++++++++-
>   fs/erofs/zdata.c                    |  3 ++-
>   6 files changed, 24 insertions(+), 5 deletions(-)
> ---
> v5: * fix fsoffset on multiple devices by adding off when creating the
>        I/O request; erofs_map_device selects the target device, and only
>        the primary device has an off
>      * remove unnecessary checks of the fsoffset value
>      * tried to combine off and dax_part_off, but that is not easy to
>        do because dax_part_off is not needed when reading metadata
> 
> v4: * change mount option `offset=x' to `fsoffset=x'
> https://lore.kernel.org/linux-erofs/c5110e03-90ea-40be-b05f-bc920332a1e1@linux.alibaba.com
> 
> v3: * rename `offs' to `off'
>      * parse offset using fsparam_u64 and validate it in fill_super
>      * update bi_sector inline
>      https://lore.kernel.org/linux-erofs/98585dd8-d0b6-4000-b46d-a08c64eae44d@linux.alibaba.com
> 
> v2: * add a new mount option `offset=X' for start offset, and offset
>         should be aligned to PAGE_SIZE
>      * add start offset for both file-backed and bdev-based mounts
>      https://lore.kernel.org/linux-erofs/0725c2ec-528c-42a8-9557-4713e7e35153@linux.alibaba.com
> 
> v1: https://lore.kernel.org/all/20250324022849.2715578-1-shengyong1@xiaomi.com/
> 
> diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
> index c293f8e37468..0fa4c7826203 100644
> --- a/Documentation/filesystems/erofs.rst
> +++ b/Documentation/filesystems/erofs.rst
> @@ -128,6 +128,7 @@ device=%s              Specify a path to an extra device to be used together.
>   fsid=%s                Specify a filesystem image ID for Fscache back-end.
>   domain_id=%s           Specify a domain ID in fscache mode so that different images
>                          with the same blobs under a given domain ID can share storage.
> +fsoffset=%s            Specify image offset for file-backed or bdev-based mounts.
>   ===================    =========================================================
>   
>   Sysfs Entries
> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> index 2409d2ab0c28..599a44d5d782 100644
> --- a/fs/erofs/data.c
> +++ b/fs/erofs/data.c
> @@ -27,9 +27,12 @@ void erofs_put_metabuf(struct erofs_buf *buf)
>   
>   void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool need_kmap)
>   {
> -	pgoff_t index = offset >> PAGE_SHIFT;
> +	pgoff_t index;
>   	struct folio *folio = NULL;
>   
> +	offset += buf->off;
> +	index = offset >> PAGE_SHIFT;
> +
>   	if (buf->page) {
>   		folio = page_folio(buf->page);
>   		if (folio_file_page(folio, index) != buf->page)
> @@ -54,6 +57,7 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
>   	struct erofs_sb_info *sbi = EROFS_SB(sb);
>   
>   	buf->file = NULL;
> +	buf->off = sbi->dif0.off;
>   	if (erofs_is_fileio_mode(sbi)) {
>   		buf->file = sbi->dif0.file;	/* some fs like FUSE needs it */
>   		buf->mapping = buf->file->f_mapping;
> @@ -299,7 +303,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>   		iomap->private = buf.base;
>   	} else {
>   		iomap->type = IOMAP_MAPPED;
> -		iomap->addr = mdev.m_pa;
> +		iomap->addr = mdev.m_dif->off + mdev.m_pa;
>   		if (flags & IOMAP_DAX)
>   			iomap->addr += mdev.m_dif->dax_part_off;
>   	}
> diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
> index 60c7cc4c105c..a2c7001ff789 100644
> --- a/fs/erofs/fileio.c
> +++ b/fs/erofs/fileio.c
> @@ -147,7 +147,8 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
>   				if (err)
>   					break;
>   				io->rq = erofs_fileio_rq_alloc(&io->dev);
> -				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
> +				io->rq->bio.bi_iter.bi_sector =
> +					(io->dev.m_dif->off + io->dev.m_pa) >> 9;
>   				attached = 0;
>   			}
>   			if (!bio_add_folio(&io->rq->bio, folio, len, cur))
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index 4ac188d5d894..10656bd986bd 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -43,6 +43,7 @@ struct erofs_device_info {
>   	char *path;
>   	struct erofs_fscache *fscache;
>   	struct file *file;
> +	loff_t off;

Would u64 be better here?

>   	struct dax_device *dax_dev;
>   	u64 dax_part_off;
>   
> @@ -199,6 +200,7 @@ enum {
>   struct erofs_buf {
>   	struct address_space *mapping;
>   	struct file *file;
> +	loff_t off;

Same here.

>   	struct page *page;
>   	void *base;
>   };
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index da6ee7c39290..512877d7d855 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -356,7 +356,7 @@ static void erofs_default_options(struct erofs_sb_info *sbi)
>   
>   enum {
>   	Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,
> -	Opt_device, Opt_fsid, Opt_domain_id, Opt_directio,
> +	Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_fsoffset,
>   };
>   
>   static const struct constant_table erofs_param_cache_strategy[] = {
> @@ -383,6 +383,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = {
>   	fsparam_string("fsid",		Opt_fsid),
>   	fsparam_string("domain_id",	Opt_domain_id),
>   	fsparam_flag_no("directio",	Opt_directio),
> +	fsparam_u64("fsoffset",		Opt_fsoffset),
>   	{}
>   };
>   
> @@ -506,6 +507,9 @@ static int erofs_fc_parse_param(struct fs_context *fc,
>   		errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
>   #endif
>   		break;
> +	case Opt_fsoffset:
> +		sbi->dif0.off = result.uint_64;
> +		break;
>   	}
>   	return 0;
>   }
> @@ -599,6 +603,10 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
>   				&sbi->dif0.dax_part_off, NULL, NULL);
>   	}
>   
> +	if (sbi->dif0.off & ((1 << sbi->blkszbits) - 1))
> +		return invalfc(fc, "fsoffset %lld not aligned to block size",
> +			       sbi->dif0.off);
> +
>   	err = erofs_read_superblock(sb);
>   	if (err)
>   		return err;
> @@ -947,6 +955,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
>   	if (sbi->domain_id)
>   		seq_printf(seq, ",domain_id=%s", sbi->domain_id);
>   #endif
> +	if (sbi->dif0.off)
> +		seq_printf(seq, ",fsoffset=%lld", sbi->dif0.off);
>   	return 0;
>   }
>   
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index b8e6b76c23d5..4f910d7ffb2f 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -1707,7 +1707,8 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
>   					bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
>   							REQ_OP_READ, GFP_NOIO);
>   				bio->bi_end_io = z_erofs_endio;
> -				bio->bi_iter.bi_sector = cur >> 9;
> +				bio->bi_iter.bi_sector =
> +						(mdev.m_dif->off + cur) >> 9;
>   				bio->bi_private = q[JQ_SUBMIT];
>   				if (readahead)
>   					bio->bi_opf |= REQ_RAHEAD;



* Re: [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts
  2025-05-13 13:56 ` [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts Hongbo Li
@ 2025-05-13 13:59   ` Hongbo Li
  2025-05-14  1:41     ` Sheng Yong
  0 siblings, 1 reply; 8+ messages in thread
From: Hongbo Li @ 2025-05-13 13:59 UTC (permalink / raw)
  To: Sheng Yong, xiang, chao, zbestahu, jefflexu, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong, Wang Shuai



On 2025/5/13 21:56, Hongbo Li wrote:
> 
> 
> On 2025/5/13 19:34, Sheng Yong wrote:
>> From: Sheng Yong <shengyong1@xiaomi.com>
>>
>> When attempting to use an archive file, such as an APEX on Android,
>> as a file-backed mount source, the mount fails because the EROFS image
>> within the archive file does not start at offset 0. As a result, a loop
>> device is still needed to attach the image file at the appropriate
>> offset first. Similarly, if an EROFS image within a block device
>> does not start at offset 0, it cannot be mounted directly either.
>>
>> To address this issue, add a new mount option `fsoffset=x' that
>> accepts a start offset for both file-backed and bdev-based mounts.
>> The offset must be aligned to the block size. EROFS adds this offset
>> to the physical address before issuing read requests.
>>
>> Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
>> Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
>> ---
>>   Documentation/filesystems/erofs.rst |  1 +
>>   fs/erofs/data.c                     |  8 ++++++--
>>   fs/erofs/fileio.c                   |  3 ++-
>>   fs/erofs/internal.h                 |  2 ++
>>   fs/erofs/super.c                    | 12 +++++++++++-
>>   fs/erofs/zdata.c                    |  3 ++-
>>   6 files changed, 24 insertions(+), 5 deletions(-)
>> ---
>> v5: * fix fsoffset on multiple devices by adding off when creating the
>>        I/O request; erofs_map_device selects the target device, and only
>>        the primary device has an off
>>      * remove unnecessary checks of the fsoffset value
>>      * tried to combine off and dax_part_off, but that is not easy to
>>        do because dax_part_off is not needed when reading metadata
>>
>> v4: * change mount option `offset=x' to `fsoffset=x'
>> https://lore.kernel.org/linux-erofs/c5110e03-90ea-40be-b05f-bc920332a1e1@linux.alibaba.com
>>
>> v3: * rename `offs' to `off'
>>      * parse offset using fsparam_u64 and validate it in fill_super
>>      * update bi_sector inline
>>      
>> https://lore.kernel.org/linux-erofs/98585dd8-d0b6-4000-b46d-a08c64eae44d@linux.alibaba.com
>>
>> v2: * add a new mount option `offset=X' for start offset, and offset
>>         should be aligned to PAGE_SIZE
>>      * add start offset for both file-backed and bdev-based mounts
>>      
>> https://lore.kernel.org/linux-erofs/0725c2ec-528c-42a8-9557-4713e7e35153@linux.alibaba.com
>>
>> v1: 
>> https://lore.kernel.org/all/20250324022849.2715578-1-shengyong1@xiaomi.com/
>>
>> diff --git a/Documentation/filesystems/erofs.rst 
>> b/Documentation/filesystems/erofs.rst
>> index c293f8e37468..0fa4c7826203 100644
>> --- a/Documentation/filesystems/erofs.rst
>> +++ b/Documentation/filesystems/erofs.rst
>> @@ -128,6 +128,7 @@ device=%s              Specify a path to an extra 
>> device to be used together.
>>   fsid=%s                Specify a filesystem image ID for Fscache 
>> back-end.
>>   domain_id=%s           Specify a domain ID in fscache mode so that 
>> different images
>>                          with the same blobs under a given domain ID 
>> can share storage.
>> +fsoffset=%s            Specify image offset for file-backed or 
>> bdev-based mounts.
Hi, Yong

Should fsoffset be documented with %lu?

Thanks,
Hongbo

>>   ===================    
>> =========================================================
>>   Sysfs Entries
>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>> index 2409d2ab0c28..599a44d5d782 100644
>> --- a/fs/erofs/data.c
>> +++ b/fs/erofs/data.c
>> @@ -27,9 +27,12 @@ void erofs_put_metabuf(struct erofs_buf *buf)
>>   void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool 
>> need_kmap)
>>   {
>> -    pgoff_t index = offset >> PAGE_SHIFT;
>> +    pgoff_t index;
>>       struct folio *folio = NULL;
>> +    offset += buf->off;
>> +    index = offset >> PAGE_SHIFT;
>> +
>>       if (buf->page) {
>>           folio = page_folio(buf->page);
>>           if (folio_file_page(folio, index) != buf->page)
>> @@ -54,6 +57,7 @@ void erofs_init_metabuf(struct erofs_buf *buf, 
>> struct super_block *sb)
>>       struct erofs_sb_info *sbi = EROFS_SB(sb);
>>       buf->file = NULL;
>> +    buf->off = sbi->dif0.off;
>>       if (erofs_is_fileio_mode(sbi)) {
>>           buf->file = sbi->dif0.file;    /* some fs like FUSE needs it */
>>           buf->mapping = buf->file->f_mapping;
>> @@ -299,7 +303,7 @@ static int erofs_iomap_begin(struct inode *inode, 
>> loff_t offset, loff_t length,
>>           iomap->private = buf.base;
>>       } else {
>>           iomap->type = IOMAP_MAPPED;
>> -        iomap->addr = mdev.m_pa;
>> +        iomap->addr = mdev.m_dif->off + mdev.m_pa;
>>           if (flags & IOMAP_DAX)
>>               iomap->addr += mdev.m_dif->dax_part_off;
>>       }
>> diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
>> index 60c7cc4c105c..a2c7001ff789 100644
>> --- a/fs/erofs/fileio.c
>> +++ b/fs/erofs/fileio.c
>> @@ -147,7 +147,8 @@ static int erofs_fileio_scan_folio(struct 
>> erofs_fileio *io, struct folio *folio)
>>                   if (err)
>>                       break;
>>                   io->rq = erofs_fileio_rq_alloc(&io->dev);
>> -                io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
>> +                io->rq->bio.bi_iter.bi_sector =
>> +                    (io->dev.m_dif->off + io->dev.m_pa) >> 9;
>>                   attached = 0;
>>               }
>>               if (!bio_add_folio(&io->rq->bio, folio, len, cur))
>> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
>> index 4ac188d5d894..10656bd986bd 100644
>> --- a/fs/erofs/internal.h
>> +++ b/fs/erofs/internal.h
>> @@ -43,6 +43,7 @@ struct erofs_device_info {
>>       char *path;
>>       struct erofs_fscache *fscache;
>>       struct file *file;
>> +    loff_t off;
> 
> Would u64 be better here?
> 
>>       struct dax_device *dax_dev;
>>       u64 dax_part_off;
>> @@ -199,6 +200,7 @@ enum {
>>   struct erofs_buf {
>>       struct address_space *mapping;
>>       struct file *file;
>> +    loff_t off;
> 
> Same here.
> 
>>       struct page *page;
>>       void *base;
>>   };
>> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
>> index da6ee7c39290..512877d7d855 100644
>> --- a/fs/erofs/super.c
>> +++ b/fs/erofs/super.c
>> @@ -356,7 +356,7 @@ static void erofs_default_options(struct 
>> erofs_sb_info *sbi)
>>   enum {
>>       Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,
>> -    Opt_device, Opt_fsid, Opt_domain_id, Opt_directio,
>> +    Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_fsoffset,
>>   };
>>   static const struct constant_table erofs_param_cache_strategy[] = {
>> @@ -383,6 +383,7 @@ static const struct fs_parameter_spec 
>> erofs_fs_parameters[] = {
>>       fsparam_string("fsid",        Opt_fsid),
>>       fsparam_string("domain_id",    Opt_domain_id),
>>       fsparam_flag_no("directio",    Opt_directio),
>> +    fsparam_u64("fsoffset",        Opt_fsoffset),
>>       {}
>>   };
>> @@ -506,6 +507,9 @@ static int erofs_fc_parse_param(struct fs_context 
>> *fc,
>>           errorfc(fc, "%s option not supported", 
>> erofs_fs_parameters[opt].name);
>>   #endif
>>           break;
>> +    case Opt_fsoffset:
>> +        sbi->dif0.off = result.uint_64;
>> +        break;
>>       }
>>       return 0;
>>   }
>> @@ -599,6 +603,10 @@ static int erofs_fc_fill_super(struct super_block 
>> *sb, struct fs_context *fc)
>>                   &sbi->dif0.dax_part_off, NULL, NULL);
>>       }
>> +    if (sbi->dif0.off & ((1 << sbi->blkszbits) - 1))
>> +        return invalfc(fc, "fsoffset %lld not aligned to block size",
>> +                   sbi->dif0.off);
>> +
>>       err = erofs_read_superblock(sb);
>>       if (err)
>>           return err;
>> @@ -947,6 +955,8 @@ static int erofs_show_options(struct seq_file 
>> *seq, struct dentry *root)
>>       if (sbi->domain_id)
>>           seq_printf(seq, ",domain_id=%s", sbi->domain_id);
>>   #endif
>> +    if (sbi->dif0.off)
>> +        seq_printf(seq, ",fsoffset=%lld", sbi->dif0.off);
>>       return 0;
>>   }
>> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
>> index b8e6b76c23d5..4f910d7ffb2f 100644
>> --- a/fs/erofs/zdata.c
>> +++ b/fs/erofs/zdata.c
>> @@ -1707,7 +1707,8 @@ static void z_erofs_submit_queue(struct 
>> z_erofs_frontend *f,
>>                       bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
>>                               REQ_OP_READ, GFP_NOIO);
>>                   bio->bi_end_io = z_erofs_endio;
>> -                bio->bi_iter.bi_sector = cur >> 9;
>> +                bio->bi_iter.bi_sector =
>> +                        (mdev.m_dif->off + cur) >> 9;
>>                   bio->bi_private = q[JQ_SUBMIT];
>>                   if (readahead)
>>                       bio->bi_opf |= REQ_RAHEAD;



* Re: [PATCH 2/2] erofs: avoid using multiple devices with different type
  2025-05-13 11:34 ` [PATCH 2/2] erofs: avoid using multiple devices with different type Sheng Yong
@ 2025-05-13 15:10   ` Hongbo Li
  2025-05-14 10:07     ` Sheng Yong
  2025-05-14 11:51   ` Gao Xiang
  1 sibling, 1 reply; 8+ messages in thread
From: Hongbo Li @ 2025-05-13 15:10 UTC (permalink / raw)
  To: Sheng Yong, xiang, chao, zbestahu, jefflexu, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong



On 2025/5/13 19:34, Sheng Yong wrote:
> From: Sheng Yong <shengyong1@xiaomi.com>
> 
> For multiple devices, the primary and extra devices should all be of
> the same type. `erofs_init_device` already guarantees that if the
> primary is a file-backed device, extra devices are also regular
> files.
> 
> However, if the primary is a block device while the extra device
> is a file-backed device, `erofs_init_device` will get -ENOTBLK,
> which is not treated as an error in `erofs_fc_get_tree`, and that
> leads to a UAF:
> 
>    erofs_fc_get_tree
>      get_tree_bdev_flags(erofs_fc_fill_super)
>        erofs_read_superblock
>          erofs_init_device  // sbi->dif0 is not inited yet,
>                             // return -ENOTBLK
>        deactivate_locked_super
>          free(sbi)
>      if (err is -ENOTBLK)
>        sbi->dif0.file = filp_open()  // sbi UAF
> 
> So if -ENOTBLK is hit in `erofs_init_device`, it means the
> primary device must be a block device, and the extra device
> is not a block device. The error can be converted to -EINVAL.
> 
> Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
> ---
>   fs/erofs/super.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 512877d7d855..16b5b1f66584 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -165,8 +165,11 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>   				filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
>   				bdev_file_open_by_path(dif->path,
>   						BLK_OPEN_READ, sb->s_type, NULL);
> -		if (IS_ERR(file))
> +		if (IS_ERR(file)) {
> +			if (PTR_ERR(file) == -ENOTBLK)
> +				file = ERR_PTR(-EINVAL);
>   			return PTR_ERR(file);

Hi, Yong

Thank you, I think it is indeed a UAF problem. This fixes the problem
introduced by fb176750266a ("erofs: add file-backed mount support"). How
about adding a Fixes tag?

In addition, I wonder whether we could just check fc->s_fs_info (we can
set it to NULL in .kill_sb) in erofs_fc_get_tree rather than change the
error code. That way we can report the correct error back to the user.

Thanks,
Hongbo

> +		}
>   
>   		if (!erofs_is_fileio_mode(sbi)) {
>   			dif->dax_dev = fs_dax_get_by_bdev(file_bdev(file),



* Re: [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts
  2025-05-13 13:59   ` Hongbo Li
@ 2025-05-14  1:41     ` Sheng Yong
  0 siblings, 0 replies; 8+ messages in thread
From: Sheng Yong @ 2025-05-14  1:41 UTC (permalink / raw)
  To: Hongbo Li, xiang, chao, zbestahu, jefflexu, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong, Wang Shuai

On 5/13/25 21:59, Hongbo Li wrote:
> 
> 
> On 2025/5/13 21:56, Hongbo Li wrote:
>>
>>
>> On 2025/5/13 19:34, Sheng Yong wrote:
>>> From: Sheng Yong <shengyong1@xiaomi.com>
>>>
[...]
>>> can share storage.
>>> +fsoffset=%s            Specify image offset for file-backed or
>>> bdev-based mounts.
> Hi, Yong
> 
> Should fsoffset be documented with %lu?

Oops, yes, it should be %lu.

thanks
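
For illustration only: a minimal userspace sketch of building the
option string in the numeric form agreed on above. The helper name
format_fsoffset_opt() is hypothetical, and PRIu64 is used here only
because the sketch takes a uint64_t.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "fsoffset=<off>" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails.
 */
static int format_fsoffset_opt(char *buf, size_t len, uint64_t off)
{
	int n = snprintf(buf, len, "fsoffset=%" PRIu64, off);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

On a kernel carrying this patch, the resulting string could be passed
as the data argument of mount(2) together with the "erofs" filesystem
type.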
> 
> Thanks,
> Hongbo
> 
[...]




* Re: [PATCH 2/2] erofs: avoid using multiple devices with different type
  2025-05-13 15:10   ` Hongbo Li
@ 2025-05-14 10:07     ` Sheng Yong
  0 siblings, 0 replies; 8+ messages in thread
From: Sheng Yong @ 2025-05-14 10:07 UTC (permalink / raw)
  To: linux-erofs

On 5/13/25 23:10, Hongbo Li wrote:
> 
[...]
>>
>> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
>> index 512877d7d855..16b5b1f66584 100644
>> --- a/fs/erofs/super.c
>> +++ b/fs/erofs/super.c
>> @@ -165,8 +165,11 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>>                   filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
>>                   bdev_file_open_by_path(dif->path,
>>                           BLK_OPEN_READ, sb->s_type, NULL);
>> -        if (IS_ERR(file))
>> +        if (IS_ERR(file)) {
>> +            if (PTR_ERR(file) == -ENOTBLK)
>> +                file = ERR_PTR(-EINVAL);
>>               return PTR_ERR(file);
> 
> Hi, Yong
> 
> Thank you, I think it is indeed a UAF problem. This fixes the problem
> introduced by fb176750266a ("erofs: add file-backed mount support"). How
> about adding a Fixes: tag?

Hi, Hongbo,

Thanks for the comment. Will add a Fixes: tag.
> 
> In addition, I wonder whether we could just check fc->s_fs_info (we can
> set it to NULL in .kill_sb) in erofs_fc_get_tree rather than change the
> error code. That way we can report the correct error to the user.

fc is not available in .kill_sb. Also, in this scenario fc->s_fs_info has
already been set to NULL in get_tree_bdev_flags(primary bdev) =>
sget_dev() => sget_fc(), before fill_super runs. So we cannot use
fc->s_fs_info to indicate the error.
Since EROFS already handles the case of primary=regular & extra=bdev, I
think we could return the same errno (-EINVAL) here.

thanks,
shengyong
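
The control flow discussed above can be modelled in userspace. This is only a
toy sketch, not the kernel code: struct toy_mount and the toy_* functions are
hypothetical names, and a flag stands in for the real free of sbi. It assumes
the situation from the patch description (primary is a block device, extra
device is a regular file) and shows why returning -EINVAL instead of -ENOTBLK
keeps the caller off the retry path that would touch the freed sbi.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy state; none of these names exist in fs/erofs. */
struct toy_mount {
	bool primary_is_bdev;
	bool extra_is_bdev;
	bool sbi_freed;          /* teardown after a failed fill_super ran */
	bool touched_after_free; /* the retry path would have used sbi */
};

/* Models erofs_init_device() opening an extra device as a bdev. */
static int toy_init_extra(const struct toy_mount *m, bool convert)
{
	if (m->primary_is_bdev && !m->extra_is_bdev)
		return convert ? -EINVAL : -ENOTBLK;
	return 0;
}

/* Models erofs_fc_get_tree(): -ENOTBLK triggers the file-backed retry. */
static int toy_get_tree(struct toy_mount *m, bool convert)
{
	int err = toy_init_extra(m, convert);

	if (err)
		m->sbi_freed = true; /* deactivate_locked_super() frees sbi */
	if (err == -ENOTBLK && m->sbi_freed)
		m->touched_after_free = true; /* sbi->dif0.file = ... */
	return err;
}

/* Usage: compare the behaviour without and with the errno conversion. */
static void toy_demo(void)
{
	struct toy_mount before = { .primary_is_bdev = true };
	struct toy_mount after = { .primary_is_bdev = true };

	assert(toy_get_tree(&before, false) == -ENOTBLK);
	assert(before.touched_after_free);

	assert(toy_get_tree(&after, true) == -EINVAL);
	assert(!after.touched_after_free);
}
```

In this model the retry is keyed only on the error code, which is why
changing the code returned from the extra-device path is enough to avoid it.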

> 
> Thanks,
> Hongbo
> 
>> +        }
>>           if (!erofs_is_fileio_mode(sbi)) {
>>               dif->dax_dev = fs_dax_get_by_bdev(file_bdev(file),
> 




* Re: [PATCH 2/2] erofs: avoid using multiple devices with different type
  2025-05-13 11:34 ` [PATCH 2/2] erofs: avoid using multiple devices with different type Sheng Yong
  2025-05-13 15:10   ` Hongbo Li
@ 2025-05-14 11:51   ` Gao Xiang
  1 sibling, 0 replies; 8+ messages in thread
From: Gao Xiang @ 2025-05-14 11:51 UTC (permalink / raw)
  To: Sheng Yong, xiang, chao, zbestahu, jefflexu, lihongbo22, dhavale
  Cc: linux-erofs, linux-kernel, Sheng Yong

Hi Yong,

On 2025/5/13 19:34, Sheng Yong wrote:
> From: Sheng Yong <shengyong1@xiaomi.com>
> 
> For multiple devices, both primary and extra devices should be the
> same type. `erofs_init_device` has already guaranteed that if the
> primary is a file-backed device, extra devices should also be
> regular files.
> 
> However, if the primary is a block device while the extra device
> is a file-backed device, `erofs_init_device` will get an ENOTBLK,
> which is not treated as an error in `erofs_fc_get_tree`, and that
> leads to a UAF:
> 
>    erofs_fc_get_tree
>      get_tree_bdev_flags(erofs_fc_fill_super)
>        erofs_read_superblock
>          erofs_init_device  // sbi->dif0 is not inited yet,
>                             // return -ENOTBLK
>        deactivate_locked_super
>          free(sbi)
>      if (err is -ENOTBLK)
>        sbi->dif0.file = filp_open()  // sbi UAF
> 
> So if -ENOTBLK is hit in `erofs_init_device`, it means the
> primary device must be a block device, and the extra device
> is not a block device. The error can be converted to -EINVAL.

Yeah, nice catch.

As Hongbo said, it'd be better to add "Fixes:" tag
in the next version.

> 
> Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
> ---
>   fs/erofs/super.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 512877d7d855..16b5b1f66584 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -165,8 +165,11 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>   				filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
>   				bdev_file_open_by_path(dif->path,
>   						BLK_OPEN_READ, sb->s_type, NULL);
> -		if (IS_ERR(file))
> +		if (IS_ERR(file)) {
> +			if (PTR_ERR(file) == -ENOTBLK)

It's preferred to use:
			if (file == ERR_PTR(-ENOTBLK))
				return -EINVAL;

Otherwise it looks good to me.

Could you submit it as a separate patch so I
could apply it directly?

Thanks,
Gao Xiang
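
For readers outside the kernel tree, the two spellings of the check compared
above can be tried in userspace. This is a minimal sketch: ERR_PTR(),
PTR_ERR() and IS_ERR() below are simplified stand-ins written for this
example, modelled on the kernel's <linux/err.h> helpers, and
open_result_to_errno() is a hypothetical helper, not a function from
fs/erofs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical helper: errno to return for the result of an open call. */
static int open_result_to_errno(void *file)
{
	if (IS_ERR(file)) {
		/* Compare the pointer itself, as suggested in the review. */
		if (file == ERR_PTR(-ENOTBLK))
			return -EINVAL;
		return (int)PTR_ERR(file);
	}
	return 0;
}

/* Both spellings of the -ENOTBLK check give the same answer. */
static bool checks_agree(long err)
{
	void *file = ERR_PTR(err);

	return (PTR_ERR(file) == -ENOTBLK) == (file == ERR_PTR(-ENOTBLK));
}

/* Usage: -ENOTBLK is remapped, other errors pass through unchanged. */
static void err_ptr_demo(void)
{
	assert(open_result_to_errno(ERR_PTR(-ENOTBLK)) == -EINVAL);
	assert(open_result_to_errno(ERR_PTR(-ENOENT)) == -ENOENT);
	assert(checks_agree(-ENOTBLK) && checks_agree(-ENOENT));
}
```

Returning -EINVAL directly, rather than rewriting the file pointer and then
converting it back with PTR_ERR(), drops one assignment and keeps the
special case to a single statement.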


