linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/1] loop: sync filesystem cache before getting file size in get_size()
@ 2025-08-07 23:25 Rajeev Mishra
  2025-08-08  2:47 ` Yu Kuai
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Rajeev Mishra @ 2025-08-07 23:25 UTC
  To: axboe; +Cc: linux-block, linux-kernel, Rajeev Mishra

The get_size() function now uses vfs_getattr_nosec() with AT_STATX_SYNC_AS_STAT
to ensure filesystem cache is synchronized before retrieving file size. This
provides more accurate size information, especially when:

- The backing file size has been changed by another process
- The file is on a network filesystem (NFS, CIFS, etc.)
- The file is being modified concurrently
- The most accurate size is needed for loop device setup

The implementation gracefully falls back to i_size_read() if vfs_getattr_nosec()
fails, maintaining backward compatibility.

Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
---
 drivers/block/loop.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 1b6ee91f8eb9..15d5edbc69ce 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -137,12 +137,39 @@ static void loop_global_unlock(struct loop_device *lo, bool global)
 static int max_part;
 static int part_shift;
 
+/**
+ * get_size - calculate the effective size of a loop device
+ * @offset: offset into the backing file
+ * @sizelimit: user-specified size limit
+ * @file: the backing file
+ *
+ * Calculate the effective size of the loop device
+ *
+ * Returns: size in 512-byte sectors, or 0 if invalid
+ */
 static loff_t get_size(loff_t offset, loff_t sizelimit, struct file *file)
 {
+	struct kstat stat;
 	loff_t loopsize;
+	int ret;
+
+	/*
+	 * Get file attributes for validation. We use vfs_getattr_nosec() to
+	 * ensure we have up-to-date file size information.
+	 */
+	ret = vfs_getattr_nosec(&file->f_path, &stat, STATX_SIZE,
+				AT_STATX_SYNC_AS_STAT);
+	if (ret) {
+		/*
+		 * If we can't get attributes, fall back to i_size_read()
+		 * which should work for most cases.
+		 */
+		loopsize = i_size_read(file->f_mapping->host);
+	} else {
+		/* Use the size from getattr for consistency */
+		loopsize = stat.size;
+	}
 
-	/* Compute loopsize in bytes */
-	loopsize = i_size_read(file->f_mapping->host);
 	if (offset > 0)
 		loopsize -= offset;
 	/* offset is beyond i_size, weird but possible */
-- 
2.43.5
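
For readers following the arithmetic, here is a minimal userspace sketch of
the size calculation that get_size() performs once it has the file size. The
hunk above shows only the offset subtraction; the clamp to zero, the sizelimit
cap and the shift by 9 are assumptions taken from the kernel-doc line ("size
in 512-byte sectors, or 0 if invalid"). loop_sectors() is a hypothetical name,
not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the loop size arithmetic (illustrative only).
 * file_size, offset and sizelimit are in bytes; the result is in
 * 512-byte sectors.
 */
static int64_t loop_sectors(int64_t file_size, int64_t offset,
			    int64_t sizelimit)
{
	int64_t loopsize = file_size;

	assert(file_size >= 0);

	if (offset > 0)
		loopsize -= offset;
	/* offset is beyond the file size: report an empty device */
	if (loopsize < 0)
		return 0;
	/* assumed: a positive sizelimit caps the usable size */
	if (sizelimit > 0 && sizelimit < loopsize)
		loopsize = sizelimit;
	/* convert bytes to 512-byte sectors, rounding down */
	return loopsize >> 9;
}
```

With this model, a stale (too small) file_size translates directly into a
loop device with fewer sectors than the backing file holds, which is the
symptom the patch is trying to avoid.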


* Re: [PATCH 1/1] loop: sync filesystem cache before getting file size in get_size()
@ 2025-08-08 18:59 Cloud User
  0 siblings, 0 replies; 7+ messages in thread
From: Cloud User @ 2025-08-08 18:59 UTC
  To: yukuai3, linux-kernel, linux-block, -c, yukuai1

Thanks, Kuai, for the quick review. I really appreciate it.
Please let me know if you have further questions or
if I missed anything.

My responses to your queries are inline below.
Rajeev
 
From: Yu Kuai <yukuai1@huaweicloud.com>
Date: Thursday, August 7, 2025 at 9:48 PM
To: Mishra, Rajeev <rajeevm@hpe.com>, axboe@kernel.dk <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org <linux-block@vger.kernel.org>, linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>, yukuai (C) <yukuai3@huawei.com>
Subject: Re: [PATCH 1/1] loop: sync filesystem cache before getting file size in get_size()

Hi,

On 2025/08/08 7:25, Rajeev Mishra wrote:
> The get_size() function now uses vfs_getattr_nosec() with AT_STATX_SYNC_AS_STAT

From a quick code review, I couldn't find how that flag ensures the
filesystem cache is synchronized. Can you explain in detail? Or do you
mean getattr for a filesystem like FUSE, to get the latest data from the server?

Response:

>> Thanks for the quick review. The AT_STATX_SYNC_AS_STAT flag tells
>> the VFS layer to synchronize cached metadata before returning file attributes.
>> This is particularly important for distributed/network filesystems where
>> the local cache may not reflect the current file size on the server.
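
For reference on the flag under discussion: statx(2) describes
AT_STATX_SYNC_AS_STAT as "do whatever stat(2) does", which is
filesystem-specific, and its value is 0, so it is also what a flags argument
of 0 selects. AT_STATX_FORCE_SYNC is the flag documented as forcing the
attributes to be synchronized with the server on a network filesystem. The
userspace sketch below only illustrates querying the size in either mode;
size_via_statx() is a hypothetical helper, and the in-kernel
vfs_getattr_nosec() path may behave differently depending on the filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>	/* AT_FDCWD, AT_STATX_* */
#include <stdint.h>
#include <sys/stat.h>	/* statx(), struct statx, STATX_SIZE */

/* AT_STATX_SYNC_AS_STAT is the default mode: passing 0 selects it. */
static_assert(AT_STATX_SYNC_AS_STAT == 0,
	      "AT_STATX_SYNC_AS_STAT is expected to be 0");

/*
 * Return the size of `path` in bytes as reported by statx(2), or -1 if
 * the call fails or the filesystem does not report a size.
 * `sync_flag` is one of AT_STATX_SYNC_AS_STAT, AT_STATX_FORCE_SYNC or
 * AT_STATX_DONT_SYNC.
 */
static int64_t size_via_statx(const char *path, int sync_flag)
{
	struct statx stx;

	if (statx(AT_FDCWD, path, sync_flag, STATX_SIZE, &stx) != 0)
		return -1;
	if (!(stx.stx_mask & STATX_SIZE))
		return -1;
	return (int64_t)stx.stx_size;
}
```

On a network filesystem, comparing size_via_statx(path, AT_STATX_SYNC_AS_STAT)
with size_via_statx(path, AT_STATX_FORCE_SYNC) is one way to check whether the
default mode is returning a cached size.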

---

> to ensure filesystem cache is synchronized before retrieving file size. This
> provides more accurate size information, especially when:
>
> - The backing file size has been changed by another process
> - The file is on a network filesystem (NFS, CIFS, etc.)
> - The file is being modified concurrently

I don't think this makes sense in the real world. If a file is already used
by a loop device, then the user should avoid modifying the file directly. For
a file on FUSE, I feel it's not good to use it as a loop device.

Response:

>> I encountered this issue specifically with the Lustre filesystem. During
>> testing I did the following:
>>
>> 1. A file was created on Lustre.
>> 2. dd was used to write data to the file.
>> 3. ls confirmed the size.
>> 4. A loop device was set up on the file immediately afterwards.
>> 5. A write was issued and found less space than expected.
>> 6. This happened because the loop device did not capture the file size correctly.
>>
>> I agree that network/distributed filesystems aren't ideal for loop devices,
>> but they are used in practice (container images on shared storage, diskless
>> systems, etc.). The fallback to i_size_read() ensures no performance penalty
>> for local filesystems while improving reliability for network filesystems.
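
The first three steps of the reproduction above (create, write, confirm the
size) can be approximated from userspace with the sketch below. It is not the
author's test: write_then_stat() is a hypothetical helper that writes a known
number of bytes and then reports what stat(2) sees. On a local filesystem the
two values are expected to match; the report above is that the loop driver
saw a smaller size on Lustre right after the write.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `nbytes` zero bytes to `path` (created or truncated), close the
 * file, then return the size reported by a fresh stat(2) on the path.
 * Returns -1 on any error.
 */
static int64_t write_then_stat(const char *path, size_t nbytes)
{
	char buf[4096];
	struct stat st;
	size_t left = nbytes;
	int fd;

	assert(path != NULL);
	memset(buf, 0, sizeof(buf));

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	while (left > 0) {
		size_t chunk = left < sizeof(buf) ? left : sizeof(buf);
		ssize_t n = write(fd, buf, chunk);

		/* treat short-circuit failures (including EINTR) as errors */
		if (n <= 0) {
			close(fd);
			return -1;
		}
		left -= (size_t)n;
	}

	if (close(fd) != 0)
		return -1;
	if (stat(path, &st) != 0)
		return -1;
	return (int64_t)st.st_size;
}
```

A caller would compare the return value with nbytes before attaching the file
to a loop device; a mismatch would point at the stale-size condition
described in the steps above.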

---

> - The most accurate size is needed for loop device setup
>
> The implementation gracefully falls back to i_size_read() if vfs_getattr_nosec()
> fails, maintaining backward compatibility.

Response:

>> Do you mean that using the sync flag has a backward compatibility issue, or
>> that using the function itself does?

---

> Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
> ---
>   drivers/block/loop.c | 31 +++++++++++++++++++++++++++++--
>   1 file changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 1b6ee91f8eb9..15d5edbc69ce 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -137,12 +137,39 @@ static void loop_global_unlock(struct loop_device *lo, bool global)
>   static int max_part;
>   static int part_shift;
>  
> +/**
> + * get_size - calculate the effective size of a loop device
> + * @offset: offset into the backing file
> + * @sizelimit: user-specified size limit
> + * @file: the backing file
> + *
> + * Calculate the effective size of the loop device
> + *
> + * Returns: size in 512-byte sectors, or 0 if invalid
> + */
>   static loff_t get_size(loff_t offset, loff_t sizelimit, struct file *file)
>   {
> +     struct kstat stat;
>        loff_t loopsize;
> +     int ret;
> +
> +     /*
> +      * Get file attributes for validation. We use vfs_getattr() to ensure
> +      * we have up-to-date file size information.
> +      */
> +     ret = vfs_getattr_nosec(&file->f_path, &stat, STATX_SIZE,
> +                             AT_STATX_SYNC_AS_STAT);
> +     if (ret) {
> +             /*
> +              * If we can't get attributes, fall back to i_size_read()
> +              * which should work for most cases.
> +              */
> +             loopsize = i_size_read(file->f_mapping->host);
> +     } else {
> +             /* Use the size from getattr for consistency */
> +             loopsize = stat.size;
> +     }

I'm OK with switching from i_size_read() to getattr; however, the commit
message is confusing to me :(


Response:

>> I will make the commit message clear and simple. I just wanted to check
>> whether "vfs_getattr_nosec(&file->f_path, &stat, STATX_SIZE, 0);" is acceptable.
>> If it is, I will replace i_size_read() with that call.
>> Please let me know if this would cause any backward compatibility issue.
>>
>> Thanks again for your help.

 


Thanks,
Kuai
>  
> -     /* Compute loopsize in bytes */
> -     loopsize = i_size_read(file->f_mapping->host);
>        if (offset > 0)
>                loopsize -= offset;
>        /* offset is beyond i_size, weird but possible */
>

* Re: [PATCH 1/1] loop: sync filesystem cache before getting file size in get_size()
@ 2025-08-08 19:03 Cloud User
  0 siblings, 0 replies; 7+ messages in thread
From: Cloud User @ 2025-08-08 19:03 UTC
  To: yukuai1; +Cc: yukuai3, linux-kernel, linux-block


end of thread, other threads:[~2025-08-15  3:26 UTC | newest]

Thread overview: 7+ messages
2025-08-07 23:25 [PATCH 1/1] loop: sync filesystem cache before getting file size in get_size() Rajeev Mishra
2025-08-08  2:47 ` Yu Kuai
     [not found]   ` <PH7PR84MB2079A6A4EFE799BCA7E5738CAA2FA@PH7PR84MB2079.NAMPRD84.PROD.OUTLOOK.COM>
2025-08-11  1:20     ` Yu Kuai
2025-08-10 15:15 ` Christoph Hellwig
2025-08-15  3:25 ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2025-08-08 18:59 Cloud User
2025-08-08 19:03 Cloud User
