linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ntfs: add code to make FIBMAP ioctl work
@ 2014-10-15 19:19 Sergei Antonov
  2014-10-15 23:12 ` Andreas Dilger
  2014-10-16 13:20 ` Anton Altaparmakov
  0 siblings, 2 replies; 10+ messages in thread
From: Sergei Antonov @ 2014-10-15 19:19 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Anton Altaparmakov, Sergei Antonov

Implement ntfs_bmap() function for the "bmap" address space operation.
Now user space programs can send FIBMAP ioctl to get files' placement on
NTFS the same way they can do it on a number of other linux filesystems.

The result is returned in FIGETBSZ units, which is equal to sector size
in ntfs driver. An error value zero is returned for resident, compressed,
encrypted files, and for holes in sparse files.

Tested on drives with different sector sizes and different cluster sizes.

Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Sergei Antonov <saproj@gmail.com>
---
 fs/ntfs/aops.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index d267ea6..3972b6b 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -1538,6 +1538,36 @@ err_out:
 
 #endif	/* NTFS_RW */
 
+static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
+{
+	struct inode *vi = mapping->host;
+	ntfs_inode *ni = NTFS_I(vi);
+	ntfs_volume *vol = ni->vol;
+	const unsigned clust2sect_bits = vol->cluster_size_bits -
+		vol->sector_size_bits;
+	const sector_t clust2sect_mask = (1 << clust2sect_bits) - 1;
+	LCN lcn;
+
+	BUG_ON(S_ISDIR(vi->i_mode));
+	ntfs_debug("Requested LCN for block %llu.", (unsigned long long)block);
+	if (!NInoNonResident(ni) || NInoCompressed(ni) || NInoEncrypted(ni))
+		return 0;
+
+	down_read(&ni->runlist.lock);
+	lcn = ntfs_attr_vcn_to_lcn_nolock(ni, block >> clust2sect_bits, false);
+	up_read(&ni->runlist.lock);
+
+	if (lcn < 0) {
+		if (lcn == LCN_HOLE)
+			ntfs_warning(vol->sb, "Cannot get LCN for a hole.");
+		else
+			ntfs_error(vol->sb, "Error getting LCN: %lld.", lcn);
+		return 0;
+	}
+
+	return (lcn << clust2sect_bits) + (block & clust2sect_mask);
+}
+
 /**
  * ntfs_aops - general address space operations for inodes and attributes
  */
@@ -1546,6 +1576,7 @@ const struct address_space_operations ntfs_aops = {
 #ifdef NTFS_RW
 	.writepage	= ntfs_writepage,	/* Write dirty page to disk. */
 #endif /* NTFS_RW */
+	.bmap		= ntfs_bmap,		/* Map file offset to volume. */
 	.migratepage	= buffer_migrate_page,	/* Move a page cache page from
 						   one physical page to an
 						   other. */
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-15 19:19 [PATCH] ntfs: add code to make FIBMAP ioctl work Sergei Antonov
@ 2014-10-15 23:12 ` Andreas Dilger
  2014-10-16 11:37   ` Sergei Antonov
  2014-10-16 13:23   ` Anton Altaparmakov
  2014-10-16 13:20 ` Anton Altaparmakov
  1 sibling, 2 replies; 10+ messages in thread
From: Andreas Dilger @ 2014-10-15 23:12 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: linux-fsdevel, Anton Altaparmakov

[-- Attachment #1: Type: text/plain, Size: 2789 bytes --]

On Oct 15, 2014, at 1:19 PM, Sergei Antonov <saproj@gmail.com> wrote:
> Implement ntfs_bmap() function for the "bmap" address space operation.
> Now user space programs can send FIBMAP ioctl to get files' placement on
> NTFS the same way they can do it on a number of other linux filesystems.
> 
> The result is returned in FIGETBSZ units, which is equal to sector size
> in ntfs driver. An error value zero is returned for resident, compressed,
> encrypted files, and for holes in sparse files.

Any thoughts about implementing FIEMAP for NTFS?  This allows byte offset
granularity, and reduces the number of system calls significantly.

Cheers, Andreas

> Tested on drives with different sector sizes and different cluster sizes.
> 
> Cc: Anton Altaparmakov <aia21@cantab.net>
> Signed-off-by: Sergei Antonov <saproj@gmail.com>
> ---
> fs/ntfs/aops.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
> 
> diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
> index d267ea6..3972b6b 100644
> --- a/fs/ntfs/aops.c
> +++ b/fs/ntfs/aops.c
> @@ -1538,6 +1538,36 @@ err_out:
> 
> #endif	/* NTFS_RW */
> 
> +static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
> +{
> +	struct inode *vi = mapping->host;
> +	ntfs_inode *ni = NTFS_I(vi);
> +	ntfs_volume *vol = ni->vol;
> +	const unsigned clust2sect_bits = vol->cluster_size_bits -
> +		vol->sector_size_bits;
> +	const sector_t clust2sect_mask = (1 << clust2sect_bits) - 1;
> +	LCN lcn;
> +
> +	BUG_ON(S_ISDIR(vi->i_mode));
> +	ntfs_debug("Requested LCN for block %llu.", (unsigned long long)block);
> +	if (!NInoNonResident(ni) || NInoCompressed(ni) || NInoEncrypted(ni))
> +		return 0;
> +
> +	down_read(&ni->runlist.lock);
> +	lcn = ntfs_attr_vcn_to_lcn_nolock(ni, block >> clust2sect_bits, false);
> +	up_read(&ni->runlist.lock);
> +
> +	if (lcn < 0) {
> +		if (lcn == LCN_HOLE)
> +			ntfs_warning(vol->sb, "Cannot get LCN for a hole.");
> +		else
> +			ntfs_error(vol->sb, "Error getting LCN: %lld.", lcn);
> +		return 0;
> +	}
> +
> +	return (lcn << clust2sect_bits) + (block & clust2sect_mask);
> +}
> +
> /**
>  * ntfs_aops - general address space operations for inodes and attributes
>  */
> @@ -1546,6 +1576,7 @@ const struct address_space_operations ntfs_aops = {
> #ifdef NTFS_RW
> 	.writepage	= ntfs_writepage,	/* Write dirty page to disk. */
> #endif /* NTFS_RW */
> +	.bmap		= ntfs_bmap,		/* Map file offset to volume. */
> 	.migratepage	= buffer_migrate_page,	/* Move a page cache page from
> 						   one physical page to an
> 						   other. */
> -- 
> 2.1.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-15 23:12 ` Andreas Dilger
@ 2014-10-16 11:37   ` Sergei Antonov
  2014-10-16 13:23   ` Anton Altaparmakov
  1 sibling, 0 replies; 10+ messages in thread
From: Sergei Antonov @ 2014-10-16 11:37 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov

On 16 October 2014 01:12, Andreas Dilger <adilger@dilger.ca> wrote:
> On Oct 15, 2014, at 1:19 PM, Sergei Antonov <saproj@gmail.com> wrote:
>> Implement ntfs_bmap() function for the "bmap" address space operation.
>> Now user space programs can send FIBMAP ioctl to get files' placement on
>> NTFS the same way they can do it on a number of other linux filesystems.
>>
>> The result is returned in FIGETBSZ units, which is equal to sector size
>> in ntfs driver. An error value zero is returned for resident, compressed,
>> encrypted files, and for holes in sparse files.
>
> Any thoughts about implementing FIEMAP for NTFS?  This allows byte offset
> granularity, and reduces the number of system calls significantly.

FIEMAP is good. But it is also less ubiquitous than FIBMAP (absent in
fat, hfsplus, ...).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-15 19:19 [PATCH] ntfs: add code to make FIBMAP ioctl work Sergei Antonov
  2014-10-15 23:12 ` Andreas Dilger
@ 2014-10-16 13:20 ` Anton Altaparmakov
  2014-10-16 15:07   ` Sergei Antonov
  1 sibling, 1 reply; 10+ messages in thread
From: Anton Altaparmakov @ 2014-10-16 13:20 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: linux-fsdevel, Anton Altaparmakov

Hi Sergei,

On 15 Oct 2014, at 20:19, Sergei Antonov <saproj@gmail.com> wrote:
> Implement ntfs_bmap() function for the "bmap" address space operation.
> Now user space programs can send FIBMAP ioctl to get files' placement on
> NTFS the same way they can do it on a number of other linux filesystems.
> 
> The result is returned in FIGETBSZ units, which is equal to sector size
> in ntfs driver. An error value zero is returned for resident, compressed,
> encrypted files, and for holes in sparse files.
> 
> Tested on drives with different sector sizes and different cluster sizes.

Thanks for your patch.  It skirts/looks over some limitations so instead of applying your patch I have open sourced the bmap implementation from Tuxera NTFS driver (with permission) and have pushed it to the open source NTFS tree at git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs.git.

I have sent a pull request to Linus so he will hopefully merge it into mainline.

You can compare it to yours but most important points are that you do not forbid non-data attributes and you completely ignore initialized size which leads to exposing stale on-disk data on read and to losing the written data on writes which is obviously bad.  You also don't verify that the value to be returned isn't too large to fit in the return value size so you can potentially return truncated values on 32-bit kernels with 32-bit sector_t on a large volume where the sector number would need 64-bits which would then lead to disk corruption as the user would read/write random sectors on disk instead of the correct ones.

Best regards,

	Anton

> Cc: Anton Altaparmakov <aia21@cantab.net>
> Signed-off-by: Sergei Antonov <saproj@gmail.com>
> ---
> fs/ntfs/aops.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
> 
> diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
> index d267ea6..3972b6b 100644
> --- a/fs/ntfs/aops.c
> +++ b/fs/ntfs/aops.c
> @@ -1538,6 +1538,36 @@ err_out:
> 
> #endif	/* NTFS_RW */
> 
> +static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
> +{
> +	struct inode *vi = mapping->host;
> +	ntfs_inode *ni = NTFS_I(vi);
> +	ntfs_volume *vol = ni->vol;
> +	const unsigned clust2sect_bits = vol->cluster_size_bits -
> +		vol->sector_size_bits;
> +	const sector_t clust2sect_mask = (1 << clust2sect_bits) - 1;
> +	LCN lcn;
> +
> +	BUG_ON(S_ISDIR(vi->i_mode));
> +	ntfs_debug("Requested LCN for block %llu.", (unsigned long long)block);
> +	if (!NInoNonResident(ni) || NInoCompressed(ni) || NInoEncrypted(ni))
> +		return 0;
> +
> +	down_read(&ni->runlist.lock);
> +	lcn = ntfs_attr_vcn_to_lcn_nolock(ni, block >> clust2sect_bits, false);
> +	up_read(&ni->runlist.lock);
> +
> +	if (lcn < 0) {
> +		if (lcn == LCN_HOLE)
> +			ntfs_warning(vol->sb, "Cannot get LCN for a hole.");
> +		else
> +			ntfs_error(vol->sb, "Error getting LCN: %lld.", lcn);
> +		return 0;
> +	}
> +
> +	return (lcn << clust2sect_bits) + (block & clust2sect_mask);
> +}
> +
> /**
>  * ntfs_aops - general address space operations for inodes and attributes
>  */
> @@ -1546,6 +1576,7 @@ const struct address_space_operations ntfs_aops = {
> #ifdef NTFS_RW
> 	.writepage	= ntfs_writepage,	/* Write dirty page to disk. */
> #endif /* NTFS_RW */
> +	.bmap		= ntfs_bmap,		/* Map file offset to volume. */
> 	.migratepage	= buffer_migrate_page,	/* Move a page cache page from
> 						   one physical page to an
> 						   other. */

-- 
Anton Altaparmakov <anton at tuxera.com> (replace at with @)
Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/
Linux NTFS maintainer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-15 23:12 ` Andreas Dilger
  2014-10-16 11:37   ` Sergei Antonov
@ 2014-10-16 13:23   ` Anton Altaparmakov
  1 sibling, 0 replies; 10+ messages in thread
From: Anton Altaparmakov @ 2014-10-16 13:23 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Sergei Antonov, linux-fsdevel

Hi,

On 16 Oct 2014, at 00:12, Andreas Dilger <adilger@dilger.ca> wrote:
> On Oct 15, 2014, at 1:19 PM, Sergei Antonov <saproj@gmail.com> wrote:
>> Implement ntfs_bmap() function for the "bmap" address space operation.
>> Now user space programs can send FIBMAP ioctl to get files' placement on
>> NTFS the same way they can do it on a number of other linux filesystems.
>> 
>> The result is returned in FIGETBSZ units, which is equal to sector size
>> in ntfs driver. An error value zero is returned for resident, compressed,
>> encrypted files, and for holes in sparse files.
> 
> Any thoughts about implementing FIEMAP for NTFS?  This allows byte offset
> granularity, and reduces the number of system calls significantly.

Sure, would be nice to have it.  And I will get round to it one day but no customer has ever requested it so at least I haven't implemented it yet...

Best regards,

	Anton

> Cheers, Andreas
> 
>> Tested on drives with different sector sizes and different cluster sizes.
>> 
>> Cc: Anton Altaparmakov <aia21@cantab.net>
>> Signed-off-by: Sergei Antonov <saproj@gmail.com>
>> ---
>> fs/ntfs/aops.c | 31 +++++++++++++++++++++++++++++++
>> 1 file changed, 31 insertions(+)
>> 
>> diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
>> index d267ea6..3972b6b 100644
>> --- a/fs/ntfs/aops.c
>> +++ b/fs/ntfs/aops.c
>> @@ -1538,6 +1538,36 @@ err_out:
>> 
>> #endif	/* NTFS_RW */
>> 
>> +static sector_t ntfs_bmap(struct address_space *mapping, sector_t block)
>> +{
>> +	struct inode *vi = mapping->host;
>> +	ntfs_inode *ni = NTFS_I(vi);
>> +	ntfs_volume *vol = ni->vol;
>> +	const unsigned clust2sect_bits = vol->cluster_size_bits -
>> +		vol->sector_size_bits;
>> +	const sector_t clust2sect_mask = (1 << clust2sect_bits) - 1;
>> +	LCN lcn;
>> +
>> +	BUG_ON(S_ISDIR(vi->i_mode));
>> +	ntfs_debug("Requested LCN for block %llu.", (unsigned long long)block);
>> +	if (!NInoNonResident(ni) || NInoCompressed(ni) || NInoEncrypted(ni))
>> +		return 0;
>> +
>> +	down_read(&ni->runlist.lock);
>> +	lcn = ntfs_attr_vcn_to_lcn_nolock(ni, block >> clust2sect_bits, false);
>> +	up_read(&ni->runlist.lock);
>> +
>> +	if (lcn < 0) {
>> +		if (lcn == LCN_HOLE)
>> +			ntfs_warning(vol->sb, "Cannot get LCN for a hole.");
>> +		else
>> +			ntfs_error(vol->sb, "Error getting LCN: %lld.", lcn);
>> +		return 0;
>> +	}
>> +
>> +	return (lcn << clust2sect_bits) + (block & clust2sect_mask);
>> +}
>> +
>> /**
>> * ntfs_aops - general address space operations for inodes and attributes
>> */
>> @@ -1546,6 +1576,7 @@ const struct address_space_operations ntfs_aops = {
>> #ifdef NTFS_RW
>> 	.writepage	= ntfs_writepage,	/* Write dirty page to disk. */
>> #endif /* NTFS_RW */
>> +	.bmap		= ntfs_bmap,		/* Map file offset to volume. */
>> 	.migratepage	= buffer_migrate_page,	/* Move a page cache page from
>> 						   one physical page to an
>> 						   other. */

-- 
Anton Altaparmakov <anton at tuxera.com> (replace at with @)
Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/
Linux NTFS maintainer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-16 13:20 ` Anton Altaparmakov
@ 2014-10-16 15:07   ` Sergei Antonov
  2014-10-16 15:36     ` Anton Altaparmakov
  0 siblings, 1 reply; 10+ messages in thread
From: Sergei Antonov @ 2014-10-16 15:07 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov

On 16 October 2014 15:20, Anton Altaparmakov <anton@tuxera.com> wrote:
> Hi Sergei,
>
> On 15 Oct 2014, at 20:19, Sergei Antonov <saproj@gmail.com> wrote:
>> Implement ntfs_bmap() function for the "bmap" address space operation.
>> Now user space programs can send FIBMAP ioctl to get files' placement on
>> NTFS the same way they can do it on a number of other linux filesystems.
>>
>> The result is returned in FIGETBSZ units, which is equal to sector size
>> in ntfs driver. An error value zero is returned for resident, compressed,
>> encrypted files, and for holes in sparse files.
>>
>> Tested on drives with different sector sizes and different cluster sizes.
>
> Thanks for your patch.  It skirts/looks over some limitations so instead of applying your patch I have open sourced the bmap implementation from Tuxera NTFS driver (with permission) and have pushed it to the open source NTFS tree at git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs.git.

Cool. Thanks.
It is good you liberate the code.

> I have sent a pull request to Linus so he will hopefully merge it into mainline.
>
> You can compare it to yours but most important points are that you do not forbid non-data attributes and you completely ignore initialized size which leads to exposing stale on-disk data on read and to losing the written data on writes which is obviously bad.  You also don't verify that the value to be returned isn't too large to fit in the return value size so you can potentially return truncated values on 32-bit kernels with 32-bit sector_t on a large volume where the sector number would need 64-bits which would then lead to disk corruption as the user would read/write random sectors on disk instead of the correct ones.

I looked at your code and do not quite understand the handling of
initialized size. You read it into a local copy while holding the
lock, but then after the lock is released it may change and, I'm
afraid, cause the same problems you mentioned.

What is an example of a platform with 32-bit sector_t? Interesting to know.

Is it really possible that ntfs_bmap() is called for non-data
attributes? In other words, what other attributes are accessible as
files? Just curious.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-16 15:07   ` Sergei Antonov
@ 2014-10-16 15:36     ` Anton Altaparmakov
  2014-10-16 20:03       ` Sergei Antonov
  0 siblings, 1 reply; 10+ messages in thread
From: Anton Altaparmakov @ 2014-10-16 15:36 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov

Hi Sergei,

On 16 Oct 2014, at 16:07, Sergei Antonov <saproj@gmail.com> wrote:
> I looked at your code and do not quite understand the handling of
> initialized size. You read it into a local copy while holding the
> lock, but then after the lock is released it may change and, I'm
> afraid, cause the same problems you mentioned.

Correct.  That is why FIBMAP is a stupid interface!  However initialized_size can only ever go up and not down so in the worst case you will get result zero, i.e. hole instead of the now real sector number you could access.

The only time it can shrink is when the file size shrinks and I am afraid if anyone is using FIBMAP against a file that can be truncated they will corrupt the file system!

> What is an example of a platform with 32-bit sector_t? Interesting to know.

Any 32-bit system that does not configure the kernel for 64-bit sector_t.  Sorry can't remember the .config option name at the moment.  Would need to look it up.

> Is it really possible that ntfs_bmap() is called for non-data
> attributes? In other words, what other attributes are accessible as
> files? Just curious.

I admit in current kernel driver it cannot.  It might in future - reparse points for example are very odd in many ways and though they are often indicated as directories rather than files they could not always be and what attributes a reparse point has is anyones guess - depends on the reparse point.

Then there are system files that do not have an unnamed $DATA attribute at all so in a way the check should probably be extended to ensure the attribute is $DATA _and_ unnamed but again anyone using FIBMAP on a system file is asking to cause problems...

Best regards,

	Anton
-- 
Anton Altaparmakov <anton at tuxera.com> (replace at with @)
Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/
Linux NTFS maintainer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-16 15:36     ` Anton Altaparmakov
@ 2014-10-16 20:03       ` Sergei Antonov
  2014-10-16 20:43         ` Anton Altaparmakov
  0 siblings, 1 reply; 10+ messages in thread
From: Sergei Antonov @ 2014-10-16 20:03 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov

On 16 October 2014 17:36, Anton Altaparmakov <anton@tuxera.com> wrote:
> Hi Sergei,
>
> On 16 Oct 2014, at 16:07, Sergei Antonov <saproj@gmail.com> wrote:
>> I looked at your code and do not quite understand the handling of
>> initialized size. You read it into a local copy while holding the
>> lock, but then after the lock is released it may change and, I'm
>> afraid, cause the same problems you mentioned.
>
> Correct.  That is why FIBMAP is a stupid interface!  However initialized_size can only ever go up and not down so in the worst case you will get result zero, i.e. hole instead of the now real sector number you could access.
>
> The only time it can shrink is when the file size shrinks and I am afraid if anyone is using FIBMAP against a file that can be truncated they will corrupt the file system!

I've tried to merge all your arguments in my head :) and concluded
that a practical importance of the check against initialized_size is
exaggerated :).

It is not a reliable protection (API safety issues) implemented in a
way that it may not always work (limited locking scope). Can be an
indicator of programmer's diligence, though - "you see, I care about
it!". Sorry for this ramble. Your patch will do the job.

>> What is an example of a platform with 32-bit sector_t? Interesting to know.
>
> Any 32-bit system that does not configure the kernel for 64-bit sector_t.  Sorry can't remember the .config option name at the moment.  Would need to look it up.

Thanks. Found it. It is CONFIG_LBDAF.

>> Is it really possible that ntfs_bmap() is called for non-data
>> attributes? In other words, what other attributes are accessible as
>> files? Just curious.
>
> I admit in current kernel driver it cannot.  It might in future - reparse points for example are very odd in many ways and though they are often indicated as directories rather than files they could not always be and what attributes a reparse point has is anyones guess - depends on the reparse point.
>
> Then there are system files that do not have an unnamed $DATA attribute at all so in a way the check should probably be extended to ensure the attribute is $DATA _and_ unnamed but again anyone using FIBMAP on a system file is asking to cause problems...

Aren't those files handled by 'ntfs_mst_aops' (the one without a bmap
function) rather than 'ntfs_normal_aops'?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-16 20:03       ` Sergei Antonov
@ 2014-10-16 20:43         ` Anton Altaparmakov
  2014-10-16 21:55           ` Anton Altaparmakov
  0 siblings, 1 reply; 10+ messages in thread
From: Anton Altaparmakov @ 2014-10-16 20:43 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: linux-fsdevel@vger.kernel.org

Hi Sergei,

On 16 Oct 2014, at 21:03, Sergei Antonov <saproj@gmail.com> wrote:
> On 16 October 2014 17:36, Anton Altaparmakov <anton@tuxera.com> wrote:
>> On 16 Oct 2014, at 16:07, Sergei Antonov <saproj@gmail.com> wrote:
>>> I looked at your code and do not quite understand the handling of
>>> initialized size. You read it into a local copy while holding the
>>> lock, but then after the lock is released it may change and, I'm
>>> afraid, cause the same problems you mentioned.
>> 
>> Correct.  That is why FIBMAP is a stupid interface!  However initialized_size can only ever go up and not down so in the worst case you will get result zero, i.e. hole instead of the now real sector number you could access.
>> 
>> The only time it can shrink is when the file size shrinks and I am afraid if anyone is using FIBMAP against a file that can be truncated they will corrupt the file system!
> 
> I've tried to merge all your arguments in my head :) and concluded
> that a practical importance of the check against initialized_size is
> exaggerated :).

I am afraid you are showing a serious lack of understanding of how NTFS works as a file system!  I completely disagree with your statement.  It is perfectly valid for an application to open() an UNCHANGING file and then run FIBMAP against it and then read from it directly (example applications: lilo and grub).  It is completely NOT ok for the FIBMAP to return uninitialized sectors and the application to then read random garbage from disk instead of the zeroes that are meant to be there because you did not pay attention to the fact that initialized_size is less than file size and thus failed to return 0 instead of the sector number!

It has nothing to do with the fact that it might change.  You must read zero beyond initialized_size.  The whole point of it is that the area above it has never been written thus must be considered to be zero on disk no matter what old disk contents there might be from when the sectors were in use for something else/by some other file or directory or other metadata or even a previous file system that might have been there before.

> It is not a reliable protection (API safety issues) implemented in a
> way that it may not always work (limited locking scope). Can be an
> indicator of programmer's diligence, though - "you see, I care about
> it!". Sorry for this ramble. Your patch will do the job.

The locking is there because initialized_size is 64-bit and this cannot be read on 32-bit without locking.  Surely you knew that already?  If not you need to do some more studying of 64-bit memory accesses on 32-bit systems...  (-;  E.g. have a look at i_size_read()/i_size_write() and inode_{get,add,sub,set}_bytes()...

And if you really don't understand it: a 32-bit CPU has to use two separate "read 32-bit memory location" instructions to read a 64-bit value so reading the lower 32-bits and higher 32-bits can happen a long time apart (if process is preempted between the two read instructions for example) and in the meantime a different process can write a new value in which case when the second read instruction finally executes it reads the high 32-bits from the new value and suddenly you have read random junk - the low 32-bits of old value and high 32-bits of new value and you think that is a valid 64-bit value which it is not.  Thus you must lock to prevent this from happening hence why all 64-bit accesses of shared data are locked in Linux kernel (at least on 32-bit CPUs).

>>> What is an example of a platform with 32-bit sector_t? Interesting to know.
>> 
>> Any 32-bit system that does not configure the kernel for 64-bit sector_t.  Sorry can't remember the .config option name at the moment.  Would need to look it up.
> 
> Thanks. Found it. It is CONFIG_LBDAF.

So it is, thanks for finding it.  It is not an intuitive name!

>>> Is it really possible that ntfs_bmap() is called for non-data
>>> attributes? In other words, what other attributes are accessible as
>>> files? Just curious.
>> 
>> I admit in current kernel driver it cannot.  It might in future - reparse points for example are very odd in many ways and though they are often indicated as directories rather than files they could not always be and what attributes a reparse point has is anyones guess - depends on the reparse point.
>> 
>> Then there are system files that do not have an unnamed $DATA attribute at all so in a way the check should probably be extended to ensure the attribute is $DATA _and_ unnamed but again anyone using FIBMAP on a system file is asking to cause problems...
> 
> Aren't those files handled by 'ntfs_mst_aops' (the one without a bmap
> function) rather than 'ntfs_normal_aops'?

Only directories and attributes containing B-trees are handled by mst_aops.  Not all system files are B-trees and often there are both B-trees and data attributes and the latter are usually not mst protected thus normal_aops are used.

Best regards,

	Anton
-- 
Anton Altaparmakov <anton at tuxera.com> (replace at with @)
Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/
Linux NTFS maintainer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ntfs: add code to make FIBMAP ioctl work
  2014-10-16 20:43         ` Anton Altaparmakov
@ 2014-10-16 21:55           ` Anton Altaparmakov
  0 siblings, 0 replies; 10+ messages in thread
From: Anton Altaparmakov @ 2014-10-16 21:55 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: linux-fsdevel@vger.kernel.org

On 16 Oct 2014, at 21:43, Anton Altaparmakov <anton@tuxera.com> wrote:
> Only directories and attributes containing B-trees are handled by mst_aops.  Not all system files are B-trees and often there are both B-trees and data attributes and the latter are usually not mst protected thus normal_aops are used.

Need to correct my above statement.  $MFT and $MFTmirr are both $DATA attribute containing files that use mst_aops.  But they are the only exceptions to the above.

Best regards,

	Anton
-- 
Anton Altaparmakov <anton at tuxera.com> (replace at with @)
Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/
Linux NTFS maintainer


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-10-16 21:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-15 19:19 [PATCH] ntfs: add code to make FIBMAP ioctl work Sergei Antonov
2014-10-15 23:12 ` Andreas Dilger
2014-10-16 11:37   ` Sergei Antonov
2014-10-16 13:23   ` Anton Altaparmakov
2014-10-16 13:20 ` Anton Altaparmakov
2014-10-16 15:07   ` Sergei Antonov
2014-10-16 15:36     ` Anton Altaparmakov
2014-10-16 20:03       ` Sergei Antonov
2014-10-16 20:43         ` Anton Altaparmakov
2014-10-16 21:55           ` Anton Altaparmakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).