* [PATCH] tmpfs: truncate prealloc blocks past i_size
@ 2014-10-29 17:10 Josef Bacik
2014-11-04 1:30 ` Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2014-10-29 17:10 UTC (permalink / raw)
To: hughd, linux-mm, linux-fsdevel
One of the rocksdb people noticed that when you do something like this
fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 10M)
pwrite(fd, buf, 5M, 0)
ftruncate(5M)
on tmpfs the file would still take up 10M, which lead to super fun issues
because we were getting ENOSPC before we thought we should be getting ENOSPC.
This patch fixes the problem, and mirrors what all the other fs'es do. I tested
it locally to make sure it worked properly with the following
xfs_io -f -c "falloc -k 0 10M" -c "pwrite 0 5M" -c "truncate 5M" file
Without the patch we have "Blocks: 20480", with the patch we have the correct
value of "Blocks: 10240". Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 185836b..79b7fb5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -574,7 +574,7 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
i_size_write(inode, newsize);
inode->i_ctime = inode->i_mtime = CURRENT_TIME;
}
- if (newsize < oldsize) {
+ if (newsize <= oldsize) {
loff_t holebegin = round_up(newsize, PAGE_SIZE);
unmap_mapping_range(inode->i_mapping, holebegin, 0, 1);
shmem_truncate_range(inode, newsize, (loff_t)-1);
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] tmpfs: truncate prealloc blocks past i_size
2014-10-29 17:10 [PATCH] tmpfs: truncate prealloc blocks past i_size Josef Bacik
@ 2014-11-04 1:30 ` Hugh Dickins
2014-11-04 15:55 ` Josef Bacik
0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2014-11-04 1:30 UTC (permalink / raw)
To: Josef Bacik; +Cc: hughd, linux-mm, linux-fsdevel
On Wed, 29 Oct 2014, Josef Bacik wrote:
> One of the rocksdb people noticed that when you do something like this
>
> fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 10M)
> pwrite(fd, buf, 5M, 0)
> ftruncate(5M)
>
> on tmpfs the file would still take up 10M, which lead to super fun issues
> because we were getting ENOSPC before we thought we should be getting ENOSPC.
> This patch fixes the problem, and mirrors what all the other fs'es do. I tested
> it locally to make sure it worked properly with the following
>
> xfs_io -f -c "falloc -k 0 10M" -c "pwrite 0 5M" -c "truncate 5M" file
>
> Without the patch we have "Blocks: 20480", with the patch we have the correct
> value of "Blocks: 10240". Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
That is a very good catch, and thank you for the patch. But I am not
convinced that the patch is correct - even if it does happen to end
up doing what other filesystems do here (I haven't checked).
Your patch makes it look like a fix to an off-by-one, but that is
not really the case. What if you change your final ftruncate(5M)
to ftruncate(6M): what should happen then?
My intuition says that what should happen is that i_size is set to 6M,
and the fallocated excess blocks beyond 6M be trimmed off: so that
it's both an extending and a shrinking truncate at the same time.
And I think that behavior would be served by removing the
"if (newsize < oldsize)" condition completely.
But perhaps I'm wrong: can you or anyone shed more light on this,
or point to documentation of what should happen in these cases?
Thanks,
Hugh
> ---
> mm/shmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 185836b..79b7fb5 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -574,7 +574,7 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
> i_size_write(inode, newsize);
> inode->i_ctime = inode->i_mtime = CURRENT_TIME;
> }
> - if (newsize < oldsize) {
> + if (newsize <= oldsize) {
> loff_t holebegin = round_up(newsize, PAGE_SIZE);
> unmap_mapping_range(inode->i_mapping, holebegin, 0, 1);
> shmem_truncate_range(inode, newsize, (loff_t)-1);
> --
> 1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] tmpfs: truncate prealloc blocks past i_size
2014-11-04 1:30 ` Hugh Dickins
@ 2014-11-04 15:55 ` Josef Bacik
0 siblings, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2014-11-04 15:55 UTC (permalink / raw)
To: Hugh Dickins; +Cc: linux-mm, linux-fsdevel, dave Chinner
On 11/03/2014 08:30 PM, Hugh Dickins wrote:
> On Wed, 29 Oct 2014, Josef Bacik wrote:
>
>> One of the rocksdb people noticed that when you do something like this
>>
>> fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 10M)
>> pwrite(fd, buf, 5M, 0)
>> ftruncate(5M)
>>
>> on tmpfs the file would still take up 10M, which lead to super fun issues
>> because we were getting ENOSPC before we thought we should be getting ENOSPC.
>> This patch fixes the problem, and mirrors what all the other fs'es do. I tested
>> it locally to make sure it worked properly with the following
>>
>> xfs_io -f -c "falloc -k 0 10M" -c "pwrite 0 5M" -c "truncate 5M" file
>>
>> Without the patch we have "Blocks: 20480", with the patch we have the correct
>> value of "Blocks: 10240". Thanks,
>>
>> Signed-off-by: Josef Bacik <jbacik@fb.com>
>
> That is a very good catch, and thank you for the patch. But I am not
> convinced that the patch is correct - even if it does happen to end
> up doing what other filesystems do here (I haven't checked).
>
> Your patch makes it look like a fix to an off-by-one, but that is
> not really the case. What if you change your final ftruncate(5M)
> to ftruncate(6M): what should happen then?
>
> My intuition says that what should happen is that i_size is set to 6M,
> and the fallocated excess blocks beyond 6M be trimmed off: so that
> it's both an extending and a shrinking truncate at the same time.
> And I think that behavior would be served by removing the
> "if (newsize < oldsize)" condition completely.
>
> But perhaps I'm wrong: can you or anyone shed more light on this,
> or point to documentation of what should happen in these cases?
>
Yup there's a section in the ftruncate manpage that specifically says
"expanding truncate is for losers."
Dave you want to weigh in here? Looking at both btrfs and xfs we only
do the trimming if newsize <= oldsize. So if you falloc up to 10M,
write 5M, and truncate up to 6M there is no trimming. I'd say this is
ok since it's an expanding truncate, people doing this are probably
going to want to keep the extra space, as opposed to those who falloc a
chunk and then truncate down to the amount they actually wrote. Thoughts?
Josef
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] tmpfs: truncate at i_size
@ 2015-05-19 15:27 Josef Bacik
2015-06-16 20:02 ` Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2015-05-19 15:27 UTC (permalink / raw)
To: hughd, linux-mm, linux-kernel
If we fallocate past i_size with KEEP_SIZE, extend the file to use some but not
all of this space, and then truncate(i_size) we won't trim the excess
preallocated space. We decided at LSF that we want to truncate the fallocated
bit past i_size when we truncate to i_size, which is what this patch does.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index de98137..089afde 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -569,7 +569,7 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
i_size_write(inode, newsize);
inode->i_ctime = inode->i_mtime = CURRENT_TIME;
}
- if (newsize < oldsize) {
+ if (newsize <= oldsize) {
loff_t holebegin = round_up(newsize, PAGE_SIZE);
unmap_mapping_range(inode->i_mapping, holebegin, 0, 1);
shmem_truncate_range(inode, newsize, (loff_t)-1);
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] tmpfs: truncate at i_size
2015-05-19 15:27 [PATCH] tmpfs: truncate at i_size Josef Bacik
@ 2015-06-16 20:02 ` Hugh Dickins
2015-06-16 20:07 ` [PATCH] tmpfs: truncate prealloc blocks past i_size Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2015-06-16 20:02 UTC (permalink / raw)
To: Josef Bacik; +Cc: Andrew Morton, linux-mm, linux-kernel
On Tue, 19 May 2015, Josef Bacik wrote:
> If we fallocate past i_size with KEEP_SIZE, extend the file to use some but not
> all of this space, and then truncate(i_size) we won't trim the excess
> preallocated space. We decided at LSF that we want to truncate the fallocated
> bit past i_size when we truncate to i_size, which is what this patch does.
> Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
Sorry for the delay, it's been on my mind but only now I get to it.
Yes, that was agreed at LSF, and I've checked that indeed tmpfs is
out of line here: thank you for fixing it. But I do prefer your
original more explicit description, so I'll send the patch to akpm
now for v4.2, with that description instead (plus a reference to LSF).
Thanks,
Hugh
> ---
> mm/shmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index de98137..089afde 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -569,7 +569,7 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
> i_size_write(inode, newsize);
> inode->i_ctime = inode->i_mtime = CURRENT_TIME;
> }
> - if (newsize < oldsize) {
> + if (newsize <= oldsize) {
> loff_t holebegin = round_up(newsize, PAGE_SIZE);
> unmap_mapping_range(inode->i_mapping, holebegin, 0, 1);
> shmem_truncate_range(inode, newsize, (loff_t)-1);
> --
> 1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH] tmpfs: truncate prealloc blocks past i_size
2015-06-16 20:02 ` Hugh Dickins
@ 2015-06-16 20:07 ` Hugh Dickins
0 siblings, 0 replies; 4+ messages in thread
From: Hugh Dickins @ 2015-06-16 20:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: Josef Bacik, linux-mm, linux-kernel, linux-fsdevel
From: Josef Bacik <jbacik@fb.com>
One of the rocksdb people noticed that when you do something like this
fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 10M)
pwrite(fd, buf, 5M, 0)
ftruncate(5M)
on tmpfs, the file would still take up 10M: which led to super fun issues
because we were getting ENOSPC before we thought we should be getting
ENOSPC. This patch fixes the problem, and mirrors what all the other
fs'es do (and was agreed to be the correct behaviour at LSF).
I tested it locally to make sure it worked properly with the following
xfs_io -f -c "falloc -k 0 10M" -c "pwrite 0 5M" -c "truncate 5M" file
Without the patch we have "Blocks: 20480", with the patch we have the
correct value of "Blocks: 10240".
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -569,7 +569,7 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
i_size_write(inode, newsize);
inode->i_ctime = inode->i_mtime = CURRENT_TIME;
}
- if (newsize < oldsize) {
+ if (newsize <= oldsize) {
loff_t holebegin = round_up(newsize, PAGE_SIZE);
unmap_mapping_range(inode->i_mapping, holebegin, 0, 1);
shmem_truncate_range(inode, newsize, (loff_t)-1);
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-06-16 20:10 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-29 17:10 [PATCH] tmpfs: truncate prealloc blocks past i_size Josef Bacik
2014-11-04 1:30 ` Hugh Dickins
2014-11-04 15:55 ` Josef Bacik
-- strict thread matches above, loose matches on Subject: below --
2015-05-19 15:27 [PATCH] tmpfs: truncate at i_size Josef Bacik
2015-06-16 20:02 ` Hugh Dickins
2015-06-16 20:07 ` [PATCH] tmpfs: truncate prealloc blocks past i_size Hugh Dickins
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).