* [V2 PATCH] tmpfs: add fallocate support
@ 2011-11-18 10:39 Cong Wang
2011-11-19 10:03 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Cong Wang @ 2011-11-18 10:39 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Pekka Enberg, Hugh Dickins, Dave Hansen, Lennart Poettering,
Kay Sievers, KOSAKI Motohiro, WANG Cong, linux-mm
It seems that systemd needs tmpfs to support fallocate,
see http://lkml.org/lkml/2011/10/20/275. This patch adds
fallocate support to tmpfs.
As we already have shmem_truncate_range(), it is also easy
to add FALLOC_FL_PUNCH_HOLE support too.
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
---
mm/shmem.c | 43 +++++++++++++++++++++++++++++++++++++++++++
1 files changed, 43 insertions(+), 0 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d672250..96bf619 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -30,6 +30,7 @@
#include <linux/mm.h>
#include <linux/export.h>
#include <linux/swap.h>
+#include <linux/falloc.h>
static struct vfsmount *shm_mnt;
@@ -1431,6 +1432,47 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
return error;
}
+static long shmem_fallocate(struct file *file, int mode,
+ loff_t offset, loff_t len)
+{
+ struct inode *inode = file->f_path.dentry->d_inode;
+ pgoff_t start = offset >> PAGE_CACHE_SHIFT;
+ pgoff_t end = DIV_ROUND_UP((offset + len), PAGE_CACHE_SIZE);
+ pgoff_t index = start;
+ loff_t i_size = i_size_read(inode);
+ struct page *page = NULL;
+ int ret = 0;
+
+ mutex_lock(&inode->i_mutex);
+ if (mode & FALLOC_FL_PUNCH_HOLE) {
+ if (!(offset > i_size || (end << PAGE_CACHE_SHIFT) > i_size))
+ shmem_truncate_range(inode, offset,
+ (end << PAGE_CACHE_SHIFT) - 1);
+ goto unlock;
+ }
+
+ if (!(mode & FALLOC_FL_KEEP_SIZE)) {
+ ret = inode_newsize_ok(inode, (offset + len));
+ if (ret)
+ goto unlock;
+ }
+
+ while (index < end) {
+ ret = shmem_getpage(inode, index, &page, SGP_WRITE, NULL);
+ if (ret)
+ goto unlock;
+ if (page)
+ unlock_page(page);
+ index++;
+ }
+ if (!(mode & FALLOC_FL_KEEP_SIZE) && (index << PAGE_CACHE_SHIFT) > i_size)
+ i_size_write(inode, index << PAGE_CACHE_SHIFT);
+
+unlock:
+ mutex_unlock(&inode->i_mutex);
+ return ret;
+}
+
static int shmem_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct shmem_sb_info *sbinfo = SHMEM_SB(dentry->d_sb);
@@ -2286,6 +2328,7 @@ static const struct file_operations shmem_file_operations = {
.fsync = noop_fsync,
.splice_read = shmem_file_splice_read,
.splice_write = generic_file_splice_write,
+ .fallocate = shmem_fallocate,
#endif
};
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-18 10:39 [V2 PATCH] tmpfs: add fallocate support Cong Wang
@ 2011-11-19 10:03 ` Christoph Hellwig
2011-11-19 14:14 ` Kay Sievers
2011-11-20 21:22 ` Hugh Dickins
2011-11-22 22:06 ` Andrew Morton
2 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2011-11-19 10:03 UTC (permalink / raw)
To: Cong Wang
Cc: linux-kernel, akpm, Pekka Enberg, Hugh Dickins, Dave Hansen,
Lennart Poettering, Kay Sievers, KOSAKI Motohiro, linux-mm
On Fri, Nov 18, 2011 at 06:39:50PM +0800, Cong Wang wrote:
> It seems that systemd needs tmpfs to support fallocate,
> see http://lkml.org/lkml/2011/10/20/275. This patch adds
> fallocate support to tmpfs.
What for exactly? Please explain why preallocating on tmpfs would
make any sense.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-19 10:03 ` Christoph Hellwig
@ 2011-11-19 14:14 ` Kay Sievers
2011-11-20 21:39 ` Hugh Dickins
2011-11-21 10:06 ` Christoph Hellwig
0 siblings, 2 replies; 14+ messages in thread
From: Kay Sievers @ 2011-11-19 14:14 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Cong Wang, linux-kernel, akpm, Pekka Enberg, Hugh Dickins,
Dave Hansen, Lennart Poettering, KOSAKI Motohiro, linux-mm
On Sat, Nov 19, 2011 at 11:03, Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Nov 18, 2011 at 06:39:50PM +0800, Cong Wang wrote:
>> It seems that systemd needs tmpfs to support fallocate,
>> see http://lkml.org/lkml/2011/10/20/275. This patch adds
>> fallocate support to tmpfs.
>
> What for exactly? Please explain why preallocating on tmpfs would
> make any sense.
To be able to safely use mmap(), regarding SIGBUS, on files on the
/dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
is just ugly.
Kay
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-18 10:39 [V2 PATCH] tmpfs: add fallocate support Cong Wang
2011-11-19 10:03 ` Christoph Hellwig
@ 2011-11-20 21:22 ` Hugh Dickins
2011-11-21 10:11 ` Christoph Hellwig
2011-11-22 5:39 ` Cong Wang
2011-11-22 22:06 ` Andrew Morton
2 siblings, 2 replies; 14+ messages in thread
From: Hugh Dickins @ 2011-11-20 21:22 UTC (permalink / raw)
To: Cong Wang
Cc: linux-kernel, akpm, Pekka Enberg, Dave Hansen, Lennart Poettering,
Kay Sievers, KOSAKI Motohiro, Christoph Hellwig, linux-mm
On Fri, 18 Nov 2011, Cong Wang wrote:
> It seems that systemd needs tmpfs to support fallocate,
> see http://lkml.org/lkml/2011/10/20/275. This patch adds
> fallocate support to tmpfs.
>
> As we already have shmem_truncate_range(), it is also easy
> to add FALLOC_FL_PUNCH_HOLE support too.
Thank you, this version looks much much nicer.
I wouldn't call it bug-free (don't you need a page_cache_release
after the unlock_page?), and I won't be reviewing it and testing it
for a week or two - there's a lot about the semantics of fallocate
and punch-hole that's not obvious, and I'll have to study the mail
threads discussing them before checking your patch.
First question that springs to mind (to which I shall easily find
an answer): is it actually acceptable for fallocate() to return
-ENOSPC when it has already completed a part of the work?
But so long as the details don't end up complicating this
significantly, since we anyway want to regularize the punch-hole
situation by giving tmpfs the same interface to it as other filesystems,
I now think it would be a bit perverse to disallow the original
fallocate functionality that you implement here in-kernel.
Thanks,
Hugh
>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Dave Hansen <dave@linux.vnet.ibm.com>
> Cc: Lennart Poettering <lennart@poettering.net>
> Cc: Kay Sievers <kay.sievers@vrfy.org>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: WANG Cong <amwang@redhat.com>
>
> ---
> mm/shmem.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 43 insertions(+), 0 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d672250..96bf619 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -30,6 +30,7 @@
> #include <linux/mm.h>
> #include <linux/export.h>
> #include <linux/swap.h>
> +#include <linux/falloc.h>
>
> static struct vfsmount *shm_mnt;
>
> @@ -1431,6 +1432,47 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
> return error;
> }
>
> +static long shmem_fallocate(struct file *file, int mode,
> + loff_t offset, loff_t len)
> +{
> + struct inode *inode = file->f_path.dentry->d_inode;
> + pgoff_t start = offset >> PAGE_CACHE_SHIFT;
> + pgoff_t end = DIV_ROUND_UP((offset + len), PAGE_CACHE_SIZE);
> + pgoff_t index = start;
> + loff_t i_size = i_size_read(inode);
> + struct page *page = NULL;
> + int ret = 0;
> +
> + mutex_lock(&inode->i_mutex);
> + if (mode & FALLOC_FL_PUNCH_HOLE) {
> + if (!(offset > i_size || (end << PAGE_CACHE_SHIFT) > i_size))
> + shmem_truncate_range(inode, offset,
> + (end << PAGE_CACHE_SHIFT) - 1);
> + goto unlock;
> + }
> +
> + if (!(mode & FALLOC_FL_KEEP_SIZE)) {
> + ret = inode_newsize_ok(inode, (offset + len));
> + if (ret)
> + goto unlock;
> + }
> +
> + while (index < end) {
> + ret = shmem_getpage(inode, index, &page, SGP_WRITE, NULL);
> + if (ret)
> + goto unlock;
> + if (page)
> + unlock_page(page);
> + index++;
> + }
> + if (!(mode & FALLOC_FL_KEEP_SIZE) && (index << PAGE_CACHE_SHIFT) > i_size)
> + i_size_write(inode, index << PAGE_CACHE_SHIFT);
> +
> +unlock:
> + mutex_unlock(&inode->i_mutex);
> + return ret;
> +}
> +
> static int shmem_statfs(struct dentry *dentry, struct kstatfs *buf)
> {
> struct shmem_sb_info *sbinfo = SHMEM_SB(dentry->d_sb);
> @@ -2286,6 +2328,7 @@ static const struct file_operations shmem_file_operations = {
> .fsync = noop_fsync,
> .splice_read = shmem_file_splice_read,
> .splice_write = generic_file_splice_write,
> + .fallocate = shmem_fallocate,
> #endif
> };
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-19 14:14 ` Kay Sievers
@ 2011-11-20 21:39 ` Hugh Dickins
2011-11-21 10:11 ` Christoph Hellwig
2011-11-21 10:06 ` Christoph Hellwig
1 sibling, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2011-11-20 21:39 UTC (permalink / raw)
To: Kay Sievers
Cc: Christoph Hellwig, Cong Wang, linux-kernel, akpm, Pekka Enberg,
Dave Hansen, Lennart Poettering, KOSAKI Motohiro, linux-mm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1568 bytes --]
On Sat, 19 Nov 2011, Kay Sievers wrote:
> On Sat, Nov 19, 2011 at 11:03, Christoph Hellwig <hch@infradead.org> wrote:
> > On Fri, Nov 18, 2011 at 06:39:50PM +0800, Cong Wang wrote:
> >> It seems that systemd needs tmpfs to support fallocate,
> >> see http://lkml.org/lkml/2011/10/20/275. This patch adds
> >> fallocate support to tmpfs.
> >
> > What for exactly? Please explain why preallocating on tmpfs would
> > make any sense.
>
> To be able to safely use mmap(), regarding SIGBUS, on files on the
> /dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
> is just ugly.
The fallback for -EOPNOTSUPP?
Being unfamiliar with glibc, I failed to find the internal_fallocate()
that it appears to use when the filesystem doesn't support the call;
so I don't know if I would agree with you that it's uglier than doing
the same(?) in the kernel.
But since the present situation is that tmpfs has one interface to
punching holes, madvise(MADV_REMOVE), that IBM were pushing 5 years ago;
but ext4 (and others) now a fallocate(FALLOC_FL_PUNCH_HOLE) interface
which IBM have been pushing this year: we do want to normalize that
situation and make them all behave the same way.
And if tmpfs is going to support fallocate(FALLOC_FL_PUNCH_HOLE),
looking at Amerigo's much more attractive V2 patch, it would seem
to me perverse to permit the deallocation but fail the allocation.
The principle of least surprise argues that we grant your wish:
provided it doesn't grow much more complicated once I look more
closely.
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-19 14:14 ` Kay Sievers
2011-11-20 21:39 ` Hugh Dickins
@ 2011-11-21 10:06 ` Christoph Hellwig
2011-11-22 5:39 ` Cong Wang
1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2011-11-21 10:06 UTC (permalink / raw)
To: Kay Sievers
Cc: Christoph Hellwig, Cong Wang, linux-kernel, akpm, Pekka Enberg,
Hugh Dickins, Dave Hansen, Lennart Poettering, KOSAKI Motohiro,
linux-mm
On Sat, Nov 19, 2011 at 03:14:48PM +0100, Kay Sievers wrote:
> On Sat, Nov 19, 2011 at 11:03, Christoph Hellwig <hch@infradead.org> wrote:
> > On Fri, Nov 18, 2011 at 06:39:50PM +0800, Cong Wang wrote:
> >> It seems that systemd needs tmpfs to support fallocate,
> >> see http://lkml.org/lkml/2011/10/20/275. This patch adds
> >> fallocate support to tmpfs.
> >
> > What for exactly? ??Please explain why preallocating on tmpfs would
> > make any sense.
>
> To be able to safely use mmap(), regarding SIGBUS, on files on the
> /dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
> is just ugly.
That is the kind of information which needs to be in the changelog.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-20 21:39 ` Hugh Dickins
@ 2011-11-21 10:11 ` Christoph Hellwig
2011-11-21 22:13 ` Hugh Dickins
2011-11-22 5:50 ` Cong Wang
0 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2011-11-21 10:11 UTC (permalink / raw)
To: Hugh Dickins
Cc: Kay Sievers, Christoph Hellwig, Cong Wang, linux-kernel, akpm,
Pekka Enberg, Dave Hansen, Lennart Poettering, KOSAKI Motohiro,
linux-mm
On Sun, Nov 20, 2011 at 01:39:12PM -0800, Hugh Dickins wrote:
> > To be able to safely use mmap(), regarding SIGBUS, on files on the
> > /dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
> > is just ugly.
>
> The fallback for -EOPNOTSUPP?
Probably for both. Note that the fallocate man page actually documents
the errors incorrecly - it documents ENOSYS for filesystems not
supporting fallocate, and EOPNOTSUPP for not recognizing the mode, but
we actually return EOPNOTSUPP for either case. ENOSYS is only returned
by kernels not implementing fallocate at all.
> Being unfamiliar with glibc, I failed to find the internal_fallocate()
> that it appears to use when the filesystem doesn't support the call;
> so I don't know if I would agree with you that it's uglier than doing
> the same(?) in the kernel.
Last time I looked it basically did a pwrite loop writing zeroes.
Unfortunately it did far too small I/O sizes and thus actually causes
some major overhead e.g. on ext3.
> But since the present situation is that tmpfs has one interface to
> punching holes, madvise(MADV_REMOVE), that IBM were pushing 5 years ago;
> but ext4 (and others) now a fallocate(FALLOC_FL_PUNCH_HOLE) interface
> which IBM have been pushing this year: we do want to normalize that
> situation and make them all behave the same way.
FALLOC_FL_PUNCH_HOLE was added by Josef Bacik, who happens to work for
Red Hat, but I doubt he was pushing any corporate agenda there, he was
mostly making btrfs catch up with the 15 year old XFS hole punching
ioctl.
> And if tmpfs is going to support fallocate(FALLOC_FL_PUNCH_HOLE),
> looking at Amerigo's much more attractive V2 patch, it would seem
> to me perverse to permit the deallocation but fail the allocation.
Agreed.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-20 21:22 ` Hugh Dickins
@ 2011-11-21 10:11 ` Christoph Hellwig
2011-11-22 5:39 ` Cong Wang
1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2011-11-21 10:11 UTC (permalink / raw)
To: Hugh Dickins
Cc: Cong Wang, linux-kernel, akpm, Pekka Enberg, Dave Hansen,
Lennart Poettering, Kay Sievers, KOSAKI Motohiro,
Christoph Hellwig, linux-mm
On Sun, Nov 20, 2011 at 01:22:10PM -0800, Hugh Dickins wrote:
> First question that springs to mind (to which I shall easily find
> an answer): is it actually acceptable for fallocate() to return
> -ENOSPC when it has already completed a part of the work?
No, it must undo all allocations if it returns ENOSPC.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-21 10:11 ` Christoph Hellwig
@ 2011-11-21 22:13 ` Hugh Dickins
2011-11-22 5:50 ` Cong Wang
1 sibling, 0 replies; 14+ messages in thread
From: Hugh Dickins @ 2011-11-21 22:13 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Josef Bacik, Kay Sievers, Cong Wang, linux-kernel, akpm,
Pekka Enberg, Dave Hansen, Lennart Poettering, KOSAKI Motohiro,
linux-mm
On Mon, 21 Nov 2011, Christoph Hellwig wrote:
> On Sun, Nov 20, 2011 at 01:39:12PM -0800, Hugh Dickins wrote:
>
> > But since the present situation is that tmpfs has one interface to
> > punching holes, madvise(MADV_REMOVE), that IBM were pushing 5 years ago;
> > but ext4 (and others) now a fallocate(FALLOC_FL_PUNCH_HOLE) interface
> > which IBM have been pushing this year: we do want to normalize that
> > situation and make them all behave the same way.
>
> FALLOC_FL_PUNCH_HOLE was added by Josef Bacik, who happens to work for
> Red Hat, but I doubt he was pushing any corporate agenda there, he was
> mostly making btrfs catch up with the 15 year old XFS hole punching
> ioctl.
Yeah, my apologies to Josef and to IBM and to XFS
for my regrettable little outburst of snarkiness :(
>
>
> > And if tmpfs is going to support fallocate(FALLOC_FL_PUNCH_HOLE),
> > looking at Amerigo's much more attractive V2 patch, it would seem
> > to me perverse to permit the deallocation but fail the allocation.
>
> Agreed.
Thanks a lot for useful info, and saving me looking up the ENOSPC issue.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-20 21:22 ` Hugh Dickins
2011-11-21 10:11 ` Christoph Hellwig
@ 2011-11-22 5:39 ` Cong Wang
1 sibling, 0 replies; 14+ messages in thread
From: Cong Wang @ 2011-11-22 5:39 UTC (permalink / raw)
To: Hugh Dickins
Cc: linux-kernel, akpm, Pekka Enberg, Dave Hansen, Lennart Poettering,
Kay Sievers, KOSAKI Motohiro, Christoph Hellwig, linux-mm
ao? 2011a1'11ae??21ae?JPY 05:22, Hugh Dickins a??e??:
> On Fri, 18 Nov 2011, Cong Wang wrote:
>
>> It seems that systemd needs tmpfs to support fallocate,
>> see http://lkml.org/lkml/2011/10/20/275. This patch adds
>> fallocate support to tmpfs.
>>
>> As we already have shmem_truncate_range(), it is also easy
>> to add FALLOC_FL_PUNCH_HOLE support too.
>
> Thank you, this version looks much much nicer.
>
> I wouldn't call it bug-free (don't you need a page_cache_release
> after the unlock_page?), and I won't be reviewing it and testing it
> for a week or two - there's a lot about the semantics of fallocate
> and punch-hole that's not obvious, and I'll have to study the mail
> threads discussing them before checking your patch.
Yeah, sorry, I missed unlock_page()...
>
> First question that springs to mind (to which I shall easily find
> an answer): is it actually acceptable for fallocate() to return
> -ENOSPC when it has already completed a part of the work?
Ah, good point, I will fix this as what Christoph suggested.
>
> But so long as the details don't end up complicating this
> significantly, since we anyway want to regularize the punch-hole
> situation by giving tmpfs the same interface to it as other filesystems,
> I now think it would be a bit perverse to disallow the original
> fallocate functionality that you implement here in-kernel.
>
Ok, I think you mean you are fine to accept it now?
Anyway, thanks a lot for your comments!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-21 10:06 ` Christoph Hellwig
@ 2011-11-22 5:39 ` Cong Wang
0 siblings, 0 replies; 14+ messages in thread
From: Cong Wang @ 2011-11-22 5:39 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Kay Sievers, linux-kernel, akpm, Pekka Enberg, Hugh Dickins,
Dave Hansen, Lennart Poettering, KOSAKI Motohiro, linux-mm
ao? 2011a1'11ae??21ae?JPY 18:06, Christoph Hellwig a??e??:
> On Sat, Nov 19, 2011 at 03:14:48PM +0100, Kay Sievers wrote:
>> On Sat, Nov 19, 2011 at 11:03, Christoph Hellwig<hch@infradead.org> wrote:
>>> On Fri, Nov 18, 2011 at 06:39:50PM +0800, Cong Wang wrote:
>>>> It seems that systemd needs tmpfs to support fallocate,
>>>> see http://lkml.org/lkml/2011/10/20/275. This patch adds
>>>> fallocate support to tmpfs.
>>>
>>> What for exactly? ??Please explain why preallocating on tmpfs would
>>> make any sense.
>>
>> To be able to safely use mmap(), regarding SIGBUS, on files on the
>> /dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
>> is just ugly.
>
> That is the kind of information which needs to be in the changelog.
>
I will fix the changelog.
Thanks, Christoph and Kay.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-21 10:11 ` Christoph Hellwig
2011-11-21 22:13 ` Hugh Dickins
@ 2011-11-22 5:50 ` Cong Wang
1 sibling, 0 replies; 14+ messages in thread
From: Cong Wang @ 2011-11-22 5:50 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Hugh Dickins, Kay Sievers, linux-kernel, akpm, Pekka Enberg,
Dave Hansen, Lennart Poettering, KOSAKI Motohiro, linux-mm
ao? 2011a1'11ae??21ae?JPY 18:11, Christoph Hellwig a??e??:
> On Sun, Nov 20, 2011 at 01:39:12PM -0800, Hugh Dickins wrote:
>>> To be able to safely use mmap(), regarding SIGBUS, on files on the
>>> /dev/shm filesystem. The glibc fallback loop for -ENOSYS on fallocate
>>> is just ugly.
>>
>> The fallback for -EOPNOTSUPP?
>
> Probably for both. Note that the fallocate man page actually documents
> the errors incorrecly - it documents ENOSYS for filesystems not
> supporting fallocate, and EOPNOTSUPP for not recognizing the mode, but
> we actually return EOPNOTSUPP for either case. ENOSYS is only returned
> by kernels not implementing fallocate at all.
We need to fix man page of fallocate(2)...
>
>> Being unfamiliar with glibc, I failed to find the internal_fallocate()
>> that it appears to use when the filesystem doesn't support the call;
>> so I don't know if I would agree with you that it's uglier than doing
>> the same(?) in the kernel.
>
> Last time I looked it basically did a pwrite loop writing zeroes.
> Unfortunately it did far too small I/O sizes and thus actually causes
> some major overhead e.g. on ext3.
>
>> But since the present situation is that tmpfs has one interface to
>> punching holes, madvise(MADV_REMOVE), that IBM were pushing 5 years ago;
>> but ext4 (and others) now a fallocate(FALLOC_FL_PUNCH_HOLE) interface
>> which IBM have been pushing this year: we do want to normalize that
>> situation and make them all behave the same way.
>
> FALLOC_FL_PUNCH_HOLE was added by Josef Bacik, who happens to work for
> Red Hat, but I doubt he was pushing any corporate agenda there, he was
> mostly making btrfs catch up with the 15 year old XFS hole punching
> ioctl.
I sent a patch to util-linux-ng too,
http://thread.gmane.org/gmane.linux.utilities.util-linux-ng/5045
>
>
>> And if tmpfs is going to support fallocate(FALLOC_FL_PUNCH_HOLE),
>> looking at Amerigo's much more attractive V2 patch, it would seem
>> to me perverse to permit the deallocation but fail the allocation.
>
> Agreed.
>
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-18 10:39 [V2 PATCH] tmpfs: add fallocate support Cong Wang
2011-11-19 10:03 ` Christoph Hellwig
2011-11-20 21:22 ` Hugh Dickins
@ 2011-11-22 22:06 ` Andrew Morton
2011-11-23 4:30 ` Cong Wang
2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2011-11-22 22:06 UTC (permalink / raw)
To: Cong Wang
Cc: linux-kernel, Pekka Enberg, Hugh Dickins, Dave Hansen,
Lennart Poettering, Kay Sievers, KOSAKI Motohiro, linux-mm
On Fri, 18 Nov 2011 18:39:50 +0800
Cong Wang <amwang@redhat.com> wrote:
> It seems that systemd needs tmpfs to support fallocate,
> see http://lkml.org/lkml/2011/10/20/275. This patch adds
> fallocate support to tmpfs.
>
> As we already have shmem_truncate_range(), it is also easy
> to add FALLOC_FL_PUNCH_HOLE support too.
>
>
> ...
>
> +static long shmem_fallocate(struct file *file, int mode,
> + loff_t offset, loff_t len)
> +{
> + struct inode *inode = file->f_path.dentry->d_inode;
> + pgoff_t start = offset >> PAGE_CACHE_SHIFT;
> + pgoff_t end = DIV_ROUND_UP((offset + len), PAGE_CACHE_SIZE);
> + pgoff_t index = start;
> + loff_t i_size = i_size_read(inode);
> + struct page *page = NULL;
> + int ret = 0;
> +
> + mutex_lock(&inode->i_mutex);
It would be saner and less racy-looking to read i_size _after_ taking
i_mutex.
And if you do that, there's no need to use i_size_read() - just a plain
old
i_size = inode->i_size;
is OK.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [V2 PATCH] tmpfs: add fallocate support
2011-11-22 22:06 ` Andrew Morton
@ 2011-11-23 4:30 ` Cong Wang
0 siblings, 0 replies; 14+ messages in thread
From: Cong Wang @ 2011-11-23 4:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Pekka Enberg, Hugh Dickins, Dave Hansen,
Lennart Poettering, Kay Sievers, KOSAKI Motohiro, linux-mm
ao? 2011a1'11ae??23ae?JPY 06:06, Andrew Morton a??e??:
> On Fri, 18 Nov 2011 18:39:50 +0800
> Cong Wang<amwang@redhat.com> wrote:
>
>> It seems that systemd needs tmpfs to support fallocate,
>> see http://lkml.org/lkml/2011/10/20/275. This patch adds
>> fallocate support to tmpfs.
>>
>> As we already have shmem_truncate_range(), it is also easy
>> to add FALLOC_FL_PUNCH_HOLE support too.
>>
>>
>> ...
>>
>> +static long shmem_fallocate(struct file *file, int mode,
>> + loff_t offset, loff_t len)
>> +{
>> + struct inode *inode = file->f_path.dentry->d_inode;
>> + pgoff_t start = offset>> PAGE_CACHE_SHIFT;
>> + pgoff_t end = DIV_ROUND_UP((offset + len), PAGE_CACHE_SIZE);
>> + pgoff_t index = start;
>> + loff_t i_size = i_size_read(inode);
>> + struct page *page = NULL;
>> + int ret = 0;
>> +
>> + mutex_lock(&inode->i_mutex);
>
> It would be saner and less racy-looking to read i_size _after_ taking
> i_mutex.
>
> And if you do that, there's no need to use i_size_read() - just a plain
> old
>
> i_size = inode->i_size;
>
> is OK.
Fair enough, I will fix that in V3.
Thanks, Andrew!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-11-23 4:31 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-18 10:39 [V2 PATCH] tmpfs: add fallocate support Cong Wang
2011-11-19 10:03 ` Christoph Hellwig
2011-11-19 14:14 ` Kay Sievers
2011-11-20 21:39 ` Hugh Dickins
2011-11-21 10:11 ` Christoph Hellwig
2011-11-21 22:13 ` Hugh Dickins
2011-11-22 5:50 ` Cong Wang
2011-11-21 10:06 ` Christoph Hellwig
2011-11-22 5:39 ` Cong Wang
2011-11-20 21:22 ` Hugh Dickins
2011-11-21 10:11 ` Christoph Hellwig
2011-11-22 5:39 ` Cong Wang
2011-11-22 22:06 ` Andrew Morton
2011-11-23 4:30 ` Cong Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).